This Time Self-Hosted

Increasing analyser performance through PostgreSQL

As I wrote before, sqlite has big performance issues, which make running my analysis tool take far too long.

To avoid wasting time running it, and to be able to run it on bigger datasets, I’ve been planning to port it to use PostgreSQL instead. While having the script almost self-contained was a nice thing, the huge amount of time wasted by sqlite, and its habit of turning a CPU-bound problem into an I/O-bound one, forced me to take a different approach.

PostgreSQL is not really that common compared to MySQL, but it’s what I run for my own software, and what I’ll continue working with in the future, so I preferred it. I’ll gladly take patches to split the database-handling code out of the harvesting logic, so that one can choose between PostgreSQL and other implementations, even SQLite again.

The time required for harvesting and analysis is now down to less than two hours, from somewhere between 2 and 6 hours originally. Isn’t it nice?

I’m looking at the output by hand at the moment; it’s actually listing a lot more data than last time, because this time I had the idea of running it as root so that it could access data about suid, non-readable programs. I’ll improve the suppression file a lot tonight and run it again when I’m done removing false positives.

Interestingly enough, there seem to be a lot of binaries exporting xmalloc, xrealloc and xstrdup symbols. It’s interesting because it shows that the original malloc, realloc and strdup functions are not considered safe enough by quite a bit of software, and also because I think those should not be exported, but rather used inline, or at least made static or hidden. Can we be certain that all those binaries don’t use different meanings for xmalloc and the like?

Symbol xmalloc@@ (64-bit UNIX System V ABI AMD x86-64 architecture) present 12 times

You can see that a few packages using that symbol are part of GNU. If there is anyone here who’s a GNU insider and can try to get people to hide that symbol, or use it inline, that would be quite helpful 🙂
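To make the “static, inline or hidden” suggestion a bit more concrete, here is a minimal sketch of the options. It assumes a GCC- or Clang-style compiler on an ELF system for the visibility attribute, and the `mylib_` prefix is a made-up name for illustration; this is not how any of the listed packages actually implement their helpers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: keep the helper private to each translation unit.
 * Being "static inline", no xmalloc symbol is exported at all. */
static inline void *xmalloc(size_t size)
{
    void *ptr = malloc(size);
    /* malloc(0) may legitimately return NULL, so only treat NULL
     * as a failure when some memory was actually requested. */
    if (ptr == NULL && size != 0) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return ptr;
}

static inline char *xstrdup(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s) + 1;   /* include the terminating NUL */
    char *copy = xmalloc(len);
    memcpy(copy, s, len);
    return copy;
}

/* Option 2: one shared definition for all the files of a library,
 * but with hidden visibility so it never reaches the dynamic symbol
 * table. The prefixed name (hypothetical) also avoids clashing with
 * somebody else's xrealloc when linking statically. */
__attribute__((visibility("hidden")))
void *mylib_xrealloc(void *ptr, size_t size)
{
    void *new_ptr = realloc(ptr, size);
    if (new_ptr == NULL && size != 0) {
        fprintf(stderr, "xrealloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return new_ptr;
}
```

Either way the error checking stays in one place, while the symbol stays out of the exported interface, so two libraries with slightly different ideas of what xmalloc means can no longer collide.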

And recode seems to be in the list of the bad guys again, as it was for the flex symbols; I wonder if it’s still maintained, or if it actually needs to be forked to be improved.

Comments 3
  1. It isn’t a matter of the malloc/realloc functions not being safe, but rather duplicated error checking. Instead of every file having a code block like so:

         void *ptr = malloc(some_size);
         if (ptr == NULL) {
             print some error message;
             abort;
         }

     You put this construct in as xmalloc() and then all of your code gets this error checking for free.

  2. I know what xmalloc/xrealloc (usually) do, and that’s why I think the plain functions are deemed not safe to use by themselves. I’m not talking about security-safe, but rather working-safe, in this instance.

  3. Very good job Diego! I’m actively following your work 🙂 and now I’m an addicted fan. I agree with you about removing all the xmalloc/xrealloc exported symbols and making them ‘static’. P.S.: your last link is wrong, there is an extra “;” char at the end of the link.
