As I wrote before, SQLite has serious performance issues, which make running my analysis tool take far too long.
To avoid wasting time running it, and to be able to run it on bigger datasets, I’ve been planning to port it to PostgreSQL instead. While having the script almost self-contained was nice, the huge amount of time wasted by SQLite, and its way of turning a CPU-bound problem into an I/O-bound one, forced me to take a different approach.
PostgreSQL is not as common as MySQL, but it’s what I run for my own software, and what I’ll keep working with in the future, so I preferred it. I’ll gladly take patches that split the database handling code out of the harvesting logic, so that one can choose between PostgreSQL and other implementations, even SQLite again.
The time required for harvesting and analysis is now down to less than two hours, from somewhere between two and six hours originally. Isn’t it nice?
I’m looking at the output by hand at the moment; it’s actually listing a lot more data than last time, because this time I had the idea of running it as root so that it could access data about non-readable suid programs. I’ll improve the suppression file a lot tonight and run it again when I’m done removing false positives.
Interestingly enough, there seem to be a lot of binaries exporting xstrdup symbols. It’s interesting because it shows that the original strdup functions are not considered safe enough by quite a bit of software, and also because I think those should not be exported, but rather used inline, or at least made static or hidden. Can we be certain that all those binaries don’t use different meanings for xmalloc and the like?
Symbol xmalloc@@ (64-bit UNIX System V ABI AMD x86-64 architecture) present 12 times
You can see that a few packages using that symbol are part of GNU. If there is anyone here who’s a GNU insider and can try to get people to hide that symbol, or use it inline, that would be quite helpful 🙂
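To make the suggestion concrete, here is a minimal sketch of the two ways a package could keep these helpers out of its exported symbol table. The function bodies are my own assumption of what a typical xmalloc and xstrdup look like, not the code of any of the packages listed, and the visibility attribute is a GCC/Clang extension.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Option 1: static inline. The helper has internal linkage, so no
 * xmalloc symbol is exported and it cannot collide with another one. */
static inline void *xmalloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr == NULL) {
        /* Assumed policy: give up on allocation failure. */
        fputs("out of memory\n", stderr);
        abort();
    }
    return ptr;
}

/* Option 2: hidden visibility (GCC/Clang extension). The function can
 * still be shared between the object files of one library, but it is
 * left out of the dynamic symbol table of the resulting shared object. */
__attribute__((visibility("hidden")))
char *xstrdup(const char *str)
{
    size_t len = strlen(str) + 1;   /* include the trailing NUL */
    char *copy = xmalloc(len);
    memcpy(copy, str, len);
    return copy;
}
```

With either approach, two libraries loaded in the same process can each keep their own idea of what xmalloc does without one silently replacing the other.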
And recode seems to be on the list of the bad guys again, as it was for the flex symbols; I wonder if it’s still maintained, or if it needs to be forked to improve it.