
I need a (log) analyst

Since I started working on my own tinderbox experiments, I was able to file a few hundred different bugs, all by hand, for various issues like pre-stripped files, packages installing files in invalid paths like /usr/man and /usr/doc, packages bundling libraries, packages needing to learn EAPI=1, and so on and so forth.

Today I added to that list a few more packages with maintainer-mode induced rebuilds, which are bad for so many reasons; unfortunately I knew I had already filed a few of those, and I really didn’t want to file duplicates. So I decided I needed a way to quickly check whether an issue found in a package had already been reported. This is particularly useful for stuff like pre-stripped files, since between me and Patrick we already filed a lot of bugs about them, although, since software sucks, I’m sure we didn’t file all of them.

First I found an interesting sh one-liner approach, which was pretty cool, but wasteful, and it still required me to check each package’s metadata every time to find who to assign the bug to. Since I wanted to do as few lookups as possible, I decided I needed something more complex, and I’ve started working on a Python (yes, you read that right, Python) script that can analyse log files and identify issues in them, just like I would do with multiple, repeated fgrep commands all over the files.
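Just to give an idea of what the script boils down to, here’s a minimal sketch of the scanning step; the condition names and patterns below are made-up examples rather than the actual ones I check for:

```python
# Minimal sketch of the scanning step; the conditions and patterns here are
# illustrative examples, not the exact ones my script uses.
import re
import sys

CONDITIONS = {
    # Each entry replaces what would otherwise be a repeated fgrep invocation.
    "pre-stripped files": re.compile(r"QA Notice: Pre-stripped files found"),
    "files in /usr/man": re.compile(r"/usr/man/"),
    "maintainer-mode rebuild": re.compile(r"(aclocal|automake|autoconf)\S* .*(missing|not found)"),
}

def scan_log(path):
    """Return the names of the conditions matched in a single build log."""
    found = set()
    with open(path, errors="replace") as log:
        for line in log:
            for name, pattern in CONDITIONS.items():
                if pattern.search(line):
                    found.add(name)
    return sorted(found)

if __name__ == "__main__":
    for logfile in sys.argv[1:]:
        hits = scan_log(logfile)
        if hits:
            print("%s: %s" % (logfile, ", ".join(hits)))
```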

The reasons why I chose Python to do this were exactly two: the first is that Portage is written in Python, and it was easier to just use Portage’s interface to split package names than to change qatom to do something more useful for sh scripts. The second is that in the future I hope to make it ask me whether I want to file a bug for the package, and have it open the page with the bug form filled in. For now, though, reporting is already quite enough.
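To show what I mean by “Portage’s interface”, this is roughly the call involved; the atom is of course just an example:

```python
# Rough illustration of splitting a category/package-version string through
# Portage's own interface instead of reimplementing the version regex.
import portage

cpv = "app-misc/foo-1.2.3-r1"     # hypothetical package, just for illustration
split = portage.catpkgsplit(cpv)  # None if the string is not a valid CPV
if split is not None:
    category, package, version, revision = split
    print("%s/%s, version %s (%s)" % (category, package, version, revision))
```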

I’m not going to release the code of the script just yet; it’s far from finished, my Python coding skills suck, and it would be a bad DDoS on Bugzilla if lots of people were to use it simultaneously, since for almost each “condition” found in the log file a query is made to Bugzilla to check whether the bug was already reported. The broadest condition I have yet to script in, though; it would hit all the bugs for a package when the build fails. With enough time, I hope to be able to detect the most common compiler errors too, so that I can also identify unreported GCC failures and similar.
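The duplicate check itself is nothing fancy; it looks something like this sketch, where the exact buglist.cgi parameters and CSV column names depend on the Bugzilla installation, and the search terms are made up:

```python
# Sketch of the per-condition duplicate check against Gentoo's Bugzilla.
# The CSV export of buglist.cgi is what stock Bugzilla offers; accepted
# parameters and column names may differ between Bugzilla versions.
import csv
import io
import urllib.parse
import urllib.request

BUGZILLA = "https://bugs.gentoo.org/buglist.cgi"

def existing_bugs(package, terms):
    """Return (bug id, summary) pairs for bugs matching the search terms."""
    query = urllib.parse.urlencode({
        "quicksearch": "%s %s" % (package, terms),
        "ctype": "csv",
    })
    with urllib.request.urlopen("%s?%s" % (BUGZILLA, query)) as response:
        reader = csv.DictReader(io.TextIOWrapper(response, encoding="utf-8"))
        return [(row.get("bug_id"), row.get("short_desc")) for row in reader]

# e.g. existing_bugs("app-misc/foo", "prestripped") before filing a new bug
```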

Unfortunately, I reasonably can’t unleash the tool on the full set of logs produced by the tinderbox just yet, since we’re talking about 28 thousand log files. Among these thousands of log files, some are duplicated, because for instance when asterisk failed, all the asterisk plugins tried to merge it in as a dependency, so my tinderbox tried compiling it multiple times; others are obsolete because newer versions or revisions were added to the tree, and the old ones might have issues the new ones have fixed. For this reason I’ve asked the near-almighty Zac to find me an easy way to prune the log archive down to just one log per slot per package, also removing any cruft left there for other reasons, hopefully reducing the dataset to something much more manageable.
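The rough idea of the pruning is simple enough to sketch, though: keep only the newest log per package. The log file naming below is an assumption on my part, and the sketch ignores SLOTs, which the real pruning would have to respect:

```python
# Sketch of pruning a log directory down to the newest log per package.
# Assumes (purely for illustration) names like "app-misc:foo-1.2.3:20090131.log";
# it also ignores SLOTs, which the real pruning needs to take into account.
import os
import portage

def prune_logs(logdir):
    newest = {}  # (category, package) -> (version-revision, filename)
    for name in os.listdir(logdir):
        if not name.endswith(".log") or ":" not in name:
            continue
        category, pv = name.split(":")[:2]
        split = portage.pkgsplit(pv)  # ('foo', '1.2.3', 'r1') or None
        if split is None:
            continue
        package, version, revision = split
        key = (category, package)
        candidate = "%s-%s" % (version, revision)
        if key in newest and portage.vercmp(candidate, newest[key][0]) <= 0:
            os.remove(os.path.join(logdir, name))                # older, drop it
        else:
            if key in newest:
                os.remove(os.path.join(logdir, newest[key][1]))  # obsolete log
            newest[key] = (candidate, name)
```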

And the work is not done just by analysing these logs. There is also the elog output I have to make some sense of; in particular I have to identify which packages had fetch restriction turned on (and whose files I didn’t have fetched), so that I can get more distfiles for the next run, and which packages used USE-based dependencies and thus require me to enable more USE flags on the packages; finally I have to check for and report all the package conflicts that are not expressed by blockers (although I already said blockers abuse is bad stuff).
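For the elog side I expect something along these lines; the two hint strings below are placeholders, since I haven’t nailed down what the actual messages look like:

```python
# Illustrative scan over saved elog files; the two hint strings are assumed
# wordings, stand-ins for whatever Portage and the ebuilds actually print.
import os

FETCH_HINT = "Fetch failed"   # assumed wording for fetch-restricted packages
USE_HINT = "Change USE:"      # assumed wording for USE-based dependency requests

def classify_elogs(elogdir):
    """Split elog files into 'needs distfiles' and 'needs USE changes' lists."""
    needs_distfiles, needs_use = [], []
    for name in sorted(os.listdir(elogdir)):
        with open(os.path.join(elogdir, name), errors="replace") as elog:
            text = elog.read()
        if FETCH_HINT in text:
            needs_distfiles.append(name)
        if USE_HINT in text:
            needs_use.append(name)
    return needs_distfiles, needs_use
```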

Hopefully in a not too distant future problems like these can be taken care of automatically by an automated tinderbox like Autoua or whatever that’s called. In the meantime I’m going to provide the sweatwork for this, at least. On the other hand, my tinderbox is taking into consideration more than a few things that Portage and Gentoo as a whole don’t care about, like the canonical target problem which I’m trying to push as much as possible upstream. So I guess I’ll be running my tinderbox for quite a long time still.

And of course, I’m accepting endorsements, if you think this is work you can get the fruits of. It would also make it less of a problem for me to get out the money to replace the disk; as I said, one started clicking, and I would have replaced it already if I hadn’t heard about Seagate’s firmware fail. Funnily enough, I had a similar failure with a 160GB Maxtor drive a few years ago (at the time it was a big drive), and so did a few friends of mine, although neither Maxtor nor the retailers ever admitted it was related to firmware failures. The problem was solved with the MaxBlast software, which was Maxtor’s version of the SeaTools that Seagate suggests using now (by the way…). Suggestions on which brand of disks to buy next are welcome; I have now discarded Samsung (as soon as they run a little hotter they break, and I can’t put more fans into Yamato), Maxtor (after the 160GB debacle) and Seagate (after the 7200.11 series).

Comments 5
  1. Hi! I suggest you try out Western Digital RE hard disks. RE stands for RAID Edition. According to the manufacturer they are designed to be more reliable than ordinary desktop hard drives, while the price is around the same. Cheers 🙂

  2. The only other remaining brand is Western Digital, but I’ve had a bad experience with them once. All hard disks suck, but some suck more than others. My suggestion is sticking with Seagate, maybe switching to Barracuda ES or SV35 series, which are supposed to be “Enterprise” drives. Price premium is about 5-10% for the SV35 and 40-50% for the ES, if I’m not mistaken.

  3. I can only recommend WD; most of their 3.5″ products are available in the RE version, which is perfect for 24/7 use, e.g. in RAID and/or servers. My “homeserver” is running on two Caviar Black 1TB RE in RAID 1. My desktop has the system on a 10000rpm 300GB VelociRaptor. Games, media to be cut or converted, and network shares reside on a (non-RE) Caviar Black 1TB. Music, videos and pictures reside on 4 Caviar Green 1TB in a RAID 5. At home I never had a failure of a WD drive, but two Maxtors and one Seagate.

     Since I’m an admin at my school, I can also tell you about our experiences there. From September 2007 to July 2008 about 80% of the HDs used were WD models, 15% Maxtor and 5% old IBM drives. In this period, on the 60 pupil computers, we had 4 HD crashes: 1 was WD, 2 were Maxtor, one IBM. In July 2008 we were tasked to replace one of the two computer rooms. This time we chose to use 100% WD drives and even replaced the Maxtor and IBM ones in the other room. Since then we have not had a single crash.

     I hope I helped you with your decision, and no, I’m not paid by WD.

  4. “The reasons why I chose Python to do this were exactly two: the first is that Portage is written in Python, and it was easier to just use Portage’s interface to split package names than to change qatom to do something more useful for sh scripts.”

     It’s not hard to split package names at all; it’s a simple regex match (on an admittedly baroque regex). Things only become a bit trickier when you want to compare versions.

     “The second is that in the future I hope to make it ask me whether I want to file a bug for the package, and have it open the page with the bug form filled in.”

     Wouldn’t that indicate perhaps using pybugz?

  5. Actually package names aren’t a problem, except where they use ‘:’ as a name/version separator instead of ‘-‘. Otherwise unpacking and repacking is symmetrical.

     Will
