Today I filed a new batch of mass-bugs, this time related to the -Werror compiler flag that I talked about last week. To file the bugs, I once again used the output of the tinderbox run that is executing in the chroot that is (still) called “asneeded” (it started as a way to test how much stuff breaks with --as-needed, but is now a generic tinderbox).
Unfortunately, it’s not a coincidence that I talked about -Werror last week: it is related to the new warnings in gcc 4.3.3 that were added by Gentoo. So a few packages got fixed between then and now, and the result is that I ended up filing at least two duplicates for stuff that has already been fixed.
Now, just so that people don’t assume that I don’t care about the time other developers spend looking at bugs, I’d like to explain why it ended up like this. The tinderbox has been running since last week. With each iteration I check for the packages that have not been installed, that need to be updated, or that haven’t been merged in the past six weeks, so I don’t usually stop and resume it, or it would repeatedly re-compile the same packages (well, not as much as it did the first few times; the merge order of the packages is now random rather than alphabetical). I also don’t like to stop it and resume it from where it left off with a sync in the middle, because that can cause further problems. So at the end of the day I just go without syncing until I need to sync to update my own packages or something.
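As a rough illustration of the selection described above (a hypothetical sketch, not the actual tinderbox script), the three criteria and the random merge order could look like this in Python. The function name, the dictionaries and their layout are all assumptions made for the example; a real tinderbox would get this data from Portage.

```python
import random
from datetime import datetime, timedelta

SIX_WEEKS = timedelta(weeks=6)


def pick_candidates(available, installed, last_merged, now, rng=random):
    """Return the package names to (re)merge, in random order.

    All three inputs are hypothetical stand-ins for Portage data:
      available:   dict name -> newest version in the tree
      installed:   dict name -> installed version
      last_merged: dict name -> datetime of the last merge
    """
    candidates = []
    for name, tree_version in available.items():
        if name not in installed:
            # Never installed in the chroot.
            candidates.append(name)
        elif installed[name] != tree_version:
            # Installed, but the tree has a different version.
            candidates.append(name)
        elif now - last_merged.get(name, datetime.min) > SIX_WEEKS:
            # Up to date, but not rebuilt in the past six weeks.
            candidates.append(name)
    # Random rather than alphabetical order, so an interrupted run
    # does not keep rebuilding the same leading packages.
    rng.shuffle(candidates)
    return candidates
```

Because the list is recomputed from the current state on every iteration, stopping and restarting in the middle simply throws away the progress of that iteration, which is why the run is normally left alone.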
Since I don’t have updated logs for every package, I tried checking all of the merged packages against the synced tree to see if the problem was fixed already. Unfortunately, this leaves me with a blind spot of about a week: all the bugs fixed during that one week I will hit nonetheless, and the result is duplicated bugs.
Yes, of course I could check CVS for each of them, but it would take much more time, to the point where the amount of time I’d have to spend to file a single bug would really be too high for me to accept doing it without an explicit reason. So I usually rely on the collaboration between developers, and hope they accept the occasional duplicate bug. Which usually happens.
Now, what many people have suggested before is that we get the thing much more automated, with bugs being filed automatically, logs being mined automatically, and so on and so forth. My last stab at that actually made me file more duplicates than doing it manually. Sure, it could be possible to fine-tune the reporting further so that it would not find false positives, but as Duncan said, I should do what I’m good at, and sincerely, log analysis software is not my area of expertise, nor would I like it to be.
Sure, there have been quite a few projects related to automated builds and reporting of problems in Gentoo; if I remember correctly, one was among last year’s Google Summer of Code projects. But I haven’t seen many bugs filed through that. I think the ones who filed bugs for mostly-unknown and mostly-unused packages have been Patrick and me, which says that there is something missing here.
Now, please consider that I do have much better things to do than reading through logs manually to check whether -Werror is used for configure-time checks (like in xine and other autotools-based software I work on, just as an example), or if it’s actually used during the build. I also know full well that most of my bugs are pre-emptive and show issues that have not been hit yet, but that’s the whole point. If I report the bug to you before it hits users, I’m doing so at the expense of my own time, to spare you and the users from wasting more time when the problem arises and you don’t know what to do.
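To give an idea of what telling the two cases apart involves, here is a rough heuristic sketched in Python. It is only an illustration under stated assumptions, not the tool used for these bugs: it assumes that configure-time checks appear in the log as lines starting with "checking", and that build-time uses appear on lines containing a compiler command such as gcc or a CHOST-prefixed gcc. The function name and the compiler pattern are made up for the example.

```python
import re

# A bare -Werror, not -Werror=something or -Werror-implicit-...
WERROR = re.compile(r"(?<!\S)-Werror(?![=\w-])")

# Assumed compiler names: gcc, g++, cc, c++, optionally with a
# CHOST prefix or a path, and optionally with a version suffix.
COMPILER = re.compile(r"(^|[-/])(gcc|g\+\+|cc|c\+\+)(-[\d.]+)?$")


def classify_werror(log_lines):
    """Split -Werror mentions into configure-time and build-time hits."""
    configure_hits, build_hits = [], []
    for line in log_lines:
        if not WERROR.search(line):
            continue
        tokens = line.split()
        if tokens[0] == "checking":
            # Output of a configure check, e.g.
            # "checking whether gcc accepts -Werror... yes"
            configure_hits.append(line)
        elif any(COMPILER.search(tok) for tok in tokens):
            # A compiler invocation that carries -Werror.
            build_hits.append(line)
    return configure_hits, build_hits
```

Even a filter like this only narrows down the candidates. Wrapped compilers, quiet build systems and flags injected through variables all escape it, so a human still has to look at what remains.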
So, I beg you, bear with me if I file a few dupes from time to time, okay?
It’s hard to look through all the logs when running a tinderbox. I have started to use the die hooks in Portage and /etc/portage/bashrc for filtering the logs.
How about posting the full build.log (even if there’s no failure) on the bug report? Then we could host a big bug day where volunteering bugsquashers would grep through the logs to find if they are actual issues or just false positives. What do you think?
Rémi, I usually do, unless the issue is (or seems to be) obvious. I skip it for pre-stripped files (I just report the pre-stripped files) and for “one out of these USE flags needs to be set” kinds of bugs, because it rarely seems to be important. On the other hand, I’m a bit concerned about submitting all the build logs each time: they tend to be quite big, and I wonder if I’m slowing down Bugzy.
On an even more general note, is anyone checking the actual structure of ebuilds? I’ve been trying to track down dependencies, which means parsing ebuilds and extracting the relevant strings, and there are a surprising number of ebuilds (> 1000) that don’t seem to conform to what I suspect is the standard. I’m not using bash or python, so these errors, if they are errors, show up at once. They are trivial, except for e.g. DEPEND=$DEPEND, but a nuisance to code around. Has any developer got, or is any developer using, a tool to check ebuild structure?
Will
Hello, I make a living doing preventive maintenance. I like your ideas and where this seems headed.
Can tinderboxing be done in this way for different setups? In essence the idea would be something akin to tinderboxing several, errr, “profiles”, I guess.
Say a VPS with a LAMP install :)
A desktop Xfce setup.
A KDE development system.
I can’t help but hope that this stuff will one day be automated. Checking for breakage etc. before it hits the stable tree and users?
Given enough power and a set of packages to consider, it’s trivial to break down the tinderbox into multiple systems. On the other hand, the one I have here is by choice a take-all tinderbox, since it also allows me to test for collisions and interference between packages, which is something that is often ignored. I also hope it can be automated one day, but for now I don’t see much in the way of that; manual human intervention is really needed.