For A Parallel World. Theory lesson n.2: handling broken ebuilds

Up to now in my series I’ve written about fixing upstream projects and I’ve given hints on how to design a properly parallel-safe build system. I haven’t written anything yet about handling the ebuilds.

While my proposal for replacement of simple makefiles would take care of most minor parallel make issues, it is limited to fixing very broken build systems since for totally non-complex software, parallel make is not an issue at all; most problems happen with complex custom rules. For all the most complex cases, you need to fix the build system appropriately, patch it down and so on.

But before you can get to that you have to take care of handling the ebuild correctly. While it’s certainly not a cool thing for owners of multicore systems to serialise a build, it’s also not good for them to have a package failing, even if it’s during a limited time before the build system is fixed. But if you add -j1 to an emake call, while you review what the problem is, there is a huge chance that the problem will remain there, hidden.

So when you have to deal with such a problem, my suggestions are these:

  • make sure that you check for if the build fails with parallel make; this involves checking it multiple time at multiple levels on a true multicore system; the reason for this is that parallel build issues are race conditions, they might require specific conditions to show up;
  • if you can identify for sure there is a parallel make issue, open a bug for it; even if it’s your package, open a bug for it, it will help you track it down; having a bug for each failure is very important since you need to know that there is a bug to ensure it’s fixed;
  • add -j1 to the ebuild’s emake call; this is a temporary measure, an hack, something that you should never rely on; but having it there will prevent build failures until you can fix the original bug;
  • write a comment referencing the bug you just open where -j1 was added, this will ensure that finding the reason for the non-parallel make will just require a bug lookup rather than searches and searches;
  • when you commit the ebuild, make sure the ChangeLog also references the bug number, make it as noticeable as possible that there is still a bug and you’re just working it around;
  • and the critical part of this: keep the bug open! Some developers hate having bugs open and would rather close everything even when they work around the bug rather than fix it, and wait for upstream or someone else to fix it properly, this is a mistake here: you have to leave the bug open.

When you add -j1 to an ebuild, you’re doing so as a contingency measure, to avoid users complaining that the package does not build; but you’re more than likely to have users complaining that the package does not use parallel make either, and they are right on that, it should. By closing the bug, you’re telling them to “go away” since the package builds, which is not what you should be doing. Instead you should acknowledge that there is a bug, and that it has to be fixed.

So if you find a bug from me about parallel make issue and I changed the ebuild to force -j1, don’t dare closing the bug or I might really get annoyed… if you don’t know how to fix it, just ask me, savvy?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s