Here comes another case study on fixing parallel make issues. This time I’m going to talk about an issue that does not cause the build to abort, but that forces serial make even when parallel make is requested.
If you look closely at the build messages coming out of various packages, you might notice from time to time the warning “jobserver unavailable” coming from make. When that warning is printed, it means that GNU make cannot run the sub-build in parallel, since it has no way to coordinate its jobs with the parent make. For instance, this comes from the build of xfsprogs:
    flame@yamato xfsprogs-2.10.1 % make -j16
    === include ===
    gmake: warning: jobserver unavailable: using -j1. Add `+' to parent make rule.
I have to say that GNU make here is very nice with its messages: it does not simply say that the jobserver is unavailable, it also tells you that it is going to use -j1 and that you should add a plus sign to the “parent make rule”. But I guess most people wouldn’t know how to deal with this. Let’s look deeper.
The build system of xfsprogs is based on autoconf and libtool, but it’s custom made (which by itself caused me quite a few headaches in the past, and which I still loathe). It is also recursive, just like automake-based build systems, but how does it recurse? The main Makefile contains this:
    default: $(CONFIGURE)
    ifeq ($(HAVE_BUILDDEFS), no)
    	$(MAKE) -C . $@
    else
    	$(SUBDIRS_MAKERULE)
    endif
To find SUBDIRS_MAKERULE we have to dig a lot deeper; eventually we find this definition:
    SUBDIRS_MAKERULE = \
    	@for d in $(SUBDIRS) ""; do \
    		if test -d "$$d" -a ! -z "$$d"; then \
    			$(ECHO) === $$d ===; \
    			$(MAKEF) -C $$d $@ || exit $$?; \
    		fi; \
    	done
So it’s serialising the build of the subdirectories; what is the problem here? The problem is that GNU make, to implement parallel builds, needs special options and file descriptors to be passed down to the sub-make calls. This happens automatically when the recipe itself calls $(MAKE), but when the call is hidden behind other variables, as it is here, it does not happen automatically and the developer has to tell GNU make to actually pass the options along.
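The difference between a direct and an indirect call can be sketched with a tiny Makefile. This is only an illustration, not part of xfsprogs: SUBMAKE, the sub/ directory and the target names are made up, and the exact behaviour may vary between GNU make versions.

```makefile
# SUBMAKE and the sub/ directory are hypothetical names for illustration.
SUBMAKE = $(MAKE) -C sub

# The recipe text itself contains $(MAKE), so GNU make treats the line
# as a sub-make call and hands the jobserver over.
direct:
	$(MAKE) -C sub

# The recipe text is only $(SUBMAKE); make does not see $(MAKE) in it,
# so under -jN the sub-make is expected to warn and fall back to -j1.
indirect:
	$(SUBMAKE)

# An explicit + marks the line as a sub-make call again.
marked:
	+$(SUBMAKE)

.PHONY: direct indirect marked
```

With a Makefile in sub/, running something like "make -j4 indirect" should be enough to reproduce the warning, while the other two targets should stay quiet.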
Now the only problem is to identify the rule that you should add + to, but this is very simple: the rule here already has a @ symbol at its start, so just make it @+ and it’ll be done. A very big problem can arise if the rule executes something that is not make together with make (and something more than just test): a line marked with + is also executed under make -n, so those other commands would run even when only a dry run was requested.
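Applied to the definition shown earlier, the change would look roughly like this. It is a sketch based on the snippet quoted in this post, laid out over several lines with continuations for readability; the actual file in xfsprogs may be formatted differently.

```makefile
# Same loop as before; the only change is the + after the @.
# @ keeps the command from being echoed, + marks it as a sub-make call
# so that the jobserver is passed down to $(MAKEF).
SUBDIRS_MAKERULE = \
	@+for d in $(SUBDIRS) ""; do \
		if test -d "$$d" -a ! -z "$$d"; then \
			$(ECHO) === $$d ===; \
			$(MAKEF) -C $$d $@ || exit $$?; \
		fi; \
	done
```

Note that the loop itself still walks the subdirectories one after the other; what changes is that each sub-make is now allowed to run its own jobs in parallel.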
At any rate, after you actually change this rule (as well as the SOURCE_MAKERULE one), xfsprogs can finally build in parallel, taking much less time than it otherwise would. Cool, isn’t it?
This is a nice simple fix, but it retains all the bash-iness embedded in the Makefiles. A patch I had sent earlier is more complex, but does things in a more make-like way. There are still a couple problems with your patch; if you do “make -j16” on a true 16-way box, for starters, $(CONFIGURE) launches in parallel & fails:
and it’s off to the races, and sometimes fails. Does your patch allow concurrent subdirs to build, and if so, does anything enforce an order so that libs are built before the utilities which must link with them?
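For readers wondering what a “more make-like” recursion with ordering can look like, here is a minimal sketch of the usual phony-target pattern. It is not the patch mentioned in this comment, and the directory names are hypothetical.

```makefile
# Hypothetical directory names; libfoo must be built before tools.
SUBDIRS = libfoo libbar tools

.PHONY: default $(SUBDIRS)
default: $(SUBDIRS)

# One target per directory, so make itself can schedule them in parallel.
# $(MAKE) appears directly in the recipe, so no + is needed here.
$(SUBDIRS):
	$(MAKE) -C $@

# Ordering is expressed as ordinary prerequisites: the libraries are
# independent of each other, but tools waits for both.
tools: libfoo libbar
```

With this layout, "make -j16" can build libfoo and libbar concurrently and only then descend into tools.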
Better would be to use autotools directly and replace such hackery with another less ugly…
The patch was intended to be a simple fix, rather than a rework 😉 I leave that to those who want to work on a project; myself, I just wanted xfsprogs not to stay serial when it could be building in parallel at least up to a point, which is more or less what the rest of this series tries to do :) As for the configure stage, I sincerely didn’t even consider it myself. I consider it a prerequisite before running make, since this is what autoconf got me used to, and I usually need to pass further different options rather than just accepting the defaults. As Luca said, using plain old autotools might have been nicer too.
Sure, the “+” trick was a lot of gain for not much work. 🙂 Don’t take that as a complaint. :) Once I got the full parallelization working, it did speed it up by another factor of 2. I’ll leave autotools to those who grok such things. I did consider taking the configure phase out, it is kind of a nasty hack, but in the end I found details on the right sorts of rules to make it fire off cleanly. I think that’s the patch that’ll end up in xfsprogs.
Oh, and FWIW, I did acl/attr/dmapi/xfsdump too while I was at it. Hopefully SGI will merge them soon.
Hi Diego, what is <redpre#9>? I searched for it, and it looks like an HTML tag. I tried putting it in an HTML file, but nothing shows up when I view it in a browser.
It’s Typo’s metacode; it shouldn’t creep into the final pages. If you found it on my site, I have to fix something.
Hi Diego, I also see the <redpre#9> tag on your page, and this leaves me clueless as to the solution you have proposed. Can you please send it to my email address? I am struggling with this jobserver warning message. Thanks, Rajesh.
Ah, there you go, it should work now!
It appears that whatever you did to fix the appearance of `@’ has morphed into “@” (unless if I’m mistaken? 😉 )
Fantastic…. thanks for the info you saved me a lot of time.
Parallel + recursive make is a joke. Parallel is mostly useful with monolith Make