For A Parallel World. Theory lesson n.3: directory dependencies

Since this is not extremely common knowledge, I wanted to write down some more notes on the problem that Daniel Robbins reported in Ruby 1.9, which involves parallel make install.

This is actually a variant of a generic parallel install failure: lots of packages in the past assumed that make install would be executed on a live filesystem, and did not create the directories they copied their files into. This of course fails for any staging-tree install (DESTDIR-based install), which is what all distributions use to build packages, and what Gentoo uses to merge from ebuilds. With time, and with distributions taking a major role, most projects were updated so that they do create their directories before installing (although quite a few still fail at this; just look for dodir calls in the ebuilds).

The problem we have here is slightly different: if you have a single install target that depends both on the rules that create the directories and on those that install the files, nothing specifies the dependency between the two:

install: install-dirs install-bins

install-dirs:
        mkdir -p /usr/bin

install-bins: mybin
        install mybin /usr/bin/mybin

(Read it as if it used DESTDIR properly.) With serial make, the prerequisites are processed in the order they appear in the dependency list, so the directories are created before the binaries are installed, with no problem. With parallel make, instead, the two rules run in parallel and the install command may be executed before mkdir, which makes the install fail.
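To make the failure concrete without relying on make's scheduling, here is a minimal shell sketch of what happens when the install command wins the race. The staging directory (standing in for DESTDIR) and the dummy mybin script are made up for the illustration.

```shell
#!/bin/sh
# Sketch: install(1) into a staging tree before and after the directory exists.
# "stage" plays the role of DESTDIR; "mybin" is a dummy file for illustration.
stage=$(mktemp -d)
src=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$src/mybin"

# What happens if install-bins runs before install-dirs:
if install -m 755 "$src/mybin" "$stage/usr/bin/mybin" 2>/dev/null; then
    early=ok
else
    early=failed        # the target directory does not exist yet
fi

# What happens in the intended order:
mkdir -p "$stage/usr/bin"
if install -m 755 "$src/mybin" "$stage/usr/bin/mybin"; then
    late=ok
else
    late=failed
fi

echo "install before mkdir: $early"
echo "install after mkdir:  $late"
rm -rf "$stage" "$src"
```

With parallel make, which of the two orders you get depends on how the jobs happen to be scheduled, which is why the failure tends to show up only some of the time.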

The “quick” solution that many come to is to depend on the directory:

install: /usr/bin/mybin

/usr/bin:
        mkdir -p /usr/bin

/usr/bin/mybin: mybin /usr/bin
        install mybin /usr/bin/mybin

This is the same solution that Daniel came to; unfortunately it does not work properly. The problem is that this dependency does not just ensure that the directory exists: it also adds a condition on the modification timestamp (mtime) of the directory itself. And since the directory's mtime is updated whenever an entry is created, removed or renamed inside it, this can become a problem:

flame@yamato foo % mkdir foo   
flame@yamato foo % stat -c '%Y' foo
1249082013
flame@yamato foo % touch foo/bar
flame@yamato foo % stat -c '%Y' foo
1249082018

This does seem to work in most cases, and indeed a similar patch was already added to Ruby 1.9 in Portage (and I'm going to remove it as soon as I have time). Unfortunately, if multiple files get installed in a similar way, it's possible to induce a loop inside make (installing the later binaries updates the mtime of the directory, which then has a higher mtime than the first binary installed).
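To see exactly which operations move the directory's mtime, here is a small sketch in the same spirit as the session above. It assumes GNU stat, and the directory and file names are placeholders; the directory is backdated with touch so that the comparison does not depend on timestamp resolution.

```shell
#!/bin/sh
# Sketch: which operations move a directory's mtime.
# Assumes GNU stat (-c '%Y'), as in the session above; "bin", "first" and
# "second" are placeholder names.
work=$(mktemp -d)
mkdir "$work/bin"
echo one > "$work/bin/first"            # the first "installed" file

touch -t 200001010000 "$work/bin"       # backdate the directory
before=$(stat -c '%Y' "$work/bin")

echo more >> "$work/bin/first"          # change the content of an existing file
after_write=$(stat -c '%Y' "$work/bin")

echo two > "$work/bin/second"           # add a new entry to the directory
after_create=$(stat -c '%Y' "$work/bin")

[ "$after_write" -eq "$before" ] && echo "content change: directory mtime untouched"
[ "$after_create" -gt "$before" ] && echo "new entry: directory mtime updated"
rm -rf "$work"
```

Installing a second file is the "new entry" case: it bumps the directory's mtime after the first file has already been written, so the directory ends up at least as new as a target that depends on it.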

There are two ways to solve this problem; neither looks extremely clean, and neither is perfectly optimal, but they do work. The first is to always call mkdir before installing the file; this might sound like overkill, but with mkdir -p it really has a small overhead compared to calling it just once.

install: /usr/bin/mybin

/usr/bin/mybin: mybin
        mkdir -p $(dir $@)
        install mybin /usr/bin/mybin
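As a rough check that repeating mkdir -p in every install rule is harmless, the sketch below starts several background jobs that each create the same directory and then write one file into it, much like parallel install rules would. The staging directory and the file names are invented for the example, and it assumes an mkdir -p that tolerates the directory appearing concurrently (GNU coreutils does).

```shell
#!/bin/sh
# Sketch: several "install" jobs each run mkdir -p on the same directory.
# "stage" stands in for DESTDIR; the file names are placeholders.
stage=$(mktemp -d)
for name in one two three four; do
    (
        mkdir -p "$stage/usr/bin" &&
        echo "$name" > "$stage/usr/bin/$name"
    ) &
done
wait                                    # let all background jobs finish

# Count the files that made it into the directory.
set -- "$stage"/usr/bin/*
installed=$#
echo "files installed: $installed"
rm -rf "$stage"
```

Every job should find the directory in place by the time it writes its file, no matter which one created it first.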

The second is to depend on a special time-stamped rule that creates the directories:

install: /usr/bin/mybin

usr-bin-ts:
        mkdir -p /usr/bin
        touch $@

/usr/bin/mybin: mybin usr-bin-ts
        install mybin /usr/bin/mybin

Now, for Ruby I'd sincerely go with the former option rather than the latter, because the latter adds a lot more complexity for quite little advantage (it adds a serialisation point, while the mkdir -p calls execute in parallel). Does this help you?
