Bigger, better tinderbox

Well, not in the hardware sense, not yet at least, even though it’d be wicked to have an even faster box here (with some due control of course, but I’ll get back to that later). I’ll probably get some more memory and AMD Istanbul CPUs when I have some cash surplus, which might not be soon.

Thanks to Zac, and his irreplaceable help, Portage gained a few new features that made my life as “the tinderbox runner” much easier: collisions are now saved in the build/merge log, so I can grep for them as well as for the failures; the die hook is now smarter, working even for failures coming from the Python side of Portage (like collisions), and it’s accompanied by a success hook. The two hooks are what I’m using for posting the whole progress of the tinderbox (so you can follow that account if you feel like being “spammed” by the proceedings of the tinderbox; the tags allow a quick look at how things are going on average).

But it’s not just that; if you remember me looking for a run control for the tinderbox, I’ve implemented one of the features I talked about in that post even without any fancy, complex application: when a merge fails, the die hook masks the failed package (the exact revision), and this has some very useful domino effects. The first is that the same exact package version can only ever fail once in the same tinderbox run (I cannot count the times my tinderbox wasted time rebuilding stuff like mplayer, asterisk or boost and failing, as they are dependencies of other packages), and that’s what I was planning for; what I got instead is even more interesting.

While the tinderbox already runs in “keep going mode” (which means that a failed, optional build will not cause the whole request to be dropped, and applies mostly to package updates), by masking specific, failing revisions of some packages, it also happens to force downgrades, or stop updates, of the involved packages, which means that more code is getting tested (and sometimes it gets luckier as older versions build where newer don’t). Of course the masking does not happen when the failure is in the tests, as those are quite messed up and warrant a post by themselves.

Unfortunately I’m now wondering how taxing the whole tinderbox process is getting: in the tree there are just shy of 14 thousand packages. Of these, some will merge in about three minutes (this is back-to-back from the call to emerge to the end of the process; I found nothing going faster than that), and some rare ones, like Berkeley DB 4.8, will take over a day to complete their tests (db-4.8 took 25 hours, no kidding). Accepting an average of half an hour per package, this brings us to 7 thousand hours, about 300 days, almost a year. Given that the tinderbox is currently set to re-merge the same package on a 10-week schedule, this definitely gets problematic. I sincerely hope the average is more like 10 minutes, even though that would still take more than the 10 weeks available, which means an infinite rebuild. I’ll probably have to find the real average by looking through the total emerge log, and at the same time I’ll probably have to reduce the rebuild frequency.
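To make the arithmetic above easier to replay with other averages, here is a minimal sketch in Python. The package count, the two averages and the 10-week schedule come from the paragraph above; the helper name is made up for the illustration and is not part of the tinderbox scripts.

```python
# Back-of-the-envelope estimate of one full tinderbox pass.
# PACKAGES and SCHEDULE_DAYS are the figures quoted in the post;
# full_pass_days() is a hypothetical helper for this sketch only.

PACKAGES = 14_000        # "just shy of 14 thousand packages"
SCHEDULE_DAYS = 10 * 7   # the 10-week re-merge schedule


def full_pass_days(packages, avg_minutes):
    """Days needed to merge every package once, back to back."""
    return packages * avg_minutes / 60 / 24


def max_average_minutes(packages, schedule_days):
    """Largest per-package average that still fits the schedule."""
    return schedule_days * 24 * 60 / packages


if __name__ == "__main__":
    for avg in (30, 10):
        days = full_pass_days(PACKAGES, avg)
        verdict = "fits" if days <= SCHEDULE_DAYS else "does not fit"
        print(f"{avg} min/package: {days:.0f} days, {verdict} in {SCHEDULE_DAYS} days")
    print(f"break-even average: {max_average_minutes(PACKAGES, SCHEDULE_DAYS):.1f} minutes")
```

With these numbers, half an hour per package gives roughly 292 days and ten minutes gives roughly 97 days; the average would have to drop to about 7 minutes for a pass to fit in 10 weeks.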

Again, the main problem is with parallel make: I’d expect the load of the system to be pretty high, while in practice it always stays under 3. Quite a few build systems, including those of Haskell extensions, Python’s setuptools, and similar, do not seem to support parallel builds (in the case of setuptools, it seems that it calls make directly, thus ignoring Gentoo’s emake wrapper), and quite a few packages force serial make (-j1) anyway.

And a note here: you cannot be sure that calling emake will give you parallel make; besides the already-discussed “jobserver unavailable” problem, there is the .NOTPARALLEL directive that instructs GNU make not to build in parallel even though the user asked for -j14. I guess this is going to be one further thing to look for when I start with the idea of distributed static analysis.
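As a taste of what such a source-level check could look like, here is a rough sketch in Python that walks an unpacked source tree and reports the makefiles declaring .NOTPARALLEL. The function name and the list of file names are assumptions of mine, and the pattern only catches the plain form of the special target at the start of a line; it does not try to evaluate includes or conditionals.

```python
import re
from pathlib import Path

# The special target as it normally appears: ".NOTPARALLEL:" at line start.
NOTPARALLEL = re.compile(r"^\.NOTPARALLEL\s*:", re.MULTILINE)

# File names worth looking at; this list is an assumption, extend as needed.
MAKEFILE_NAMES = {"Makefile", "makefile", "GNUmakefile", "Makefile.in", "Makefile.am"}


def find_notparallel(source_root):
    """Return the makefiles under source_root that declare .NOTPARALLEL."""
    hits = []
    for path in sorted(Path(source_root).rglob("*")):
        if path.name in MAKEFILE_NAMES and path.is_file():
            text = path.read_text(errors="replace")
            if NOTPARALLEL.search(text):
                hits.append(path)
    return hits


if __name__ == "__main__":
    import sys

    for makefile in find_notparallel(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(makefile)
```

Since it only reads files, this is the kind of check that needs no build at all, which is what makes it a candidate for the static analysis idea.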

Opening up the tinderbox

As I said before, the tinderbox is hardly parallelisable but, on the other hand, it can yield much better results if multiple instances are executed independently by more people. Of course, this also requires that the executions are somewhat coordinated so that they don’t execute the exact same process over and over, but rather some slight variation (different architecture, compiler, flags, basic USE settings, etc.).

Now, while Mark has been working on setting up a tinderbox for PPC64, I wanted to publish the scripts that I’ve been using all this time; I did so a few weeks ago by posting them, but since then more problems and more solutions came up. So today, thanks to Tomas (why does the roll call show the “normalised” name? I’m pretty sure his name is not just ASCII), I started publishing the scripts in a public git repository which both other developers and interested users can use to improve, simplify and extend the tests.

If you look at the scripts and compare them with the old versions, you can see that I have made a few important changes, the first of which would be the presence of bti in them. Yes, I’m currently denting away the tinderbox results so that you all can follow them. This also gives a bit of insight of how the tinderbox works even to those who don’t want to look into the dirty details of the code.

The rest of the changes are largely thanks to Zac: the first is that the merge operations are now running with --update --selective=n so that all the dependencies are considered as soon as possible; this solves some nasty deadlock cases, like gvim, vim and vim-core dependencies rolling around to the point of being rejected by Portage. Unfortunately, this also calls for having a way to get some packages out of the build loop; and I don’t think that a complex solution like Gearman is what I should be looking for now.

The other change is still incomplete for now as I wait for bug #295715 to be released: when a package fails to be merged (right now only if the ebuild fails; once complete even if it fails because of collisions) it gets masked in a temporary file that is cleaned up at the next restart of the round. This way when a dependency fails all the packages that depend on it will automatically be rejected (or will keep using the old merged version if present, or fall back to an older version if that works). This helps reduce the time wasted trying and re-trying the same package over and over again.
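The real hook lives in the tinderbox’s shell scripts, so what follows is only a sketch of the bookkeeping in Python, under a few assumptions: the mask file path is whatever the caller passes in, the two function names are invented for the example, and the category and versioned package name are expected in the form Portage exposes to ebuilds as CATEGORY and PF.

```python
from pathlib import Path


def mask_failed(mask_file, category, pf):
    """Append an exact-version mask ("=category/name-version") once.

    category and pf mirror the CATEGORY and PF values of the failed
    ebuild; mask_file is a hypothetical temporary package.mask file.
    """
    atom = f"={category}/{pf}"
    path = Path(mask_file)
    existing = path.read_text().splitlines() if path.exists() else []
    if atom not in existing:
        with path.open("a") as handle:
            handle.write(atom + "\n")
    return atom


def clear_masks(mask_file):
    """Drop every temporary mask, as done when the round restarts."""
    Path(mask_file).write_text("")
```

Masking the exact revision (the leading = in the atom) rather than the whole package is what lets Portage fall back to an older version or keep the one already merged.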

I also dropped the test for AC_CANONICAL_TARGET since that produces way too much noise, and it’s rather something that could be made to work with the static analysis idea that I got. With that, it’d be also easier to check for bashisms and other issues without adding noise to an already full log as those of the tinderbox are.

One very heavy check ensures that binchecks-restricted packages are not installing ELF files; the original idea for that restriction was to avoid running a number of ELF checks and mangling over non-ELF packages, such as kernel sources, fonts and similar. Running them is quite an issue when using virtual systems (where I/O has a nasty overhead) and is pointless for packages that we can be sure will not install executables; unfortunately a few developers seem to think that the restriction is a shortcut to avoid dealing with the ELF QA checks, instead of filling in the boring bits that tell Portage to expect QA failures.
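The check in the tinderbox is part of its shell hooks; purely as an illustration of why it is heavy, here is a sketch in Python that has to open every single installed file to look at its first four bytes. The function name is mine, and the directory passed in stands for the package’s install image.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def find_elf_files(image_dir):
    """Return the regular files under image_dir that start with the ELF magic."""
    found = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks and anything that is not a plain file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                if handle.read(4) == ELF_MAGIC:
                    found.append(path)
    return sorted(found)
```

A package that restricts binchecks and still makes this return a non-empty list is one of the cases worth a bug.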

To reduce the chance of something breaking further down the road due to the removal of .la files, I’ve also made sure lafilefixer is executed on each and every package.

And finally I’ve created a “restart” script that deals with the long procedure of restarting the tinderbox: it syncs, checks if gcc has changed, and if so makes sure that the as-needed version is selected. Right now it also deals with ghc updates; in the future I hope to be able to handle all those kinds of updates together. The problem there is that I don’t think the script works that well when something fails, as it’s mostly untested for now; and the updater scripts often don’t support the --keep-going option, which is exactly what I’d like to use to avoid the domino effect.

In the next days I’ll try to write some more details about the things I end up checking along the way; they may be of help to others who want to run their own tinderbox.

Why the tinderbox is a non-distributed effort

In my previous post about a run control for the tinderbox, Pavel suggested that I use Gearman to handle the run control section; as I explained to him earlier this evening, this is probably too much right now. The basic run control I need is pretty simple (I could even keep using xargs, if Portage gave me hooks for merge succeeded/failed); the fancy stuff, the network interface, is sugar that Gearman wouldn’t help me with as far as I can see.

On the other hand, what I want to talk about is why I don’t think the tinderbox should be a distributed effort, as many people suggest from time to time to reduce the load on my machine. Unfortunately, to work well with distributed methods, the task has to be feasible as a “divide et impera” kind of task.

The whole point of the tinderbox for me is to verify the interactions between packages; it’s meant to find which packages break when they are used together, among other things, and that kind of thing needs all the packages to be present at the same time, which precludes the use of a distributed tinderbox.

If you don’t think this is helpful, I can tell you quite a bit of interesting things about automagic deps but since I already wrote about them from time to time I’ll skip over it for now.

The kind of effort that can work with the distributed approach is the one taken by Patrick with his cleaning-up tinderboxes: after each merge the dependencies get removed, and a library of binary packages is kept up to date to avoid building them multiple times a day. This obviously makes it possible to test multiple package chains at once on multiple systems, but it also adds some further overhead (as multiple boxes will have to rebuild the same binary packages if you don’t share them around).

On the other hand, I think I got a use for Gearman (an ebuild for which, mostly contributed by Pavel, is in my overlay; we’re working on polishing it): I already mused some time ago about checking the packages’ sources, looking for at least those things that can be found easily via scripts (like the over-canonicalisation that I already documented well). This is a task where divide-et-impera is very likely a working strategy. Extracting and analysing the sources is an I/O-bound task, not a CPU-bound task, so Yamato’s approach there is definitely a losing one.

For a single box to have enough I/O speed to handle so many packages, you end up resorting to very high-end hardware (disks and controllers), which is very expensive. Way too expensive. On the other hand, with multiple boxes, even cheap or virtual ones (distributed among different real boxes of course), working independently but sharing their queue with proper coordination, you can probably beat that performance for less than half the price. Now, for this to work there are many prerequisites, a lot of which I’m afraid I won’t be able to tackle anytime soon.

First of all, I need to understand well how Gearman works, since I have only skimmed through it up to now. Then I need to find the hardware; if I can change my garage into a machine room, and connect it to my network, that might be a good place to start (I can easily use low-power old-style machines; I still have a few around that I haven’t found space for lately). I remember some users offering chroots on their boxes before; this might turn out pretty useful: if they can make virtual machines, or containers, they can also work on the analysis, in a totally distributed fashion.

The third problem is somewhat the hardest but the most interesting: finding more analyses to run on the sources, without building stuff. Thankfully, I have got the book (Secure Programming with Static Analysis) to help me cope with that task.

Wish me luck, and especially wish me to find time to work on this.

Needing a run control

You might not be familiar with the term “run control” even though you use openrc; guess what the rc stands for?

This post might not be of any interest to you; it’s going to delineate some requirements for my tinderbox to be expanded and streamlined. So if you don’t care to help me or know not of development, you can skip it and wait for the next one tomorrow.

As the title leaves you to guess, I’m looking for a better way to handle the execution of the tinderbox itself. Right now, as I showed you, I’m just using a simple xargs call that takes care of launching a number of emerge requests to build the list of packages one by one. Unfortunately this approach has quite a few problems, the first of which is that I have no way to check if the tinderbox is proceeding properly without connecting to its console, which is quite taxing to do, especially when I’m not at home.

I could write a shell script to handle that; but on the other hand I’d rather have something slightly more sophisticated, and more powerful. Unfortunately because the whole design of the tinderbox relies so heavily on the Portage internals, I guess the language of choice for something like this should probably be Python, so, for instance it can call the tinderbox script via function call rather than forking.

What I’d be looking for, right now, would be a daemon: have this daemon run in the background, started automatically by the init system inside the container, with two ports open for remote control: one with the command console that allows for starting and suspending the execution, or aborting the current run, and one for the logging console that shows what emerge is doing. The whole thing would look a lot like the ssh/screen/xargs execution I’m doing right now, just less locally-tied. Bonus points if the whole system only allows SSL connections using known client certificates.

My reason for wanting a run control separate from plain ssh to the console is to eventually allow other developers to access the tinderbox, for instance to prioritize one particular package over another or, if needed, to change the settings (flags and similar) for a particular execution. In case the client authentication is too much to implement, this could probably be easily solved by creating a script using nc6 to talk to the console, and using that as a shell, leaving access through SSH (with non-root accounts).

Another reason for this is to better handle the cascading failure of dependencies. If I’m going to merge package C that requires A and B, but B fails, I should be masking B for the duration of the current run. This way when I’m asked to install D, which also requires B, it’ll be skipped over (instead of insisting on rebuilding the same package over and over). At the end of the run (which would mean at the next --sync), those masks can be dropped entirely.

This means that at the end of an emerge request, you need to find if it completed fine or not, and if not, on which package it failed.
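The A, B, C, D example can be simulated in a few lines of Python. This is a toy model only: the dependency map and the set of failing packages are handed in by the caller, where a real run control would have to get them from Portage, and only direct dependencies are considered. The function name is hypothetical.

```python
def plan_run(requests, deps, failing):
    """Simulate one run with per-run masking of failed packages.

    requests: package names, in the order they would be merged
    deps:     mapping of package -> list of direct dependencies
    failing:  packages whose build fails (stand-in for a real merge)
    Returns (merged, failed, skipped) as three lists of requests.
    """
    masked = set()
    merged, failed, skipped = [], [], []
    for pkg in requests:
        needed = [pkg] + list(deps.get(pkg, []))
        if any(name in masked for name in needed):
            # Something it needs already failed in this run: don't retry.
            skipped.append(pkg)
            continue
        broken = [name for name in needed if name in failing]
        if broken:
            masked.update(broken)  # masked until the end of the run
            failed.append(pkg)
        else:
            merged.append(pkg)
    return merged, failed, skipped


if __name__ == "__main__":
    deps = {"C": ["A", "B"], "D": ["B"], "E": ["A"]}
    print(plan_run(["C", "D", "E"], deps, failing={"B"}))
```

Here C fails because of B, D is skipped without B being rebuilt a second time, and E, which only needs A, still gets merged.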

Other features for the run control would sprout once this basis is done, so if anybody is interested in helping out with the tinderbox and wants to start with writing this type of code, it’s definitely welcome.

What the tinderbox is doing for you

Since I’ve asked for some help with the tinderbox also with some pretty expensive hardware (the registered ECC RAM), I should probably also explain well what my tinderbox is doing for the final users. Some of the stuff is what any tinderbox would do, but as I wrote before, a tinderbox alone is not enough since it only ties into a specific target, and there are more cases to test.

The basic idea of the tinderbox is that it’s better if I, as a developer, file bugs for the packages, rather than users: not only do I know how to file the bug, and often how to fix it as well, but there is also the chance that the issue will be fixed without any user hitting it (and thus being turned away by the failure).

So let’s see what my effort is targeted to. First of all, the original idea from which this started was to assess the --as-needed problem to have some data about the related problems. From that idea the things started expanding: while I am still running the --as-needed tests (and this is pretty important because there is new stuff that gets added that fails to build with --as-needed!), I also started seeing failures with new versions of GCC, and new versions of autotools, and after a while this started being a full-blown QA project for me.

While, for instance, Patrick’s tinderbox focuses on the stable branch of Gentoo, I’m actually focusing on unstable and “future” branches. This means that it runs over the full ~arch tree (which is much bigger than the stable one), and sometimes even uses masked packages, such as GCC, or autoconf. This way we can assess whether unmasking a toolchain package is viable or if there are too many things failing to work with it still.

But checking for these things also became just the preamble for more bugs to come: since I’m already building everything, it’s easy to find any other kind of failure: kernel modules not working with the latest version, minimum-flag combinations failing, and probably most importantly parallel make failures (since the tinderbox builds in parallel). And from there to adding more QA safety checks, the step was very short.

Nowadays, among the other things, I end up filing bugs for:

  • fetch failures: since the tinderbox has to download all the packages, it has found more than a couple packages that failed to fetch entirely — while the mirroring system should already report failed fetches for mirrored packages, those with mirror restrictions don’t run through that;
  • the --as-needed failures: you probably know already why I think this flag is a godsend, and yet not all developers use it to test their packages when they add it to portage;
  • parallel make failures, and parallel make misses: not only does the tinderbox run with parallel make enabled, and thus hit the parallel make failures (which are to be worked around to avoid users hitting them, keeping a bug open to track them), but thanks to Kevin Pyle it also checks for direct make calls, which is a double win, as stuff that would otherwise not use parallel make does so in the tinderbox, and I can fix the packages to use either emake for all or emake -j1, and file a bug;
  • new (masked) toolchains: GCC 4.4, autoconf 2.65, and so on: building with them when they are still masked helps identifying the problems way before users might stumble into them;
  • failing testsuites: this was added together with GCC 4.4, since the new strict-aliasing system caused quite a few packages to fail at runtime rather than at build time; while this does not make the coverage perfect, it’s also very important to identify --as-needed failures for libraries not using --no-undefined;
  • invalid directories: some packages install documentation and man pages out of place; others still refer to the deprecated /usr/X11R6 path; since I’m installing all the packages fully, it’s easy to check for such cases; while these are often petty problems, the same check identifies misplaced Perl modules which is a prerequisite for improving the Perl handling in Gentoo;
  • pre-stripped files in packages: some packages even when compiling source code tend to strip files before installing them; this is bad when you need to track down a crash because just setting -ggdb in your flags is not going to be enough;
  • maintainer-mode rebuilds: when a package causes maintainer-mode rebuild, it often executes configure twice, which is pretty bad (sometimes it takes more to run that script than to build the package); most users won’t care to file bugs about them, since they probably wouldn’t know what they are either, while I can usually do that in batch;
  • file collisions, and unexpected automagic dependencies are also tested; while Patrick’s tinderbox cleans the root at every merge, like binary distributions do in their build servers, and finds missing mandatory dependencies, my approach is the same applied by Gentoo users: install everything together; this helps finding file collisions between packages as well as finding some corner cases where packages fail only if a package is installed.

There are more cases for which I file bugs, and you can even see from the tinderbox that quite a few packages were dropped entirely as unfixable. The whole point of it is to make sure that what we package and deliver to users builds and especially builds as intended, respecting policies and not feigning to work.

Now that you know what it’s used for… please help.

More tinderbox notes, just to say

To complete the topic I started with the previous post I would like to give you some more notes about the way the tinderbox works, and in particular about the manual fiddling I have to do on it to make sure that it works smoothly, and the issues that haven’t been tackled yet.

As I said, the tinderbox is a Linux Container; this helps isolate the test environment from my workstation: killall and other unbound arguments are never going to hit my system, which is good. On the other hand, this still has a couple of rough patches to go through. Even with the latest (unreleased) version of lxc, /dev is either statically created or bound: udev does not work properly inside the container, for somewhat obvious reasons. The problem with that is that if you bind the /dev directory (or mount devtmpfs, which is basically the same thing with a recent kernel), then you’ll have only one directory where FIFOs and sockets are created.

This not only causes sysvinit to shut down the host instead of the container if you use the shutdown command, but also makes it impossible to have a running, working syslog from within the container. This shouldn’t hinder the tinderbox’s work, but it seems like it does.

Another problem is with something all users have to fight with every time: incompatible language updates: Python, Perl, OCaml, Haskell, you name it. Almost all of these languages come with an “updater” script that is intended to rebuild their extensions and libraries to make sure that they are again compatible with the new release; failing to run these scripts will introduce quite a few failure cases within the tinderbox that, obviously, will be spurious. The same goes for lafilefixer. I’ll probably have to write a maintenance script to improve the flow of that, so I don’t forget steps along the way.

Yamato’s hardware is designed to work in parallel (as Mike also found out, it seems like the sweet spot for builds is the number of cores times two); so another problem that adds to the tinderbox is that it merges everything sequentially: making that parallel is quite hard because of the interdependencies of packages. So to speed stuff up, the build process itself has to be parallel-safe, which you probably know it often is not, and which is one of the reasons why I often fix packages around.

One pretty bad example of time wasted because of serial-make runs is boost: almost four hours yesterday for the merge, because the tests are built and executed in series: instead of building all the test binaries and then executing them in series (which is a good compromise if you cannot run them in parallel), it goes on building and testing; the result is obviously pretty bad on my system.

Quite a few times, by the way, the whole situation is exacerbated by the fact that the build failures were already reported, oftentimes by me, last year. Yep, we got year-old build failures in tree that hit users. And guess what? At least a couple of times the proposed solution was “use an overlay”. No, the right solution is not to let software bitrot in the tree!

Anyway, thanks to Genone, who sent me the patch to have better collision diagnostics, and thanks to Mauro, who’s working on new bashrcng plugins for the QA tests. Hopefully, some of the tests will also find their way into Portage soon; and again, I’ll suggest you consider the idea of contributing somehow (if you cannot contribute code or fixes): it might not be extremely difficult to deal with the tinderbox, but it sure is time-consuming, and time, well, is money…

Tinderbox: explaining some works

Many people asked before to explain better how my tinderbox works so that they can propose changes and help. Well, let me try to explain more how the thing is working so I can actually get some suggestions, as right now I’m a bit stuck.

Right now the tinderbox is a Linux Container that runs in Yamato; since Linux Containers are pretty lightweight, that means it has all the power of Yamato (which is actually my workstation, but it’s an 8-way Opteron system, with 16GB of Registered ECC RAM and a couple of terabytes of disk space).

Processing power, memory and disk space are shared between the two, so you can guess that while I’m working with power-hungry software (like a virtual machine running a world rebuild for my job), the tinderbox is usually stopped or throttled down. On the other hand this makes it less of a problem to run, since Yamato is almost always running anyway. If I had to keep another box running just for the tinderbox, the cost in electrical power would probably be too high for me to sustain for a long time. The distfiles are also shared in a single tree (among the virtual machines, other containers and chroots as well), which makes it very lightweight for Yamato to run in the background.

Since it’s an isolated container, I access the tinderbox through SSH; from there I launch screen and then I start the tinderbox work; yes, it’s all manual for now. The main script is the one Zac wrote for me some time ago; it lists all the packages that haven’t been merged in the given time limit (6 weeks), or that have been bumped since the last time they were merged. It also reports on standard error any USE-based dependencies that wouldn’t be satisfied with the current configuration (since the dependencies are brought in automatically, I just need to make sure the package.use file is set properly).

The output of this script is sorted by name and category; unfortunately I noticed that doing so would isolate too many of the packages at the bottom of the list, so to make it more useful, I sort it at random before saving it to a file. That file is then passed as an argument to two xargs calls: the tinderbox itself, and the fetcher. The tinderbox itself uses this command: xargs --arg-file list -n1 emerge -1D --keep-going which means that emerge tries to install each listed package with its dependencies brought in, and if some new dependency fails to build (but the old one is present) the failure is ignored.
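For those who read Python more easily than xargs invocations, the same loop can be sketched as below. This is not the script I run, just an equivalent illustration: the two function names are made up, the default command is the emerge call quoted above, and unlike xargs the sketch keeps going whatever the exit status is.

```python
import random
import subprocess


def shuffled(packages, seed=None):
    """Return a randomly ordered copy of the package list."""
    result = list(packages)
    random.Random(seed).shuffle(result)
    return result


def run_one_by_one(packages, command=("emerge", "-1D", "--keep-going")):
    """Run the command once per package (like xargs -n1), collecting exit codes."""
    results = {}
    for pkg in packages:
        # One process per package, so a failure only affects that request.
        results[pkg] = subprocess.run([*command, pkg]).returncode
    return results


if __name__ == "__main__":
    # "list" is the file name used in the xargs command above.
    with open("list") as handle:
        todo = shuffled(line.strip() for line in handle if line.strip())
    for name, code in run_one_by_one(todo).items():
        print(name, "ok" if code == 0 else f"failed ({code})")
```

Having the exit code per package at hand is about as far as this gets; knowing which dependency actually failed inside a request still needs help from Portage.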

Now that you’ve seen how the tinderbox is executed you can see why I have a separate fetcher: if I were to download all the sources inline with the tinderbox (which is what I did a long time ago) I would end up having to wait for the download to complete before it would start the build, and thus add a network-bound latency to the whole job which is already long enough. So the fetcher runs this: xargs --arg-file list emerge -fO --keep-going which runs a single emerge to fetch all the packages. I didn’t use multiple calls here because the locks on vdb would make the whole thing definitely slower than what it is now; thanks to --keep-going it doesn’t stop when one package is unfetchable at least.

Silly note here: I noticed tonight while looking at the output that sometimes it took more time to resolve the same host name than fetching some smaller source file (since Portage does not yet reuse the connection as it’s definitely non-trivial to implement — if somebody knows of some kind of download manager that keeps itself in the background to reuse connections without using proxies I’d be interested!). The problem was I forgot to start nscd inside the tinderbox… took a huge hit from that, now it’s definitely faster.

This of course only shows the running interface; there are a couple of extra steps involved though: there is another script that Zac wrote for me, which lists the packages that are installed in the tinderbox but are now unavailable, for instance because they were masked, or removed. This is important so I can keep the system clean from stuff that has been dropped because it was broken and so on. I run this each time I sync, before starting the actual tinderbox list script.

In the past the whole tinderbox was much hairier; Zac provided me with patches that let me do some stuff that the official Portage wouldn’t do, and that made my job easier, but now all of them are in the official Portage, and I just need to disable the unmerge-logs feature as well as enable the split-log one: the per-category split logs are optimal for submitting them to Bugzilla, as Firefox does not end up choking while trying to load the list of files.

When it’s time to check the logs for failures (of either build/install or tests, since I’m running with FEATURES="test test-fail-continue"), I simply open my lovely emacs and run this grep command: egrep -nH -e "^ .**.* ERROR:.*failed" -r /media/chroots/logs/tinderbox/build/*/*:*.log | sort -k 2 -t : -r which gives me the list of logs to look into. Bugs for them are then filed by me, manually with Firefox and my bug templates since I don’t know enough Python to make something useful out of pybugz.
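The same search can be sketched in Python, which also makes it easy to pull out the name of the failed package. Mind that the regular expression below is my reading of what the egrep call is after, a line such as " * ERROR: category/package-1.0 failed", possibly with colour escapes around the star; it is an assumption about the log format, not something taken from Portage’s documentation, and the function names are invented.

```python
import re
from pathlib import Path

# Assumed failure line: " * ERROR: cat/pkg-1.0 failed ...", with optional
# colour escape sequences around the star.
ERROR_LINE = re.compile(r"^ .*\*.* ERROR: (\S+) failed", re.MULTILINE)


def failed_packages(log_text):
    """Return the package names reported as failed in one log's text."""
    return ERROR_LINE.findall(log_text)


def scan_logs(log_root):
    """Map each *.log file under log_root to the failures it reports."""
    report = {}
    for path in sorted(Path(log_root).rglob("*.log")):
        hits = failed_packages(path.read_text(errors="replace"))
        if hits:
            report[str(path)] = hits
    return report
```

Calling scan_logs() on the build log directory gives the list of logs worth opening, much as the sorted grep output does in emacs.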

Since I’m actually testing for a few extra things that are currently not checked for by Portage, like documentation being installed in the proper path, or mis-installed man pages, I’ve got a few more rounds of grep to run on the completed logs to identify them and report them, also manually. I should clean up the list of checks, but for now you’ve got my bashrc if you want to take a peek.

The whole thing is long, boring, and heavy on maintenance; I have still to polish some rough edges, like a way to handle the updates to the base system before starting the full run, or a way to remove the packages broken by ABI changes if they are not vital for the tinderbox operations (I need some extra stuff which is thus “blessed” and supposedly never going to be removed, like screen, or ruby to use ruby-elf).

There are currently two things I’d like to find a way to tweak in the scripts. The first is a way to identify collision problems: right now those failures only get listed in the elog output, and I have no way to get the correct data out without manually fiddling a lot with the log, which is suboptimal. The second problem is somewhat tied to that: I need a scoring system that makes all the packages that failed to merge drop down in the list of future merges, build failures and collisions alike. This would let me spend more time building untested packages than rebuilding those that failed before.
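The simplest scoring system I can think of is a failure counter per package and a stable sort on it; the sketch below shows just that in Python. Both function names and the idea of counting build failures and collisions in the same bucket are assumptions for the example, nothing of this exists in the scripts yet.

```python
def record_failure(failures, package):
    """Bump the failure counter of a package (build failure or collision)."""
    failures[package] = failures.get(package, 0) + 1
    return failures


def order_by_score(packages, failures):
    """Packages with fewer recorded failures first; ties keep their order."""
    return sorted(packages, key=lambda pkg: failures.get(pkg, 0))


if __name__ == "__main__":
    failures = {}
    record_failure(failures, "cat/a")
    record_failure(failures, "cat/a")
    record_failure(failures, "cat/c")
    print(order_by_score(["cat/a", "cat/b", "cat/c"], failures))
```

Because the sort is stable, the random order of the list is kept among packages with the same score, so never-failed packages still get tried in no particular order.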

If you want to play with the scripts and send me improvements, that’s definitely going to be helpful; a better reporting system, or a bashrcng plugin for the QA tests (hint, this was for you Mauro!) would be splendid.

If you still would like to contribute to the tinderbox effort without having knowledge of the stuff behind it, there are a couple of things you can get me that would definitely be helpful; in particular a pretty useful thing would be more RAM for Yamato; it has to be exactly the same as the one that I have inside, but luckily, I got it from Crucial so you can get it with the right code: CT2KIT51272AB667. Yes, the price is definitely high, though I paid an even higher price for it. If you’d like to contribute this, you should probably check the comments, in the unlikely case I get four pairs of those. I should be able to run with 24, but the ideal would be to upgrade from the current 16 to 32 GB; that way I would probably be able to build using tmpfs (and find eventual problems tied to that as well). Otherwise check the donations page or this post if you’re looking for more “affordable” contributions.

And finally, the Portage Tree overhead data

I’m sorry it took so long but I had more stuff to write about in the meantime, and I’m really posting stuff as it comes, in a pretty random order.

In the post about the Portage Tree size I blandly and incompletely separated the overhead due to the filesystem block allocation from the rest of the size of the components themselves. Since the whole data was gathered on a night when I was bored and trying to fix up my kernel to have both Radeon’s KMS and the Atheros drivers working, it really didn’t strike me as a complete work, and indeed it was just meant to give some sense of proportion on what is actually using up the space (and as you might have noticed, almost all people involved do find the size, and amount, of ChangeLogs a problem). Robin then asked for some more interesting statistics to look at, in particular the trend of the overhead depending on the size of the filesystem blocks.

This post, which comes after quite some angst, is going to illustrate the results, although they do tend to be quite easy to see with the involved graphs. I hope this time the graphs do work for everybody out of the box; last time I used Google Docs to produce the output and linked it directly, which saved a lot of traffic on my side, but didn’t work for everybody. This time I’m going to use my blog’s server to publish all the results, hoping it won’t create any stir on it…

First of all, the data; I’m going to publish all the data I collected here, so that you can make use of it in any way you’d like. Please note that it might not be perfect: filesystems aren’t my favourite subject, so while it should be pretty consistent, there might be side effects I didn’t consider. For instance, I’m not sure whether directories always have the same size, and whether that size is the same for any filesystem out there; I assumed both of these to be true, so if I made any mistake you might have to adapt the data a bit.

I also haven’t considered the amount of inodes used for each different configuration, because I really don’t know for certain how that behaves, nor how to find out how much space is used by the filesystem structures that handle inodes’ and files’ data. If somebody with better knowledge of that can get me some data, I might be able to improve the results. I’m afraid this is actually pretty critical for a proper comparison of efficiency between differently-sized blocks because, well, the smaller the block the more blocks you need, and if you need more blocks, you end up with more data associated with them. So if you know more about filesystems than me and want to suggest how to improve this, I’ll be grateful.

I’m attaching the original spreadsheet as well as the tweaked charts (and the PDF of them for those not having OpenOffice at hand).

Overhead of the Gentoo Tree Size

This first graph should give an idea of how the storage efficiency of the Gentoo tree changes depending on the block size: on the far left you have the theoretical point, 100% efficiency, where only the actual files that are in the tree are stored; on the far right an extreme case, a filesystem with 64KiB blocks… for those who wonder, the only way I found to actually have such a filesystem working on Linux is using HFS+ (which is actually interesting to know, I should probably put the video files I have in such a filesystem…); while XFS supports that in the specs, the Linux implementation doesn’t: it only supports blocks of the same size as a page, or smaller (so less than or equal to 4KiB). I’m not sure why that’s the case; it seems kinda silly, since at least HFS+ seems to work fine with bigger sizes.

With the “default” size of 4KiB (page size) the efficiency of the tree is definitely reduced: it goes down to 30%, which is really not good. This really should suggest to everybody who cares about storage efficiency to move to 1KiB blocks for the Portage tree (and most likely, not just that).
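For those who want to redo the numbers on their own tree, here is a rough Python sketch of the calculation. It is not the script behind the spreadsheet: it assumes every file occupies a whole number of blocks, and it ignores directories, inodes and any tail-packing the filesystem might do. The function names and the sample sizes are made up for illustration.

```python
import os

def allocated_size(file_size, block_size):
    """Bytes taken on disk when a file can only occupy whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division on integers
    return blocks * block_size

def efficiency(file_sizes, block_size):
    """Ratio between the real data and the space allocated for it."""
    actual = sum(file_sizes)
    allocated = sum(allocated_size(size, block_size) for size in file_sizes)
    return actual / allocated if allocated else 1.0

def tree_file_sizes(root):
    """Collect the size in bytes of every non-directory entry under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            sizes.append(os.lstat(os.path.join(dirpath, name)).st_size)
    return sizes

# Hypothetical sizes standing in for a metadata.xml, an ebuild and a ChangeLog.
sample = [300, 1500, 9000]
for block_size in (512, 1024, 4096):
    print(block_size, round(efficiency(sample, block_size), 2))
```

Feeding `tree_file_sizes()` the path of a Portage tree checkout instead of the sample list should give a first approximation of the curve in the graph, minus the directory overhead.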

Distribution of the Gentoo Tree Size

This instead should show you how the data inside the tree is distributed; note that I dropped the 64KiB-blocks case, because the graph would have been unreadable: on such a filesystem, the grand total amounts to just a bit shy of 9GB. This is also why I didn’t go one step further and simulate all the various filesystems to compare the actual used/free space in them, and the number of inodes.

*This is actually interesting: the fact that I wanted to comment on the charts, rather than leaving them to speak for themselves, let me find out that I had made a huge mistake and was charting the complete size and the overhead instead of the theoretical size and the overhead in this chart. But it also says that it’s easier to notice these things in graphical form rather than just looking at the numbers.*

So how do we interpret this data? Well, first of all, as I said, on a filesystem with 4KiB blocks, Portage is pretty inefficient: there are too many small files. Here the problem is not with the ChangeLogs (which still have a non-trivial overhead), but rather with the metadata.xml files (most of them are quite small), the ebuilds themselves, and the support files (patches, config files, and so on). The highest offender for overhead in such a configuration is, though, the generated Portage metadata: the files are very small, and I don’t think any of them is using more than one block. We also have a huge amount of directories.

Now, the obvious solution to this kind of problem is, quite reasonably actually, using smaller block sizes. From the efficiency chart you can already see that, without going for the very efficient 512-byte block size (which might starve for inodes), a 1KiB block size yields a 70% efficiency, which is not bad, after all, for a compromise. On the other hand, there is one problem with accepting that as the main approach: the default for almost all filesystems is 4KiB blocks (and actually, I think that for modern filesystems that’s also quite a bad choice, since most of the files that a normal desktop user would be handling nowadays are much bigger, which means that maybe even 128KiB blocks would prove much more efficient), so if there is anything we can do to reduce the overhead for that case, without hindering the performance on 512-byte blocks, I think we should look into it.

As others have said, “throwing more disks at it” is not always the proper solution (mostly because while you can easily find how to add more disk space, it’s hard to get reliable disk space). I just added two external WD disks to have a two-level backup for my data…

So comments, ideas about what to try, ideas about how to make the data gathering more accurate and so on are definitely welcome! And so are links to this post on sites like Reddit, which seems to have happened in the past few days, judging from the traffic on my webserver.

The size of the Gentoo tree

You might have noticed that I started working on cleaning up the tree (before I had a few problems with my system, but that’s for another day). Some people wondered whether that’s really going to make much difference, so I wanted to take a look at it myself. I was already quite sure that, while reducing the size of filesdir is important, especially to avoid more stuff being added to the tree, getting rid of all the filesdir wouldn’t really make a terrible impact. Some extra time at hand, a few find commands, and Google Docs led to this:



As you can see, the biggest part of the tree is eaten up by the support files, more than twice the size of all the ebuilds; the files/ directories are just a little more than the ebuilds, and there is a huge amount of filesystem allocation overhead, even if my tree is in a filesystem with 1 KiB blocks. Another interesting note is that the licenses use up more space than the whole set of profiles, scripts and eclasses!

For the sake of finding something to work on, let’s break up the support files class into a different graph:



So much for those who complained that adding information about packages in metadata.xml is wasting space for users… the real space wasters are the change logs instead! But they are useful to keep around, for a while at least. I guess what we really need is better ChangeLog integration in repoman, so that a) it updates the ChangeLog on commit (stopping developers from committing without updating them!) and b) it can delete older and obsolete entries (like, keep the most recent 40 changes or so).
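Just to give an idea of how little logic point b) needs, here is a rough Python sketch; it has nothing to do with repoman’s actual code. It assumes the ChangeLog is newest-first, that paragraphs are separated by blank lines, and that a change entry is a paragraph starting with an indented line while version markers start with an asterisk. The function name and the sample text are invented for the example.

```python
import re

def trim_changelog(text, keep=40):
    """Keep the header and the `keep` most recent entries of a ChangeLog."""
    paragraphs = [p.rstrip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    kept, entries = [], 0
    for para in paragraphs:
        if para.startswith(" "):  # assumed shape of a change entry
            if entries == keep:
                break
            entries += 1
        kept.append(para)
    # Drop version markers left at the end without any entry below them.
    while kept and kept[-1].startswith("*"):
        kept.pop()
    return "\n\n".join(kept) + "\n"

sample_log = (
    "# ChangeLog for app-foo/bar\n\n"
    "*bar-1.2 (01 Jan 2009)\n\n"
    "  01 Jan 2009; Dev <dev@example.org> +bar-1.2.ebuild:\n"
    "  Version bump.\n\n"
    "*bar-1.1 (01 Dec 2008)\n\n"
    "  01 Dec 2008; Dev <dev@example.org> +bar-1.1.ebuild:\n"
    "  Initial import.\n"
)
print(trim_changelog(sample_log, keep=1))
```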

Update (2017-04-30): Unfortunately the spreadsheet links I used are now broken, so no graphs are available right now.

Proper dependencies aren’t overcomplex

It seems like somebody read my previous post about using Gentoo for building embedded-system filesystems as a mission that would increase the complexity of the work for Gentoo, and would waste time for no good reason. And if you look at the post, I do call that bullshit, for a good reason: proper dependencies are not going to increase the complexity of either the ebuilds or the work needed by the packager; they are only extensions to the standard procedure.

Let’s take, as an example, the zlib package: it’s part of the system set and some people say that this is enough to skip adding it to the dependencies. Why? Well, that’s a very good question: most of the time the reason I’ve been given was to avoid cyclic dependencies, but given zlib has no dependencies itself… Instead, what do we gain if we actually add it to all the packages that do use it? We get proper reverse-dependency information, which can be used for instance by a much more sophisticated tinderbox to identify which packages need to be rebuilt when one changes.
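To make the reverse-dependency point concrete, here is a small Python sketch of how a tinderbox could use that information. It is not how Portage or my scripts do it; the function names are made up, and the dependency map is a tiny illustrative sample, not data taken from the tree.

```python
from collections import defaultdict, deque

def reverse_dependencies(depends_on):
    """Invert a {package: [dependencies]} map into {dependency: {users}}."""
    rdeps = defaultdict(set)
    for pkg, deps in depends_on.items():
        for dep in deps:
            rdeps[dep].add(pkg)
    return rdeps

def needs_rebuild(depends_on, changed):
    """Return every package that directly or indirectly uses `changed`."""
    rdeps = reverse_dependencies(depends_on)
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for user in rdeps.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen

# Illustrative sample only: it works because zlib is listed explicitly.
deps = {
    "sys-libs/zlib": [],
    "media-libs/libpng": ["sys-libs/zlib"],
    "x11-libs/cairo": ["media-libs/libpng"],
    "app-misc/screen": [],
}
print(sorted(needs_rebuild(deps, "sys-libs/zlib")))
```

If libpng left zlib out of its dependencies because “it’s in the system set anyway”, the walk would come back empty and nothing would be scheduled for a rebuild.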

At the same time, the correct depgraph will be used by Portage to properly order zlib before any other package that uses it is merged; this is quite useful when you have broken your system and you need to rebuild everything. And that’s not all: the idea is that you need to specify dependencies on system packages only for other packages possibly in the system set; the problem is: how can you be certain you’re not in the system set? If you start to consider that pambase can bring gnome into the system set, it’s not really easy, and it’s a moving target as well.
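The ordering part is nothing more than a topological sort of the dependency graph. The Python sketch below uses the standard library’s graphlib (Python 3.9 or later) to show the idea; it is not Portage’s resolver, and the function name and the sample map are assumptions made for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def merge_order(depends_on):
    """Return the packages ordered so that dependencies come first.

    `depends_on` maps each package to the packages it needs merged before it.
    """
    return list(TopologicalSorter(depends_on).static_order())

# Illustrative sample only, not data taken from the tree.
deps = {
    "media-libs/libpng": ["sys-libs/zlib"],
    "x11-libs/cairo": ["media-libs/libpng", "sys-libs/zlib"],
}
print(merge_order(deps))
```

A package missing from the map gets no ordering guarantee at all, which is exactly what happens when a dependency is left implicit; a real cycle, on the other hand, makes graphlib raise CycleError instead of silently picking an order.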

So I beg to differ regarding complexity: if you simply follow the rule “if it uses foo, it depends on foo”, the complexity will be reduced over time rather than increased: you don’t have to check whether foo and your package are in the system set or not. The only two packages that QA approves of not depending upon are the C library and the compiler: all the rest has to be depended upon properly.

And with respect to the already-noted bug with the distutils eclass: the same problem exists for other eclasses like apache, webapp and java, which would add their own dependencies by default… but have an explicit way to disable the dependencies and the code, or tie them to a USE flag. You know, that is not complexity; that is a properly-designed eclass.