The disk problem

My first full tree build with dependencies, to check for --as-needed support has almost finished. I have currently 1344 bugs open in “My Bugs” search, that contains reports for packages failing to build or breaking with --as-needed, packages failing to build for other reasons, packages with file collision that lack blockers around them (there are quite a lot, even totally unrelated one with the other), and packages bundling internal copies of libraries such as zlib, expat, libjpeg, libpng and so on.

I can tell you, the amount of tree packages not following policies such as respecting user LDFLAGS, not using bundled libraries, and not installing stuff randomly in /usr is much higher than one might hope for.

I haven’t even started filing bugs for pre-stripped packages since I have to check those for being filed already, by either me in a previous run, or by Patrick with his tinderbox or other people as well. I also wanted to check this against a different problem: packages installing useless debug info using split-debug, by not passing -g/-ggdb properly to the build system and thus not including debug information at all. Unfortunately for this one I need much more free space than I have right now on Yamato. And here I start with my disks problems.

The first problem is space; I allocated 75GB of space for the chroots partition, which uses XFS, after extending it a lot; with a lot of packages missing, I’m reaching for the last 20GB free. I’ll have to extend it more, but to do that I have to get rid of the music and video partitions after moving them to the external drive that Iomega replaced for me (now running RAID1 rather than JBOD; and HFS+ since I want to share it with the laptop if I need the data and Yamato is off). I also will have to get rid of the Time Machine volume I created in my LVM volume group, and start sharing the copy on the external drive; I did that so that the laptop was still backed up while I waited for the replacement disk.

The distfiles directory has reached over 61GB of data, and this does not include most of the fetch-restricted packages, of course I already share it between Yamato’s system and all the chroots (by the way, I currently have it as /var/portage/distfiles, but I’m considering moving it to /var/cache/portage/distfiles since it seems to make more sense; maybe I should propose this to be the actual default in the future, as using /usr for this does not sound kosher to me), like I share the actual synced tree. Still, it is a huge amount of data.

Also, I’m not using in-RAM build, even though I have 16GB of memory in this box. There are multiple reasons for this; the first is that I leave the build run even when I’m doing something else, which might require RAM by itself, and thus I don’t want the two to disrupt themselves so easily, and also, I often go away to watch movies, playing or something while it builds, so I have to look back at the build even a day after; and sometimes colleagues ask me to look at a particular build that might have happened a few days earlier. Having the build on disk helps me a lot here, especially for epatch, eautoreconf and econf logs.

Another reason is that the ELF scan process that scanelf uses is based on memory mapped files, which is very nice when you have to run a series of scanelf calls on the same set of files, since the first run will cache all of them in memory and the others will just to traverse the filesystem to find them. So I want to have as much memory free as I can.

So at the end the disks get to be used a lot, which is not very nice especially since they are the disks that host the whole system for now. I start to fear for their health, and I’m looking for a solution, which does not seem to be too obvious.

First of all, I don’t want to go buying more disks, possibly I’d rather not buy any new hardware for now since I haven’t finished paying for Yamato yet (even though quite a few users contributed, whom I thank once again; I hope they’re happy to know what Yamato’s horsepower is being used for!), so any solution has to be able to be realised using what I have already in house, or need to be funded somehow.

Second, speed is not much of an issue although it cannot be entirely ignored; the build reached sys-power today at around 6pm, and it started last Friday, so I have to assume that a full build, minus KDE4, is going to take around ten days. This is not optimal yet since kde-base makes the ebuild rebuild the same packages over and over switching between modular and monolithic, the solution would be to use binpkgs to cache the rebuilds, which is going to be especially useful to avoid rebuilds on collision-protect failures, and on unmerged packages due to blockers, but that’s going to slow down the build a notch. I haven’t used ccache either, I guess I could have, but I’d have to change the cache directory to avoid resetting the caching I use for my own projects.

So what is my current available hardware?

  • two Samsung SATA (I) disks, 160GB big; they were the original disks I bought for Enterprise, they currently are one in Farragut (which is lacking a PSU and a SATA controller, after I turned it off last year), and one in Klothos (the Sun Ultra 5 with G/FBSD);
  • one Maxtor 80GB EIDE disk;
  • one Samsung 40GB EIDE disk;
  • just one free SATA port on Yamato’s motherboard;
  • a Promise SATA (I) PCI controller;
  • no free PCI slots on Yamato;
  • one free PCI-E x16 slot;

The most logical solution would be to harness the two Samsung SATA disks in a RAID0 software array, and use it as /var/tmp, but I don’t have enough SATA ports; I could set up the two EIDE drives but they are not the same size so RAID0 would be restricted to the 40GB size of the smallest one, which may still be something, since the asneeded chroot’s /var/tmp is currently 11GB.

Does anybody know if a better solution to my problems? Maybe I should be using external drive enclosures or look for small network attached storage systems, but those are things that I don’t have available, and I’d rather not go to buy until I finished paying for Yamato. By itself, Yamato has enough space and power to handle more disks, I guess I could be using a SATA port multiplier too, but I don’t really know about their performance, nor brands or anything, and again would be requiring to buy more hardware.

If I get to have enough money one day, I’m going to consider cabling with gigabit network my garage and set up there a SAN with Enterprise or some other box, a lot of HDDs, and serve them through ZFS+iSCSI or something. For now, that’s a mere dream.

Anyway, suggestions, advices and help about how to reorganise the disk problem are very welcome!