Like a vampire

I have problems with Sun, just a different kind of problem and a different kind of Sun. I don’t mean the company, Sun Microsystems, itself, but rather some of their products.

The first problem is that, as an upstream, they aren’t pleasant to work with. For instance, they changed, without notice, the file containing the StudioExpress tarball, just to add some checksumming functionality to make sure the file was properly downloaded before starting. They had the decency to add a -v2 marker to the filename, but it still doesn’t help that they neither published a changelog for the change nor announced it. I guess somebody resumed looking at security bugs, since this happened at almost the same time as their bug tracker started spamming me many times a day with notices that some of the bugs I reported had changed details.

The second problem is less about today and more a continuation of a long saga that is actually quite boring. I’ve tried again to get OpenSolaris to work on VirtualBox, but with the new networking support (the vboxnetflt module) the network is tremendously slow, and both NFS and SSH over it are as slow as over a 56k modem connection. The main problem is that from time to time the ssh stream freezes entirely, making it quite infeasible to run builds with. Solaris, VirtualBox and networking have never gotten along that well, and things haven’t improved much now that VirtualBox is developed directly by Sun.

So I decided to use the recently resurrected Enterprise to install OpenSolaris on a real box; the idea was to use the decommissioned but still working disks from Yamato to install not only OpenSolaris but also FreeBSD, NetBSD, DragonFly and other operating systems, so I could make sure that the software I work on is actually portable. Unfortunately, since I moved to pure SATA (for hard disks at least) a longish time ago, it seems it’s not that easy: OpenSolaris failed to see any of my disks.

Okay, so today I finally took the time to dig up an EIDE disk and set it up; I started the OpenSolaris live CD and asked for an install. Again it failed to find the hard disks. I would have suspected a problem with the motherboard, if it weren’t that SysRescueCD sees everything exactly as it should. Which is more than I can say of my MacBook Pro, whose logic board seems to be bad and can no longer find its own hard drive. I’m waiting to hear how much the repair would cost, and then I’ll take it in (unless it is way too much). The timing is unlucky, since I had to buy a new laptop for my mother just last week (the iBook I bought six years ago has a bad hard drive now).

So I still don’t have a working setup to try OpenSolaris stuff on, which is not nice since I really would like my stuff to be portable to OpenSolaris as well as Linux. Oh well.

The disk problem

My first full tree build with dependencies, to check for --as-needed support, has almost finished. I currently have 1344 bugs open in the “My Bugs” search, which contains reports for packages failing to build or breaking with --as-needed, packages failing to build for other reasons, packages with file collisions that lack blockers around them (there are quite a lot, even between totally unrelated packages), and packages bundling internal copies of libraries such as zlib, expat, libjpeg, libpng and so on.
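For reference, forcing --as-needed onto an entire chroot build basically comes down to a make.conf tweak along these lines; this is just a sketch of the general idea, not necessarily my exact configuration:

    # /etc/make.conf (inside the build chroot)
    # Ask the linker to drop DT_NEEDED entries for libraries whose
    # symbols are never actually referenced by the object being linked;
    # packages that ignore the user's LDFLAGS simply won't get the flag.
    LDFLAGS="-Wl,--as-needed"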

I can tell you, the number of packages in the tree that don’t follow policies such as respecting the user’s LDFLAGS, not bundling copies of libraries, and not installing stuff randomly in /usr is much higher than one might hope for.

I haven’t even started filing bugs for pre-stripped packages, since I have to check whether those have been filed already, either by me in a previous run, by Patrick with his tinderbox, or by other people. I also wanted to check this against a different problem: packages that, with split-debug enabled, install useless debug information because they don’t pass -g/-ggdb properly to the build system and thus include no debug information at all. Unfortunately for this one I need much more free space than I currently have on Yamato. And here my disk problems start.
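To give an idea of what the split-debug check involves, the chroot would need something along these lines in make.conf; again, a sketch rather than my exact settings:

    # /etc/make.conf
    # Build with debug information and let Portage move it into
    # separate files under /usr/lib/debug instead of stripping it.
    CFLAGS="-O2 -ggdb"
    CXXFLAGS="${CFLAGS}"
    FEATURES="splitdebug"

A package whose build system silently drops -ggdb will then install split debug files that carry no actual debug information, which is exactly the case I want to catch.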

The first problem is space; I allocated 75GB for the chroots partition, which uses XFS, after already extending it a lot; with a lot of packages still to go, I’m down to the last 20GB free. I’ll have to extend it further, but to do that I have to get rid of the music and video partitions after moving them to the external drive that Iomega replaced for me (now running RAID1 rather than JBOD, and formatted HFS+ since I want to be able to share it with the laptop if I need the data while Yamato is off). I’ll also have to get rid of the Time Machine volume I created in my LVM volume group and share the copy on the external drive instead; I had created it so that the laptop would still be backed up while I waited for the replacement disk.
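Growing the chroots partition once the space is freed is the easy part; something like the following, where the volume group and mount point names are made up for illustration:

    # Extend the logical volume, then grow the XFS filesystem online;
    # XFS can only be grown, never shrunk, which is why freeing the
    # space beforehand is the real work.
    lvextend -L +40G /dev/vg0/chroots
    xfs_growfs /var/chroots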

The distfiles directory has grown to over 61GB of data, and that does not include most of the fetch-restricted packages. Of course I already share it between Yamato’s system and all the chroots, just like I share the actual synced tree (by the way, I currently have it as /var/portage/distfiles, but I’m considering moving it to /var/cache/portage/distfiles since that seems to make more sense; maybe I should propose this as the actual default in the future, as using /usr for this does not sound kosher to me). Still, it is a huge amount of data.
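Moving it around would just be a matter of pointing Portage at the new path in the shared make.conf, roughly like this (the paths are the ones I’m talking about, not any official default):

    # /etc/make.conf, shared between Yamato and the chroots
    # Current location of the shared distfiles:
    DISTDIR="/var/portage/distfiles"
    # What I'm considering moving to instead:
    #DISTDIR="/var/cache/portage/distfiles"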

Also, I’m not using in-RAM builds, even though I have 16GB of memory in this box. There are multiple reasons for this. The first is that I leave the build running even when I’m doing something else that might require RAM by itself, and I don’t want the two to disrupt each other so easily. Also, I often go away to watch movies, play games or whatever while it builds, so I may have to look back at a build even a day later; and sometimes colleagues ask me to look at a particular build that happened a few days earlier. Having the builds on disk helps me a lot here, especially for the epatch, eautoreconf and econf logs.
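For comparison, an in-RAM build would just mean mounting Portage’s work area on tmpfs, something like the fstab line below (the size is an arbitrary figure for illustration); it’s the persistence of the build trees and logs, not the setup effort, that keeps me from doing it:

    # /etc/fstab -- build in RAM by putting the Portage work area on tmpfs;
    # everything under it is lost on unmount or reboot.
    tmpfs   /var/tmp/portage   tmpfs   size=12G,noatime   0 0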

Another reason is that the ELF scanning scanelf performs is based on memory-mapped files, which is very nice when you have to run a series of scanelf calls on the same set of files: the first run caches them all in memory, and the following ones only need to traverse the filesystem to find them. So I want to keep as much memory free as I can.
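That caching effect is easy to see by timing two identical recursive scans back to back; the second one is mostly served from the page cache (the path here is just an example):

    # First pass reads and maps every ELF file from disk,
    # warming the kernel page cache as a side effect.
    time scanelf -q -R -n /usr/lib
    # Second pass over the same files barely touches the disk.
    time scanelf -q -R -n /usr/lib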

So in the end the disks get used a lot, which is not very nice, especially since they are the disks that host the whole system for now. I’m starting to fear for their health, and I’m looking for a solution, which does not seem to be obvious.

First of all, I don’t want to go buying more disks; if possible I’d rather not buy any new hardware for now, since I haven’t finished paying for Yamato yet (even though quite a few users contributed, whom I thank once again; I hope they’re happy to know what Yamato’s horsepower is being used for!). So any solution has to be realised with what I already have in the house, or it needs to be funded somehow.

Second, speed is not much of an issue, although it cannot be entirely ignored; the build reached sys-power today at around 6pm, and it started last Friday, so I have to assume that a full build, minus KDE4, is going to take around ten days. This is not optimal yet, since kde-base makes the build rebuild the same packages over and over, switching between modular and monolithic; the solution would be to use binary packages to cache the rebuilds, which would also be especially useful to avoid rebuilds after collision-protect failures and after packages get unmerged because of blockers, but that’s going to slow down the build a notch. I haven’t used ccache either; I guess I could have, but I’d have to change the cache directory to avoid resetting the cache I use for my own projects.
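Both tweaks would boil down to a couple of lines in the chroot’s make.conf; this is a sketch, and the cache directory is a made-up path chosen to keep it separate from the ccache I use for my own projects:

    # /etc/make.conf (inside the build chroot)
    # Keep binary packages around so blocker- and collision-related
    # re-merges don't have to rebuild from source, and enable ccache
    # with its own dedicated cache directory.
    FEATURES="buildpkg ccache"
    CCACHE_DIR="/var/tmp/ccache-tinderbox"
    CCACHE_SIZE="4G"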

So what is my current available hardware?

  • two 160GB Samsung SATA (I) disks; they were the original disks I bought for Enterprise, and they currently sit one in Farragut (which has been missing a PSU and a SATA controller since I turned it off last year) and one in Klothos (the Sun Ultra 5 running G/FBSD);
  • one Maxtor 80GB EIDE disk;
  • one Samsung 40GB EIDE disk;
  • just one free SATA port on Yamato’s motherboard;
  • a Promise SATA (I) PCI controller;
  • no free PCI slots on Yamato;
  • one free PCI-E x16 slot.

The most logical solution would be to harness the two Samsung SATA disks in a software RAID0 array and use it as /var/tmp, but I don’t have enough SATA ports; I could set up the two EIDE drives instead, but since they are not the same size, RAID0 striping would be limited by the 40GB of the smaller one. That may still be enough, since the asneeded chroot’s /var/tmp is currently 11GB.
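If I did go the EIDE route, the setup itself would be trivial; a sketch with hypothetical device names, which depend on the controller and kernel driver in use:

    # Stripe the two EIDE disks into one scratch volume for /var/tmp.
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hda /dev/hdb
    mkfs.xfs /dev/md0
    mount /dev/md0 /var/tmp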

Does anybody know of a better solution to my problems? Maybe I should be using external drive enclosures or look for a small network attached storage system, but those are things I don’t have available, and I’d rather not buy anything until I’ve finished paying for Yamato. Yamato itself has enough space and power to handle more disks; I guess I could use a SATA port multiplier too, but I don’t really know about their performance, nor which brands to trust, and again that would require buying more hardware.

If I ever get enough money, I’m going to consider running gigabit Ethernet to my garage and setting up a SAN there with Enterprise or some other box and a lot of hard drives, served through ZFS+iSCSI or something. For now, that’s a mere dream.
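The storage-box side of that dream would look roughly like this on OpenSolaris; pool, volume and device names are invented here, and it assumes the old shareiscsi property is available on the release in use:

    # Create a pool, carve a volume out of it, and export it over iSCSI.
    zpool create tank mirror c1t0d0 c1t1d0
    zfs create -V 200G tank/scratch
    zfs set shareiscsi=on tank/scratch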

Anyway, suggestions, advice and help on how to sort out the disk problem are very welcome!

A possible solution out of Farragut’s trouble

Yesterday I tried to fix a few of the issues with ~sparc-fbsd lagging behind (expected, considering that only Roy and I are handling it); unfortunately the PSU seems to be running quite hot, and when I started the box again today to try a patch Ulrich gave me to fix emacs-cvs-23, it started giving off a bad smell.

So a possible way out of this mess would be to take Klothos down for an indefinite break, pull out its Promise SATA controller and put it into Farragut, so I can connect the spare 160GB SATA drive I have; I’ll actually need a noise dampener, but those are cheap, and I’ll be placing an order for them soon anyway.

For Klothos to come back up, I’ll have to wait until I can buy a new PSU for it, with active PFC and possibly less noisy… not sure how long that will take. In the meantime Roy (who’s currently on his honeymoon) is the only reference for ~sparc-fbsd keywording.

Anyway, for those interested, yesterday was quite a keywording spree for ~x86-fbsd and, to a lesser degree, for ~sparc-fbsd (I fixed the most prominent problems with the latter, though not everything :( ).

Today I should probably update Amarok’s maintainer guide: although nothing changed between when I left and when I returned, I have already changed some stuff myself (which is why the live version is now 9999-r2). On that note, I should test PyQt on G/FBSD to fix the new .badindev on Amarok.

So much stuff to do! And this is just a fraction of what I used to do, by the way.