This Time Self-Hosted

The disk problem

My first full tree build with dependencies, run to check for --as-needed support, has almost finished. I currently have 1344 bugs open in my “My Bugs” search, which contains reports for packages failing to build or breaking with --as-needed, packages failing to build for other reasons, packages with file collisions that lack blockers around them (there are quite a lot, even between totally unrelated packages), and packages bundling internal copies of libraries such as zlib, expat, libjpeg, libpng and so on.

I can tell you, the number of packages in the tree that violate policies (respecting the user's LDFLAGS, avoiding bundled copies of libraries, not installing files randomly into /usr) is much higher than one might hope.

I haven’t even started filing bugs for pre-stripped packages, since I first have to check whether those were filed already, either by me in a previous run, by Patrick with his tinderbox, or by other people. I also wanted to check the tree against a different problem: packages that, when built with splitdebug, install useless debug files because they don't pass -g/-ggdb properly to the build system, and thus include no debug information at all. Unfortunately for this one I need much more free space than I currently have on Yamato. And here my disk problems begin.
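
Spotting the two cases above by hand is easy enough; this is a rough sketch of the kind of check involved (the paths are just examples, not actual reports):

```shell
# Quick check whether an installed binary was pre-stripped; "not stripped"
# in the output means symbols (and possibly debug info) survived.
file -L /usr/bin/ls

# With FEATURES=splitdebug the debug info should instead land in a
# matching file under /usr/lib/debug (the path mirrors the installed one);
# an empty directory here hints that -g/-ggdb never reached the compiler.
ls /usr/lib/debug/usr/bin/ 2>/dev/null || echo "no split debug files"
```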

The first problem is space: I allocated 75GB for the chroots partition, which uses XFS, after already extending it a lot; with plenty of packages still missing, I'm into the last 20GB free. I'll have to extend it further, but to do that I have to get rid of the music and video partitions, after moving them to the external drive that Iomega replaced for me (now running RAID1 rather than JBOD, and HFS+ since I want to share it with the laptop in case I need the data while Yamato is off). I will also have to get rid of the Time Machine volume I created in my LVM volume group and start using the copy on the external drive instead; I had created it so that the laptop was still backed up while I waited for the replacement disk.
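
Freeing those logical volumes would at least let the chroots filesystem grow in place; the usual dance looks roughly like this (a sketch with hypothetical volume and mount names, not Yamato's actual layout):

```shell
# Drop the no-longer-needed volume to reclaim its extents
# (names here are examples only).
lvremove /dev/vg0/timemachine

# Grow the chroots logical volume by the reclaimed space...
lvextend -L +50G /dev/vg0/chroots

# ...and grow XFS to match; xfs_growfs works on a mounted filesystem,
# so no downtime is needed.
xfs_growfs /var/chroots
```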

The distfiles directory has grown past 61GB of data, and that does not even include most of the fetch-restricted packages. Of course I already share it between Yamato's system and all the chroots (by the way, I currently have it as /var/portage/distfiles, but I'm considering moving it to /var/cache/portage/distfiles since that seems to make more sense; maybe I should propose it as the actual default in the future, as using /usr for this does not sound kosher to me), just like I share the actual synced tree. Still, it is a huge amount of data.
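
For reference, moving the distfiles cache is just a matter of pointing Portage's DISTDIR at the new location (a config sketch; the path is the one proposed above):

```shell
# /etc/make.conf
# Shared between the host system and all the chroots (e.g. bind-mounted).
DISTDIR="/var/cache/portage/distfiles"
```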

Also, I’m not building in RAM, even though this box has 16GB of memory. There are multiple reasons for this. The first is that I leave builds running even when I'm doing something else that might need RAM itself, and I don't want the two to interfere with each other. Also, I often go away to watch movies, play games or whatever while it builds, so I may only look back at a build a day later; and sometimes colleagues ask me to look at a particular build from a few days earlier. Having the builds on disk helps me a lot here, especially for the epatch, eautoreconf and econf logs.

Another reason is that the ELF scanning process that scanelf uses is based on memory-mapped files, which is very nice when you have to run a series of scanelf calls on the same set of files: the first run will cache all of them in memory, and the following ones will just have to traverse the filesystem to find them. So I want to keep as much memory free as I can.
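
The repeated scans look roughly like this (a sketch using pax-utils' scanelf; the paths and library names are examples):

```shell
# First pass: recursively list NEEDED entries, looking for a bundled
# zlib; this walk also primes the page cache with the mapped ELF files.
scanelf -qRn /usr/lib64 | grep libz.so

# A second pass over the same tree is much cheaper: the files are
# already cached in memory, only the directory walk is repeated.
scanelf -qRn /usr/lib64 | grep libexpat.so
```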

So in the end the disks get used a lot, which is not very nice, especially since they are the disks hosting the whole system for now. I'm starting to fear for their health, and I'm looking for a solution, which does not seem to be obvious.

First of all, I don't want to go buying more disks; if possible I'd rather not buy any new hardware for now, since I haven't finished paying for Yamato yet (even though quite a few users contributed, whom I thank once again; I hope they're happy to know what Yamato's horsepower is being used for!), so any solution either has to be realisable with what I already have in house, or needs to be funded somehow.

Second, speed is not much of an issue, although it cannot be entirely ignored: the build reached sys-power today at around 6pm, and it started last Friday, so I have to assume that a full build, minus KDE4, is going to take around ten days. This is not optimal yet, since kde-base makes the run rebuild the same packages over and over, switching between modular and monolithic; the solution would be to use binary packages to cache the rebuilds, which would be especially useful to avoid rebuilds after collision-protect failures and after packages get unmerged due to blockers, but that's going to slow the build down a notch. I haven't used ccache either; I guess I could have, but I'd have to change the cache directory to avoid resetting the cache I use for my own projects.
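
Both ideas boil down to a couple of make.conf knobs (a sketch; the paths are made-up examples, chosen to keep the tinderbox caches separate from my own projects'):

```shell
# /etc/make.conf
# Save every successful build as a binary package, and enable ccache.
FEATURES="buildpkg ccache"
# Where the binary packages go, for reuse across rebuilds.
PKGDIR="/var/tmp/tinderbox-binpkgs"
# A dedicated compiler cache so tinderbox builds don't evict my own work.
CCACHE_DIR="/var/tmp/tinderbox-ccache"
CCACHE_SIZE="4G"
```

Actually reusing the saved packages then takes `emerge --usepkg` (`-k`) on the command line.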

So what is my current available hardware?

  • two Samsung SATA (I) disks, 160GB big; they were the original disks I bought for Enterprise, they currently are one in Farragut (which is lacking a PSU and a SATA controller, after I turned it off last year), and one in Klothos (the Sun Ultra 5 with G/FBSD);
  • one Maxtor 80GB EIDE disk;
  • one Samsung 40GB EIDE disk;
  • just one free SATA port on Yamato’s motherboard;
  • a Promise SATA (I) PCI controller;
  • no free PCI slots on Yamato;
  • one free PCI-E x16 slot;

The most logical solution would be to harness the two Samsung SATA disks in a software RAID0 array and use it as /var/tmp, but I don't have enough SATA ports; I could set up the two EIDE drives instead, but they are not the same size, so RAID0 would be restricted by the 40GB size of the smaller one, which may still be enough, since the asneeded chroot's /var/tmp is currently 11GB.
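
Setting such an array up as scratch space is short work with mdadm (a sketch; the device names are hypothetical, and everything on those disks is destroyed):

```shell
# Stripe the two drives together; with unequal members the usable
# striped capacity is bounded by the smaller one.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdc /dev/sdd

# A throwaway filesystem for build scratch space; no need for
# anything fancy since the data is expendable.
mkfs.xfs /dev/md0
mount /dev/md0 /var/tmp
```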

Does anybody know of a better solution to my problems? Maybe I should be using external drive enclosures, or look at small network-attached storage systems, but those are things I don't have available, and I'd rather not buy them until I've finished paying for Yamato. By itself, Yamato has enough space and power to handle more disks; I guess I could use a SATA port multiplier too, but I don't really know about their performance, or brands, or anything, and again it would require buying more hardware.

If I ever have enough money, I'm going to consider running gigabit Ethernet to my garage and setting up a SAN there with Enterprise or some other box, a lot of HDDs, and serving them through ZFS+iSCSI or something. For now, that's a mere dream.

Anyway, suggestions, advice and help on how to solve the disk problem are very welcome!

Comments 7
  1. Well, I'd be adding the 2 EIDE to the LVM volume and using them as distfiles disks. The usage of the 2 disks would also be less stressful this way (considering the EIDE ones are surely older), since you just need to be able to read after one write, and the distfiles can always be downloaded again if one of them fails. For distfiles you shouldn't bother about speed very much, and if you want you could also attach the 160GB SATA. It could take you a little while to set everything up properly with mounts and so on, but after that the whole thing would get you 280GB out of your disks. Of course you won't have the RAID0 speed, but if it's not strictly needed I don't see why to force it. To prevent imminent failures, set up smartd tests on the EIDE disks and monitor them over time.

  2. Wow, that’s some thankless work — so… thank you :)! RAID0 is just striping, so if Linux RAID doesn't let you use two differently sized disks (?!), LVM should, if you assign them to the same Volume Group. 1.5TB disks with 32MB cache are only around $130 USD here. Perhaps we should start a fund to get you a pair of these for RAID1?

  3. 1) Looking at the HDs you're listing, I doubt very much that putting the old IDE disks in RAID0 would be faster in real life than a single modern SATA drive, simply because the throughput of those new Samsung discs is pretty good (>100MB/s at the start of the disk, dropping to ~60MB/s at the end). So the bottom line is that I doubt it's worth spending your time on that, seeing that those Samsung Spinpoint F1 come in a price range of 60-100 euro depending on the size, with 1TB for 100. Although I'm now officially an ex-Gentoo user, I'm happy to contribute if you decide to buy a new HD.

  4. Re the SATA port multiplier, as I've been dreaming about them for a RAID6 config: given SATA-300, even the fastest (non-SSD) disks today are only about half that speed once you're actually doing I/O through the cache to the disk, so you can port-multiply at least 2:1, and that's at the fastest speed. As the drives fill up, or with partitions toward the end of the drive, they'll be much slower; thus a 4:1 ratio isn't entirely unreasonable with slower drives, or toward the end of the faster ones.

     As it happens, that matches up reasonably well with the somewhat (relatively) common 5-way, 5:1 port-multiplied SATA external drive enclosures now available. If you run some level of RAID with redundancy, you can leave the fifth slot as a hot spare, giving you a four-spindle (4-active-spindle) external hard drive box, self-powered and connected using a single eSATA cable. The four “live” disks should reasonably well match the bandwidth of eSATA at 300MB/sec. They'd probably saturate it at the beginning of the disk under best conditions, but it should be a decent real-world match, not overly bottlenecked.

     At least here in the US (and I read the prices in Taiwan and Japan are cheaper yet), drives run $38/quarter-T (WD branded, shipped US; a bit cheaper generic, of course), so ~$200 for the five; and last I looked, perhaps a year ago, the 5-way port-multiplied external enclosures ran a bit under $400 (FWIW a bit under $300 without the multiplier, which would mean 5 eSATA cables; the multiplier runs ~$130 separately). Thus, outfitted with 5 quarter-T drives, the cost is ~$600, maybe less if the prices on the enclosure with multiplier have come down.

     But if you were to do that you'd need a SATA/300 card as well, since your current setup is only SATA/150 (according to the Wikipedia SATA entry, it's NOT accurate to refer to SATA/150 and SATA/300 as SATA-I and SATA-II respectively, though the public has tended to do so). That should give you some numbers to play with.

     From previous blog entries I've seen your prices there are somewhat higher (or to put it a different way, you don't get full exchange-rate value), but I see some of the dealers now advertising international shipping too, and given the prices I've seen you quote, it may well be worth checking out, even if you have to pay extra shipping and perhaps customs fees and wait a bit longer. If you do, please let me know (a blog post about it, or you have my email) whether it was worth your time, as I really don't know, but I do think it's at least worth checking.

     (I preview most places, but the preview here doesn't seem to work, probably needing scripting enabled while it isn't here by default. From past experience, submit works without scripting, so I post without preview and hope it's fine…)

  5. Okay, so today the build finished and I looked through the comments; first of all, thanks to everybody! :)

     I tried putting the two drives in, and I guess it's not really much of an option. I'm testing them now in a RAID0 configuration, but it does not really work that well: the build is not faster, and is probably slower. Also, they require so much power that I ended up having problems with the network cards at boot. I'll probably disable them now that I've tested them; the only boring part is that I have to tear Yamato apart to remove them, since I have to reach them through the frontal pieces.

     The noise they make (the Samsung one has an extra fan by itself, since it would otherwise fail too easily, and I added an extra fan to the case too, to increase airflow since it started to get hot) is already bad enough that I cannot keep Yamato turned on this way all night, let alone work in the same room. Plus, one of the two drives is “whistling”. Having a software RAID0 /var/tmp is a nice idea though.

     So now the choices: buying new internal HDs is an option, and not a very expensive one, but still a bit risky, because I cannot add more than one extra SATA drive, and up to now I've been using disks in pairs. The alternative is the enclosure, but that's _very_ expensive for me: $600 would probably translate to at least €500. Indeed, the only eSATA stuff I can find from my usual suppliers is complete products like WD's MyBooks. I also don't know if the standard internal SATA 300 interface is good enough for eSATA.

     Another alternative is getting an additional PCI-E controller, like the Promise FastTrak 4650, which can host four more disks and would thus allow me to go up to nine disks, which would easily do for space and performance; too bad the card alone costs €120, so we're back to square one.

     Tomorrow I'll try the Samsung 160GB disk, which should also be healthier than the two EIDEs I have here, at least as a contingency measure, and pray that the disks don't start to fail. I should learn to set up smartd, I guess.

     If you feel like helping, donations still go toward paying for Yamato, and the sooner Yamato is fully paid for, the sooner I can think about adding more hardware!

  6. Turn on FEATURES=test and try a tree build sometime. You’ll be the first person with a 10000 bug count.

  7. Diego, if we can find somebody who can pass you a small parcel from Moscow (Russia), I'd gladly send you a couple of 400GB SATA 3Gbps disks, and maybe a PCI-E controller for them. Can you suggest some kind of webboard where I could post a message to reach one of your neighbours who is visiting Russia soon?

     As for smartd: (as Google has found out the hard way) it cannot reliably predict drive failure. There's no substitute for a. backups (preferably geographically dispersed) and b. a recovery plan.

     While writing that last paragraph I realised that I should probably offer you some backup space, but unfortunately I don't feel I'm in a position to do that reliably.
