This Time Self-Hosted

How much the tinderbox is worth

After some of the comments on the previous post explaining the tinderbox, I’ve been thinking more about the idea of moving it to a dedicated server or, even better, to an ensemble of dedicated servers, maybe tackling more than just one architecture, as user99 suggested.

This brings up the problem of costs, and the worth of the effort itself. Let’s start from the specifics: right now the tinderbox is running on Yamato, which is my main workstation. It’s a bi-quad Opteron (8×2.0GHz CPUs in total) with 16GB of registered ECC RAM and over 2TB of storage, connected to the net through my own DSL line, which is neither stable nor fast. As I said, the main bottleneck is the I/O rather than the CPUs, although when packages have proper parallel build systems, the multiple CPUs work quite well. Not all of these resources are dedicated to the tinderbox effort: storage space especially is only partially given to the tinderbox, as it doesn’t need that much. I use this box for other things besides tinderboxing, some related to Gentoo, others to Free Software in general, and others still related to my work or my leisure.

That said, looking through the offers of OVH (which is what Mauro suggested to me before, and which seems to have the most friendly staff and good prices), a dedicated server with more or less the same specs as Yamato costs around €1800/year. It’s definitely too much for me to invest in the Tinderbox project alone, but it’s not absolutely too much (buying the hardware would cost me more, and this outsources all the hardware maintenance problems). Considering two boxes, so that the out-of-tinderbox resources could also be shared (one box to hold the distfiles and tree, the other to deal with the logs), it would be €3600/year, with the ability to deal with both x86 and amd64 problems. Again, too much for me alone, but not absolutely too much.
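To make the arithmetic above explicit, here is a minimal sketch in Python. The only figure taken from the post is the roughly €1800/year per box; the function names are made up for illustration, and a real quote would also have to account for setup fees, taxes and the like.

```python
# Approximate yearly price of one Yamato-class dedicated server, as quoted
# in the post. Everything else in this sketch is illustrative.
YEARLY_COST_PER_BOX_EUR = 1800


def yearly_cost(boxes: int, per_box_eur: int = YEARLY_COST_PER_BOX_EUR) -> int:
    """Total yearly rental cost in euros for `boxes` identical servers."""
    if boxes < 0:
        raise ValueError("boxes must not be negative")
    return boxes * per_box_eur


def monthly_cost(boxes: int, per_box_eur: int = YEARLY_COST_PER_BOX_EUR) -> float:
    """Same cost spread evenly over twelve months."""
    return yearly_cost(boxes, per_box_eur) / 12


for boxes in (1, 2):
    print(f"{boxes} box(es): {yearly_cost(boxes)} EUR/year, "
          f"{monthly_cost(boxes):.2f} EUR/month")
```

Seen per month, a single box works out to €150 and the two-box setup to €300, which is the kind of number a donation drive could be framed around.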

Let me consider how resources are actually used right now, one by one.

The on-disk space used by the Tinderbox is relatively difficult to evaluate properly: I use separate but shared partitions for the distfiles, the build directories and the installed system. As far as the distfiles and the tree are concerned, I could get some partial result from the mirrors’ statistics; right now I can tell you that the distfiles directory on Yamato is 127GB. The installed system is instead 73GB (right now), while the space needed for the build directories never went over 80GB (which is the maximum size of the partition I use), with the nasty exception of qemu, which fills the partition entirely. So in general, it doesn’t need that much hard disk space.
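Adding those figures up gives a rough storage budget. The sketch below uses only the three sizes quoted above (taking the 80GB partition cap as the worst case for the build directories); the helper names and the disk size passed to `headroom_gb` are assumptions for illustration.

```python
# Sizes in GB as reported in the post. The build directory figure is the
# partition cap, so it is an upper bound rather than typical usage.
USAGE_GB = {
    "distfiles": 127,
    "installed system": 73,
    "build directories (partition cap)": 80,
}


def total_usage_gb(usage: dict[str, int]) -> int:
    """Sum of all storage areas in GB."""
    return sum(usage.values())


def headroom_gb(disk_gb: int, usage: dict[str, int]) -> int:
    """Space left on a disk of `disk_gb` GB after the tinderbox areas."""
    return disk_gb - total_usage_gb(usage)


print(f"Total: {total_usage_gb(USAGE_GB)} GB")
print(f"Headroom on a 500 GB volume: {headroom_gb(500, USAGE_GB)} GB")
```

With these numbers the tinderbox needs about 280GB, so a 500GB volume would leave roughly 220GB free for logs and growth of the distfiles.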

Network traffic is not much of a problem either, I’d say: besides the first round of fetching all the needed distfiles, I don’t usually fetch more than 1GB/sync (unless big stuff like a new KDE4 release is handled). This would also be largely moot if the internal network had its own Gentoo mirror (not sure if that’s the case for OVH, but I’ll get to that later).

So the only problem is CPU and I/O usage, which is what a dedicated server is all about, so I guess there is no problem there at all. Whoever would end up hosting the (let’s assume) two tinderboxes would only have to mind the inter-box traffic, which is usually not a problem either if they are physically on the same network. I looked into OVH because that’s what was suggested to me; I also checked out the prices for Bytemark, which is already a Gentoo sponsor, but at least the price to the public is in another league entirely. Ideally, given that I am going to be the one investing the man-hours to run the tinderbox, I’d like the boxes to be located in Europe rather than in America, where as far as I know most of Gentoo’s current infrastructure is. If you have any other possible option you can share, I’d very much like to compare, first of all, the prices to the public for various providers, given a configuration in this range: LXC-compatible kernel, bi-quad CPU with large cache, 16GB RAM minimum, 500GB RAID1 storage (backup is not necessary).

Now, I said that I cannot afford to pay for even one single dedicated server for the tinderbox, so why am I pondering this? Well, as many asked before, “Why is this not on official Gentoo infra?” is a question I’m not sure how to answer; last I knew, infra wasn’t interested in this kind of work. On the other hand, even if it’s not proper infra, I’d very much like to have some numbers, so that I can propose that the Gentoo Foundation pay for the effort. This would allow extending the reach of the tinderbox without having me begging for help every other day (I would most likely not stop using Yamato for tinderboxing, but two more instances would probably help).

Also, even if the Foundation didn’t directly have the kind of money to sustain this for a long period, it might still be better to have them pay for it, sustained by users’ donations. I cannot clear that kind of money through my books directly, but the Gentoo Foundation might help with that.

So it is important, in my opinion, to have a clear figure, and objective, for the kind of money it would cost. It would also help to have some kind of status like “Tinderboxes covered to run for X months, keep them running”.
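Such a status line could be derived from the donations received so far. This is only a sketch: the €1800/year default comes from the price quoted earlier, while the function names and the donation amount in the example are hypothetical.

```python
def months_covered(funds_eur: int, boxes: int,
                   yearly_cost_per_box_eur: int = 1800) -> int:
    """Whole months of rental that `funds_eur` pays for.

    Uses integer arithmetic so that partial months are never counted.
    """
    if boxes <= 0 or yearly_cost_per_box_eur <= 0:
        raise ValueError("boxes and yearly cost must be positive")
    if funds_eur < 0:
        raise ValueError("funds must not be negative")
    yearly_total = boxes * yearly_cost_per_box_eur
    return (funds_eur * 12) // yearly_total


def status_line(funds_eur: int, boxes: int,
                yearly_cost_per_box_eur: int = 1800) -> str:
    """Format the status message suggested in the post."""
    months = months_covered(funds_eur, boxes, yearly_cost_per_box_eur)
    return f"Tinderboxes covered to run for {months} months, keep them running"


# Hypothetical example: 900 EUR collected, two boxes rented.
print(status_line(900, 2))
```

Rounding down is deliberate here: a month that is only partly paid for should not be advertised as covered.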

And before somebody wonders: this is all my crazy idea. I haven’t even tried to talk with the Foundation yet; I’ll do so once I can at least present some data to them.

Comments 17
  1. Harris, as far as I can see, EC2 works “per hour” for almost all options, which makes it quite unusable for the Tinderbox: it runs 24/7, and it needs a lot of power to be dedicated to that task, and very little bandwidth… it makes a good case for a dedicated server, but not for any virtual solution, including EC2, I’d say.

  2. OVH has an internal Gentoo mirror. They also provide default Gentoo installs (though not always up to date). However, what about SSDs or RAID0+1?

  3. Diego,

     Have you talked to Mark (Halcy0n) recently? You and he share similar goals, his being to get a tinderbox for every arch. He even recently secured at least one ia64 box for developer use. Be sure to hook up with him before requesting Foundation $$. 🙂 IIRC, guppy is the latest one he secured, which looks like it is plenty powerful enough to compile all day long, and you aren’t paying the power bill. :)

     I feel like x86/amd64 boxes would be even easier to find a sponsor for. Or a cheaper route would be for the Foundation to buy a new box that is colo’d at a sponsor site (OSL, for example)…

     And FWIW, infra could probably secure a few more machines if there were use cases for such. Now, the bottleneck is admins, not sponsors.

     Good luck.

  4. Oh, my previous comment was about doing it remotely. I didn’t see the comment on your previous post, being: “The one big issue for which I don’t really feel comfortable with running the tinderbox remotely is easy access to the logs for identifying problems and, especially, to attach them to bug reports.”

     So, I apologize for not reading thoroughly, but this post made it sound like you were/are looking for remote options. 🙂

  5. Jeremy, Mark’s work is being carried out as an extension of what I’m doing myself, in the sense that he’s using my scripts, so yeah, don’t worry, I am in contact and in sync with him.

     And yes, I’m now looking for remote options if I can get a better handling of logs that doesn’t require me to have all the 200+MB of logs at hand when I’m reporting bugs. Although that’s not yet complete, focusing on that might be more important if there is a chance of getting other boxes to do the work.

     As for co-locating the box at OSL, the problem would, as I said, mostly be the distance: from Europe to OSL I don’t get that fast of a connection, and thus I’d be pretty much slowed down. Given I’m going to be the one pouring in the man-hours to get the logs converted into usable bug reports (which I can assure you is nowhere near easy), I’d rather have the box near me than at OSL.

     And the admins bottleneck is the reason why I’d like to have a pricetag for a dedicated server rather than a co-located one: you don’t have to take care of hardware, setup and stuff like that. Since it doesn’t end up providing a strict service to Gentoo, it shouldn’t really matter whether it allows LDAP auth or whatever else, in my opinion. And just having a couple of people handling the updates should be enough in that sense.

     @pankakke: good to hear! That is definitely going to be one of the important notes for (eventually) selecting the provider!

  6. Your response is understood. Just bouncing some ideas around. 🙂 OSL was just an example; there are sponsors closer to you as well: UK, Germany, France, Amsterdam.

     Look forward to your “pricetag” analysis. It is an interesting topic to me. Also interesting is selecting a dedicated hosting provider. I have looked at many and never liked the choices.

  7. Have you thought about approaching OVH directly? They use Gentoo a lot internally and I wouldn’t be surprised if Octave (the boss of OVH) let you have a box or two at reduced rates or maybe even for free.

  8. Diego, why not the kimsufi dedicated servers from OVH? As you just require CPU/IO power, they are cheaper than the normal dedicated servers; they give you more space (you get 4 HD instead of 1) and a better hardware RAID controller: you get a RAID 0/1/5 controller, so you can do a RAID 0 with all four HDs for maximum performance (but in case of damage you must re-install everything).

     The i7-4T model just costs 720€/year instead of 1800€/year (and if you obtain the 1800€/year funds you’ll be able to buy *two* i7-4T, acquiring more CPU power than a single bi-quad dedicated server).

     As pointed out by @rosbif, why not involve OVH in the idea together with the Gentoo Foundation? I know that OVH is a huge fan of the Gentoo distro and I’m quite sure that they can give us some free dedicated servers in exchange for some Gentoo Foundation services: 1. free advertising on improve the OVH services related to Gentoo with the help of the Gentoo developers?

     P.S.: I’m happy to see that my initial suggestion of a “dedicated server paid by the Gentoo Foundation” has been considered.

  9. Well, just forget my previous sentence about the RAID0 with the 4 HD, because the i7-4T comes with a 3Ware controller, so your I/O will never hit the CPU; you’ll gain faster I/O performance than with a normal SATA2 controller without needing the RAID features, definitively fixing the I/O bottleneck for free (RAID features are not free on OVH).

  10. Wouldn’t it be possible to have a local copy of /var/log and/or /var/tmp/portage itself, compressed and synced once a day and retrieved to local storage for your own use? Even from multiple boxes? Would that remove your objection to distance?

  11. @joe user: Hetzner is a low-quality service, and you must pay for bandwidth traffic and a lot of extra fees (which are free with OVH).

  12. If you could devise a control scheme that didn’t require direct access you could have hundreds of tinderboxes.

     Off the top of my head, a .profile that listed required resources x, required time n, followed by scripted instructions to bring the machine into the ‘start’ state, instructions for what to build, in what order. I’m suggesting this because if I put an effort into it I could provide a few different hardware configs but I could not at all offer remote access. I would be shocked if there weren’t dozens of people who could do the same if you didn’t require direct access.

     If what I described above existed you would probably find within the ‘group’ several people you trust to do competent debugging, and whatever resources you have direct access to (by rent or whatever means) could be dedicated to reproducing/resolving already found problems.

     You wouldn’t have to come up with some bulletproof scheme out of the chute; you limit initial participation to people with a minimum skill set, hours to volunteer, etc. Maybe 5 or 6 people that could help find a solution to the log analysis problem. If it takes 6 months or a year (or two) for the abstraction(s) to get ironed out enough to open up participation, so what? You’ve been doing this for maybe 5 years now; I’d say there’s a good chance you’ll still feel the tinderbox is useful two years from now.

     Just an idea, I’m not insulted if you don’t bother explaining why you don’t like it. I mean, don’t spend minutes that are better spent elsewhere.

     Rob

  13. @equilibrium That’s wrong: traffic up to 1TB is included, and Diego really doesn’t need the pay-for extras. Also, they’re providing higher quality than OVH, and they take the wishes of their customers into account and, if possible, put them into practice.

     We are talking about dedicated servers, not housing, and even in housing Hetzner is much cheaper than OVH…

     And last but not least, OVH has more than once had bigger peering problems outside of France…

  14. @user99 the tinderbox produces something around 300MB of logs a week; it’s not really feasible to download all of them for analysis: it has to be done locally, where the config.log and other diagnostic logs are found.

     @Joe @equilibrium I can probably get the tinderbox to work on smaller CPUs (the i7 is somewhat smaller from some points of view, since a Quad HT is not the same as a BiQuad), but both memory and I/O speed are important. Kimsufi seems more interesting than Hetzner, but I’m not sure about either right now, to be honest.

     @rlarkin as I “said before”:… there is no space to make the tinderbox distributed: it requires all the packages to be together. Replicating the same environment on dozens of boxes is just going to waste time, because the exact same bugs will be found and reported all over, and I’m pretty sure half the people wouldn’t be able to understand what the bugs are and how to solve them.

     One of the things that made the tinderbox work up to now is that I know most of the time where the bugs go and how to fix them. I’ve not been doing it for five years, but rather one and a half, and I don’t think that just throwing away everything for two years to get a new “abstraction” in it is going to do any good. A Google SoC project was there to produce “the final CI tool for Gentoo” but it has produced nothing as far as I can see. I’d rather keep doing it myself than wait for somebody else to come up with a solution, because *nobody has done that in quite a while*!

  15. Regarding Mark’s tinderbox: too bad he didn’t announce anything about what he’s doing. TBH it’s a bit sad we were only able to get a box for an architecture almost nobody uses since it’s so expensive, and that it has only one dev (me) apart from vapier.
