Sharing distfiles directories, some lesser known tricks

I was curious to see why there was so much traffic coming from the Gentoo Wiki LXC page to my blog lately, so I decided to peek at the documentation there once again, slightly more thoroughly; while there are a number of things that could likely be simplified (but unless I’m paid to do so I’m not going to work on documenting LXC until it’s stable enough upstream; as it is, it might just be a waste of time, since they change the format of the configuration files every other release), there is one thing that really baffled me:

If you want to share distfiles from your host, you might want to look at unionfs, which lets you mount the files from your host’s /usr/portage/distfiles/ read-only whilst enabling read-write within the guest.

While unionfs is the kind of feature that goes quite well with LXC, for something as simple as this it is quite overkill. On the other hand, I remembered that I also had to ask Zac about this, so it might be a good idea to document it for a moment.

Portage already allows you to share a read-only distfiles directory.

You probably remember that I originally started using Kerberos because I wanted some safe way to share a few directories between Yamato and the other boxes. One of the common situations I had was the problem of sharing the distfiles, since quite a few are big enough that I’d rather not download them twice. Well, even with that implemented, I couldn’t get Portage to properly access them, so I looked for alternative ways. And there is a variable that solves everything: PORTAGE_RO_DISTDIRS.

How does that work? Simply mount some path such as /var/cache/portage/distfiles/remote as the remote distfiles directory, then set PORTAGE_RO_DISTDIRS to that value. If the required files are in the read-only directory, and not in the read-write one, Portage will symlink them and ensure access on a stable path. Depending on your setup, you might want to export/import the directory as NFS, or – in case of LXC – you might want to set it up as a read-only bind mount. Whatever you choose, you don’t need kernel support for unionfs to use a read-only distfiles directory. Okay?
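A minimal sketch of the LXC variant, assuming the guest’s root lives under /lxc/tinderbox and the standard distfiles paths (both are illustrative, adjust to your own layout):

```shell
# /etc/fstab on the host: read-only bind mount of the host's distfiles
# into the guest's filesystem (on older kernels the ro flag on a bind
# mount may only take effect after a remount):
/usr/portage/distfiles  /lxc/tinderbox/var/cache/portage/distfiles/remote  none  bind,ro  0 0

# /etc/portage/make.conf inside the guest:
PORTAGE_RO_DISTDIRS="/var/cache/portage/distfiles/remote"
```

With that in place, a fetch for a file already present in the read-only directory ends up as a symlink inside the guest’s regular distfiles directory, as described above.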

Gentoo, a three-headed dog, and me — Kerberos coming to PAM

I’ve been fighting the past few days to find a solution to strengthen the internal security of my network. I’m doing this for two main reasons: on one side, having working IPv6 on the network means that I either have to set up a front-end firewall on the router, or add firewalls on all the devices, and that’s not really so nice to do; on the other side, I’d like to give access to either containers (such as the tinderbox) or other virtual machines to other people, developers for Gentoo or other projects.

I’m not ready to just give access to them as the network is now, because some of the containers and VMs still have password logins, and from there, well, there would be access to some stuff that is better kept private. Even though I might trust some of the people I’m thinking of giving access to, I won’t trust anybody else’s security practices with access to my system. And this is even more critical since I have/had NFS-writeable directories around the system, including the distfiles cache that the tinderbox works with.

Unfortunately, most of the alternatives I know of only work with a single user ID, which means, among other things, that I can’t use them with Portage. So I decided to give NFSv4 and Kerberos a try. Sincerely, I’m not sure if I’ll stick with it, since it makes the whole system a lot more complex and, as I’ll show in a moment, it also doesn’t really solve my problem at its root, so it’s of little use to me.

The first problem is that the NFSv4 client support in Gentoo seems to have been totally broken up to now: bug #293593 was causing one of the necessary services to simply kill itself rather than run properly, and it was fun to debug. There is a library (libgssglue) that is used to select one out of a series of GSSAPI providers (Kerberos or others); interestingly enough, this is yet another workaround for Linux missing libmap.conf, and for the general problem of multiple implementations of the same interface. This library provides symbols that are also provided by the MIT-KRB5 GSSAPI library (libkrb5_gssapi); when linking the GSS client daemon (rpc.gssd) it has to link both, explicitly causing symbol collisions, sigh. Unfortunately this failed for two reasons: the .la files for libtirpc (don’t ask) caused libtool to reorder the linking of the libraries, pulling in the wrong symbols (bad, and it shows again why we should be dropping those damn files), plus there was a stupid typo in the configure.ac file of nfs-utils, where instead of setting an empty enable_nfsv41 variable it set enable_nfsv4, which in turn caused libgssglue not to be searched for at all.

The second problem is that right now, as I know all too well, we have no support for Kerberos in the PAM configuration for Gentoo; this is one of the reasons why I was considering more complex PAM configurations. The main problem is that most of the configurations you find in tutorials, including those that were proposed to me, make use of pam_deny to allow using either pam_unix or pam_krb5 at the same time; this in turn breaks the proper login chain used, for instance, by the GNOME Keyring. So I actually spent some time finding a possible solution to this. Later today, when I have some extra time, I’ll be publishing a new pambase package with Kerberos support. Note: this, and probably a lot more features of pambase, will require Linux-PAM. This is because the OpenPAM syntax is just too basic, while Linux-PAM allows much more flexibility. Somebody will have to make sure that it can work properly on FreeBSD!
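For the curious, the shape of the fix is roughly this (a sketch in Linux-PAM syntax; the module options are illustrative and not necessarily what pambase will ship): instead of a pam_deny fallback, the stack skips pam_unix when pam_krb5 already succeeded, so the login chain always terminates through modules that session helpers like the GNOME Keyring can hook into.

```shell
# /etc/pam.d/system-auth, auth section (sketch, Linux-PAM only):
# on pam_krb5 success, jump over pam_unix; on failure, fall through
# to a regular local-password check instead of an outright pam_deny.
auth    [success=1 default=ignore]   pam_krb5.so   try_first_pass
auth    required                     pam_unix.so   try_first_pass
auth    optional                     pam_permit.so
```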

There is also a request for pam_ccreds, to cache the credentials when running offline; I’m curious about it, but upstream does not seem to be working on it as much as it should, so I’m not sure if it’s a good solution.

Unfortunately, as I said, NFSv4 does not seem like such a good solution; besides the still-missing IPv6 support (which would have been nice to have, but isn’t required for me), if I export the distfiles over NFSv4 (with or without Kerberos), the ebuild fetch process remains stuck in D state (blocked on I/O wait). And if I try to force-unmount the mounted, blocked filesystem, the laptop kernel panics entirely. Now, to make things easier for myself, I’m re-using a Gentoo virtual machine (which I last used for writing a patch for SCTP support in the kernel) to see if I can reproduce the problem there, and get to fix it, one way or another.

Unfortunately I’ve spent the whole night working and trying to get this going, so now I’ll try to get some rest at least (it’s 9.30am, sigh!). All the other fixes will wait for tomorrow. On the other hand, I’d welcome a thank-you if you find the help on Kerberos useful; organisations who would like to have even better Gentoo support for Kerberos are welcome to contact me as well…

LXC and why it’s not prime-time yet

Lately I got a number of new requests about the status of LXC (Linux Containers) support in Gentoo; I guess this is natural given that I have blogged a bit about it and my own tinderbox system relies on it heavily to avoid polluting my main workstation’s processes with the services used by the compile – and especially test – phases. Since a new version was released on Sunday, I guess I should write again on the subject.

I said before that in my opinion LXC is not ready yet for production use, and I maintain that opinion today. I would also rephrase it in a way that might make it easier to understand what I think: I would never trust root on a container to somebody I wouldn’t trust with root on the host. While it helps a great deal to reduce the nasty effects of an application mistakenly going rogue, it neither removes the problem entirely, nor does it strengthen security against intentional meddling with the system. Not alone, at least. Not as it is.

The first problem is something I have already complained about: LXC shares the same kernel, obviously and by design; this is good because you don’t have to replicate drivers, resources, or additional filesystem layers, so you get real native performance out of it; on the other hand, this also means that where the kernel does not provide namespace/cgroup isolation, it does not allow distinct settings between the host system and the container. For instance, the kernel log buffer is still shared between the two, which causes no small problems when running a logger from within the container (you can do so, but you have to remember to stop it from accessing the kernel’s log). You also can’t change sysctl values between the host and the container, for instance to disable the brk() randomizer that causes trouble with a few LISP implementations.

But there are even more notes that make the whole situation pretty interesting. For instance, with the latest release (0.7.0), networking seems to have slowed down slightly; I’m not sure what the problem is exactly, but for some reason it takes quite a bit longer to connect to the container than it used to; nothing major, so I don’t have to pay excessive attention to it. On the other hand, I took the chance to try again to make it work with a macvlan network rather than the virtual Ethernet network, this time even googling around to find a solution to my problem.

Now, Virtual Ethernet (veth) is not too bad; it creates a peer-to-peer connection between the host and the container, which you can then manage as you see fit: you can set up your system as a router, or use Linux’s ability to work as a bridge to join the container’s network with your base network. I usually do the latter, since it reduces the number of hops I need to cross to reach the Internet. Of course, while all the management is done in-kernel, I guess there are a lot of internal hops that have to be passed, and for a moment I thought that might have been what was slowing down the connection. Given that the tinderbox accesses the network quite a bit (I use SSH to control it), I thought macvlan would be simpler: in that case, the kernel directs the packets addressed to a specific MAC address through the virtual connection of the container.

But the way LXC does it, it’s one-way. By default, each macvlan interface you create isolates the various containers from one another as well; you can change the mode to “bridge”, in which case the containers can chat with one another, but even then, the containers are isolated from the host. I guess the problem is that when they send packets, these get sent out of the interface they are bound to, but the kernel will ignore them if they are directed back in. No, there is currently no way to deal with that, that I know of.
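In configuration terms (0.7-era keys; br0 and eth0 are assumptions about your host setup), the two setups discussed above look like this:

```shell
# veth pair bridged with the host network (what I normally use;
# assumes a br0 bridge already configured on the host):
lxc.network.type  = veth
lxc.network.link  = br0
lxc.network.flags = up

# macvlan in bridge mode: containers can talk to each other,
# but traffic to the host itself still gets dropped:
lxc.network.type         = macvlan
lxc.network.macvlan.mode = bridge
lxc.network.link         = eth0
lxc.network.flags        = up
```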

Actually upstream has stated that there is no way to deal with that right now at all. Sigh.

An additional problem with LXC is that even when you do blacklist all the devices, so that the container’s users don’t have access to the actual underlying hardware, it can still mess up your host system quite a bit. For instance, if you were to start and stop the nfs init script inside the container… you’d be disabling the host’s NFS server.
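The device blacklisting I’m referring to is the usual cgroup whitelist in the container configuration (a minimal sketch; the device numbers are the standard Linux ones), which, as the nfs example shows, does nothing for shared kernel state:

```shell
lxc.cgroup.devices.deny  = a                # deny everything by default
lxc.cgroup.devices.allow = c 1:3 rwm        # /dev/null
lxc.cgroup.devices.allow = c 1:5 rwm        # /dev/zero
lxc.cgroup.devices.allow = c 1:9 rwm        # /dev/urandom
lxc.cgroup.devices.allow = c 5:2 rwm        # /dev/ptmx
lxc.cgroup.devices.allow = c 136:* rwm      # /dev/pts/*
```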

And yes, I know I have promised multiple times to add an init script to the ebuild; I’ll try to update it soonish.

Productivity improvement weekend

This weekend I’m going to try my best to improve my own productivity. Why do I say that? Well, there are quite a few reasons for it. The first is that I spent the last week working full time on feng, rewriting the older code to replace it with simpler, more tested and especially better documented code. This is not an easy task, especially because you often end up rewriting other parts to play nicely with the new ones; indeed, to replace bufferpool, Luca and I rewrote almost the entire networking code.

Then there is the fact that I finally got a quote for replacing the logic board of my MacBook Pro that broke a couple of weeks ago: €1K! That’s almost as much as a new laptop; sure, not of the same class, but still. In the meantime I bought an iMac; I needed access to QuickTime, even more than I knew before, because we currently don’t have a proper RTSP client: MPlayer does not support seeking, FFplay is broken by a few problems, and VLC also does not behave in a very standards-compliant way. QuickTime, instead, is quite well mannered. But this means I have spent money to go on with the job, which is, well, not exactly the nicest thing you can do when you need to pay off some older debts too.

So it means I have to work more; not only do I have to continue my work on lscube full time, but I’m also going to have to take more jobs on the side; I’ve been asked about a few projects already, but most seem to require me to learn new frameworks or even new programming languages, which means they require quite a big effort. I need the money, so I’ll probably pick them up, but it’s far from optimal. I’ve also put on nearly-permanent hold the idea of writing an autotools guide, either as an open book or a real book; the former has shown no interest among readers of my blog, the latter no interest among publishers. I’m starting to feel like an endangered species as far as autotools is concerned, alas.

But since at least for lscube I need to have access to the FFmpeg mailing list, and I need access to the PulseAudio mailing list for another project, and so on and so forth, I need to solve one problem I already wrote about: purging GMail labels out of older messages. I really need this to be solved, but I’m still not totally in luck. Thanks to identi.ca, I was able to get the name of a script that is designed to solve the same problem: imap-purge. Unfortunately there is a problem with one GMail quirk: deleting a message from a “folder” (actually a GMail label) does not delete the message from the server, it only detaches the label from that message; to delete a message from the server you’ve got to move it to the Trash folder (and either empty it or wait 30 days so that it gets deleted). I tried modifying imap-purge to do that, but my Perl is nearly non-existent and I couldn’t even grok the documentation of Mail-IMAPClient regarding the move function.
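For reference, the logic I’d need, sketched here with Python’s imaplib rather than Perl since that’s closer to what I can read (note that “[Gmail]/Trash” is an assumption: localised accounts name that folder differently):

```python
import imaplib

GMAIL_TRASH = '"[Gmail]/Trash"'  # quoted, as the name contains special characters


def purge_label(conn, label):
    """Really delete every message filed under `label` on a GMail account.

    A plain STORE \\Deleted followed by EXPUNGE only detaches the label;
    copying the message to Trash first is what schedules the actual
    deletion on the server. Returns the number of messages moved.
    """
    conn.select('"%s"' % label)
    typ, data = conn.search(None, "ALL")
    nums = data[0].split()
    for num in nums:
        conn.copy(num, GMAIL_TRASH)            # the step imap-purge is missing
        conn.store(num, "+FLAGS", r"(\Deleted)")
    conn.expunge()
    return len(nums)


# Usage (hypothetical account):
#   conn = imaplib.IMAP4_SSL("imap.gmail.com")
#   conn.login("user@gmail.com", "password")
#   purge_label(conn, "old-mailing-list")
#   conn.logout()
```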

So this weekend either I find someone to patch imap-purge for me or I’ll have to write my own script based on its ideas in Ruby or something like that. Waste of time from one side, but should allow me to save time further on.

I also need to get synergy up to speed in Gentoo; there have been a few bugs opened regarding crashes and other problems, and requests for startup scripts and SVN snapshots; I’ll do my best to work on that so that I can actually use a single keyboard and mouse pair between Yamato and the iMac (which I called, with a little pun, USS Merrimac; okay, I’m a geek). Last time I tried this, I had some problems with synergy deciding to map/unmap keys to compensate for the keyboard differences between X11 and OS X; I hope I can get this solved this time, because one thing I hate is having different key layouts between the two.

I also have to find a decent way to have my documents available on both OS X and Linux at the same time, either by rsyncing them in the background or by sharing them over NFS. It would be easier if I had them available everywhere at once.

The tinderbox is currently not running, because I wouldn’t have time to review the build logs; in the past eight days I turned on the PlayStation 3 exactly twice, once earlier today to try relaxing with Street Fighter IV (I wasn’t able to), and the other time just to try one thing about UPnP and HD content. I was barely able to watch last week’s Bill Maher episode, and not much more. I seriously lack that precious resource that is time. And this is after I showed the thing called “real life” almost entirely out of the door.

I sincerely feel absolutely energy-deprived; I guess it’s also because I didn’t have my after-lunch coffee, but there are currently two salesmen boring my mother with some vacuum cleaner downstairs and I’d rather not go meet them. Sigh. I wish life were easy, at least once a year.

Filesystems — take two

After the problem last week with XFS, today seems like a second take.

I woke up this morning to a reply about my HFS+ export patch, telling me that I have to implement the get_parent interface to make sure that NFS works even when the dentry cache is empty (which is most likely what caused some issues with iTunes while I was doing my conversion); fair enough, I started working on it.

And while I was actually working on it, I found that the tinderbox wasn’t compiling. A dmesg later showed that, once again, XFS had in-memory corruption, and I had to restart the box again. Thankfully, I had my SysRescue USB stick, which allowed me to check the filesystem before restarting.

Now this brings me to a couple of problems I have to solve. The first is that I finally have to move /var/tmp to its own partition, so that /var does not get clobbered if/when the filesystem goes crazy; the second is that I have to consider alternatives to XFS for my filesystems. My home is already using ext3, but I don’t need performance there, so it doesn’t matter much; my root partition is using JFS, since that’s what I tried when I reinstalled the system last year, although it didn’t turn out very well, and the resize support actually ate my data away.

Since I don’t care if my data gets eaten away on /var/tmp (the worst that might happen is me losing a patch I’m working on, or not being able to fetch the config.log of a failed package – and that is something I’ve been thinking about already), I think I’ll try something more “hardcore” and see how it goes: I’ll use ext4 for /var/tmp, unless it panics my kernel, in which case I’m going to try JFS again.
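The first change is trivial enough; a sketch of the fstab entry I have in mind (the device name is obviously illustrative):

```shell
# /etc/fstab: /var/tmp on its own partition, so a filesystem gone
# crazy there can't clobber the rest of /var:
/dev/sdb1   /var/tmp   ext4   noatime   0 2
```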

Oh well, time to resume my tasks I guess!

The distributed nightmare has ended!

Okay, tonight just a quick post. First of all, somewhat unrelated to the title (okay, still distributed stuff, but not a nightmare), GIT was discarded as an option for xine-lib’s future Version Control System; we instead decided to go with Mercurial because of its availability on a wider range of platforms (GIT is not officially supported on Solaris at least, and Cogito does not work 100% on FreeBSD); I also cleared up my previous doubt about it: both the speed and the features of the two seem to be at the same level, so portability was the deciding requirement. Darren has already converted the CVS, creating a bunch of repositories (due to branching), and it also got support for the user ID to author name translation (which means you’ll see the commits coming from Diego ‘Flameeyes’ Pettenò rather than from an anonymous dgp85).

With reference to my distributed nightmare, instead, I want to thank Mike (vapier) a TON: his latest version of nfs-utils (1.0.12-r2) finally fixes my problem with FreeBSD clients, so I upgraded Farragut (breaking ServoFlame in the meantime, I’m afraid) and now I’m putting Prakesh back into shape so I can start playing with Tomcat again.

So I don’t have to go crazy with Coda, nor do I have to buy a new box right now; I’ll eventually do it, as I’m starting to feel this box is slow, but for now it’s fine as it is, without more money going away; this is good because I can then buy the new Yu-Gi-Oh game as soon as it’s released ;)

Anyway, tomorrow is a working day again, so I should be sleeping; the DST switch threw me off, and I was awake all last night with the jigsaw my sister gave me… 121×80, 3000 pieces.

My distributed nightmare

For Gentoo/FreeBSD I ended up having three extra boxes, besides my main workstation (Enterprise) and my laptop (Intrepid), in my home office, all of which share a good deal of data, as they all need a Portage tree, a distfiles directory, a few overlays and a bit of data like the repositories I work with, so that I can just update them on one box and build on two or three.

To avoid having to copy the data onto all of them (which, other than being slow and impractical, would also greatly reduce the disk space I have, and for all of them but the sparc that’s actually quite a bad thing), I used to use NFS to share most of the data between them; after all, they are all on the same network segment, so that is not a big problem.

Unfortunately, I have known for a while that the server in nfs-utils-1.0.8 and later versions conflicts with FreeBSD’s client. For some reason, it returns a lot of «Stale NFS file handle» errors when accessing more than a handful of files (which means I can’t merge anything, nor can I do my usual daily backups of git and typo). I left the newer version masked locally up to now, but now nfs-utils was broken by a libevent bump, and it doesn’t rebuild; I’m not sure why, and I don’t feel like wasting my time trying to keep nfs-utils-1.0.7 alive.

So I’ve started looking for alternative solutions; unfortunately it’s far from easy.

I first looked at alternative network/distributed filesystems. Unfortunately the requirement is that they work fine on both Linux (server) and FreeBSD (client), which rules out AFS and Lustre, both of which are not supported on FreeBSD — okay, AFS is somewhat supported, but a lot of noise about problems doesn’t really make me expect anything good out of it. What I see that could still work is CIFS or CodaFS.

The first I ruled out because I tried it before, and not only is it far from practical, it’s also slow as hell on a 100Mbit network. Only the latter is left.

And again, a problem. net-fs/coda is currently available only on PPC and x86. When I tried just building a package of it on AMD64, it failed because the DEPEND/RDEPEND settings are wrong. When I then wanted to merge it, two packages failed because of --as-needed related issues, and when I finally came to merge net-fs/coda I understood why only those arches had it marked: it’s 64-bit unsafe.

But this is just the tip of the iceberg. Coda and its related packages currently in Portage are quite old compared to the latest versions available on the site. Maybe they already fixed the two --as-needed problems I found and fixed myself tonight; maybe they also fixed the 64-bit uncleanness, as there is a new minor release (6.1 versus 6.0). I’m not sure if Griffon26 just stopped using Coda or if there are other reasons not to have it updated in Portage, but in the next days I might take a look at improving the situation, in my overlay at least (where you can also find the two --as-needed fixes).

If CodaFS won’t let me get what I need (a central repository that can be offline or online in general, but only needs to be online when used), I’ll have to find an alternative solution. The easiest route would be to make Farragut also handle the NFS serving, but considering that I use the data mostly from Enterprise while I develop, I likely can’t let it run on an old and slow IDE drive, and the size of my current share is not going to fit on it either. It’s also bad that I’d have to keep all this data on the web server.

My only solution here would be to get another box to run as an NFS server and maybe something more, something running FreeBSD that wouldn’t make much noise, so that it can stay online together with Farragut. Unfortunately I don’t think I can afford a new box just yet; maybe in a couple of months… in the meantime, if NFS is broken, I simply won’t upgrade Farragut for a while, and will find a different way to send the backups from one box to the other for safety.

Third strike, I’m out for a while…

Seems like Prakesh is not starting its service in the best of ways. As the second setup, suggested by Javier, also failed to work (boot0 refused to find the correct partition to boot from), I’m now back at the same starting point.

Suspecting a problem with geom module loading, I spent the night trying to get a custom kernel to build, but for some reason it tried to re-symlink files that were already symlinked, failing badly… not sure why, but then I simply gave up and decided to go with a more normal setup, without geom, stripes or anything: just / and /var/tmp on two different disks, so that I can have the catalyst stuff on a separate disk too.

And now, after two install attempts that worked more or less fine (up to the emerge -e world point at least)… NFS does not work anymore! Prakesh is able to mount the filesystems exported by Enterprise, but as soon as I try to run an ls on them I get a “bad file descriptor” error.

I’ve already tried restarting FreeSBIE, and even the nfs daemon on Enterprise… next I’ll be restarting Enterprise altogether, but I’m pretty clueless about why that’s happening…

Sigh, this is not my week for sure, and I’m afraid it will turn out not to be my year either, if it continues this way.

The lovely NFS…

Or maybe not so lovely. A bit of background to start: for Gentoo/ALT, and in general for portability work, I have here three machines that use non-Linux operating systems: defiant (Solaris 10), farragut (Gentoo/FreeBSD 5.4) and phoenix (OpenBSD 3.something). To share ebuilds easily, or code when I’m hacking at it with quilt (needing GNU userland), I make use of a simple /var/portage NFS export where I have the CVS checkouts of the trees, the distfiles and so on. The share is mounted automatically by defiant, as it’s used for portage handling.

Yesterday, I found out that after rebooting with an updated nfs-utils I was unable to mount the NFS share from defiant. I had no time to look at it yesterday, as when it fails to mount it also fails to start (yeah, I know I should set a limit on the retries, but whatever), but today I wanted to. It seems like it stopped accepting the box as a known one, stating that it was an “unknown” host.

I thought of a problem with NFSv4, so I stepped back to the nonfsv4 flag and tried with that, but nothing… after looking a bit more deeply, I found that specifying the exact IP of the machine, instead of the 192.168.. form I was using before, made it work. I then tried the alternative 192.168.0.0: nothing again. I resolved it by putting an explicit 192.168.0.0/16, which seems to work fine. This makes me wonder if the manpage is, then, outdated…
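In /etc/exports terms, and assuming the wildcard I had was of the hostname style, the progression was roughly this (the path and the options are illustrative):

```shell
# /var/portage  192.168.*.*(rw,sync,no_subtree_check)     # wildcard: "unknown" host
# /var/portage  192.168.0.0(rw,sync,no_subtree_check)     # bare network address: refused too
/var/portage    192.168.0.0/16(rw,sync,no_subtree_check)  # explicit netmask: works
```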

Anyway, the issue is resolved, and if someone has similar problems, maybe this can help them :)

Oh, a couple of notes before leaving… thanks to everyone who stepped up for the KDE -arts thing; I was able to resolve it with phreak anyway, so the problem is gone :)

And thanks also to everyone who helped me last night to find a curriculum class in LaTeX :) yes, I’m looking for a job… I’m writing an article (in Italian) on portability (more comprehensive than the ones I wrote on NewsForge/Linux.com before), trying to apply with a magazine… wish me luck! :P