GNU’s definitely too messy for my taste

I’m fairly sure I’m not wrong in saying that the majority of this blog’s readers would agree that Open Source and Free Software development can produce better software by many different measures. Better overall quality is just theoretical; you have more portable code, more standard code, faster code, less memory-hungry code and so on. One of the measures you can pick is code cleanliness, but even that is very subjective. This is why I’m talking about taste.

So what’s the problem here? Well, while I think the end result of GNU software is generally very good, I find the code itself very messy and pretty much unreadable and unusable. I have criticised gnulib’s approach before, regarding the way it duplicates the same code over and over rather than providing a “portability” library to use where glibc is not used and the system libraries don’t provide enough functionality. Lately, I also noticed that it adds tons of redundant autoconf checks to the projects using it, some of which have been broken from time to time, in different ways (I had to fix diffutils and m4 the other day because they had an automagic dependency on libsigsegv, which changed its ABI on the tinderbox recently).

A couple of days ago I also had a bad face-to-face encounter with GNU coreutils code. The reason is that I’m working on a utility for a customer of mine, and I wanted to re-use the code from the fold command to split a text file into equally-sized lines (of course I could always use fold as a filter, but since the work needed to pass the proper parameters to it would be more than the generic work I needed to do on my side, I simply wanted to integrate the code). The license wasn’t much of an issue for me here: the utility I’m writing is not part of the business logic, so I have the power to make it available if the license of parts of the code required me to.

Unfortunately, a different problem came up: the code in fold.c is too messy to work with. The source file is not standalone; it depends on a huge tree of header files, each of which then depends on a pile of other sources, and this is not limited to the gnulib dependency. Nothing is clear in that code. There is a system.h header that defines a very wide range of different things: filesystem handling, integer types, allocation functions and so on. This header in turn depends on over ten other headers that provide different definitions and so on.

Just to make it worse, the headers define inline functions that depend on other external functions from other source files; and all of those end up requiring a truckload of autoconf checks that sound, at a minimum, silly for something as basic as fold. I actually tried cleaning up the code, but, well, the work was tougher than reimplementing the folding code altogether.

In the end, I didn’t have to reimplement the algorithm though; handling the whole multibyte encoding would have annoyed the hell out of me. I “folded back” (sorry for the pun) to using FreeBSD’s code. The fold.c file from FreeBSD’s source repository is self-contained, clean and straightforward (although, to be honest, I find it lacks a couple of “static” keywords that could have slightly reduced its overhead). The license is also more permissive, as we know.
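
To give an idea of how small the core of the task is, here is a minimal sketch of byte-oriented folding. It is neither the GNU nor the FreeBSD code: the function name fold_stream is made up for illustration, and it deliberately ignores tabs, backspaces and multibyte characters, which are exactly the parts that make the real implementations longer.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy `in` to `out`, inserting a newline whenever a
 * line would grow past `width` bytes. Byte-oriented only: no handling of
 * tabs, backspaces or multibyte characters.
 * Returns 0 on success, -1 on a zero width or an I/O error.
 */
int fold_stream(FILE *in, FILE *out, size_t width)
{
    size_t col = 0;
    int c;

    if (width == 0)
        return -1;

    while ((c = getc(in)) != EOF) {
        if (c == '\n') {
            /* An existing line break resets the column counter. */
            putc(c, out);
            col = 0;
            continue;
        }
        if (col == width) {
            /* The current line is full: break it before this byte. */
            putc('\n', out);
            col = 0;
        }
        putc(c, out);
        col++;
    }

    return (ferror(in) || ferror(out)) ? -1 : 0;
}
```

With a width of 3, the input "abcdefgh" comes out as the three lines "abc", "def" and "gh".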

In the end, I’ll probably try to make the utility open-source anyway, using a similarly permissive license, given that’s what I took part of the code from.

I don’t doubt that GNU’s code might be better in some regards; for instance it’s almost certain that the GNU code builds on more platforms and with more variations than FreeBSD’s, but to do so, it really has to overcomplicate the code to a point that readability is gone for good. Similarly, the GNU utilities tend to have more user-friendly features, with further parameters, but these GNU extensions cause “lock-ins” that mean that standard support in their programs is lacking. This is reflected in many aspects of GNU software: its complexity, its over-engineering, its non-standard extensions. And this is probably one of the reasons why GNU is sometimes frowned upon by other (pragmatic) Free Software developers… and why some people would very much like to stop talking about GNU/Linux.

GNU guys, I understand your projects’ aim, but please, could you refocus? Could you reduce your complexity? Give us a libgnucompat… make it GPL if you don’t want to have it LGPL; but move away from the code duplication; move away from the complexity in build systems, and from the complexity of webs of source and header files. Make your code readable again, linear again; make your utilities the best people could ask for. Please; pretty please.

Virtualisation WTF once again.

To test some more RTSP clients I’ve been working to get more virtual machines available in my system; to do so I first extended the space available in my system by connecting one more half-a-terabyte hard drive (removing the DVD burner from Yamato), and then started again working on a proper init script for KVM/Qemu (as Pavel already asked me before, and provided me with an example).

Speaking of which, if somebody were to send a USB or FireWire DVD burner my way I’d probably be quite happy; while I have three other DVD burners around (iMac, MacBook Pro and Compaq laptop), having one on Yamato came in useful from time to time; it’s not necessary, so wasting a SATA port on it was not really a good idea after all, but it’s still useful.

I started writing a simple script before leaving for my vacation and extended it a bit more yesterday. But in line with the usual virtualisation woes the results aren’t excessively positive:

  • FreeBSD 8 pre-releases no longer seem to kernel panic when run in qemu (the last beta I tried did, the latest rc available does not); on the other hand it does seem to have problems with the default network (it works if started after boot but not at boot); it works fine with e1000;
  • NetBSD is still a desperate case: with qemu (and VDE) no network card seems to work; e1000 is not even recognised, while the others end up timing out, silently or not; this is without ACPI enabled, and if I do enable ACPI, no network card seems to be detected; with KVM, it freezes during boot up, with or without ACPI;
  • Pavel already suggested a method using socat and the monitor socket for qemu to shut down the VM cleanly; the shutdown request will cause the qemu or kvm instance to send the ACPI signal (if configured!) and then it would shut down cleanly… the problem is that the method requires socat, which is quite broken (even in the 2-beta branch).

Let me explain what the problem is with socat: its build system tries to identify the size of various POD types that are used by the code; to do so it uses some autoconf trickery, the -Werror switch and relies on pointer comparison to work with two POD types of the same size, even if different. Guess what? That’s no longer the case. A warning sign was already present: the code started failing some time ago when -Wall was added to the flags, so the ebuild strips it. Does that tell you something?

I looked into sanitizing the test; the proper solution would be to use run-time tests rather than build-time tests, from what I can see; but even if that’s possible, it’s quite intrusive and it breaks cross-compilation. So I went to look at why the thing really needed to find the equivalents… and the result is that the code is definitely messy. It’s designed to work on pre-standard systems, and to stay compatible with so many different operating systems that fixing the build system up is going to require quite a bit of code hacking as well.
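
For comparison, here is a minimal sketch of a compile-only check that needs neither a test run nor pointer-comparison warnings. This is not socat’s test: the macro names are invented for illustration. It compares two integer types by size and signedness, and turns a mismatch into a hard compile error through a negative array size, which also works when cross-compiling; as far as I know, autoconf’s own size computation falls back to a similar compile-only trick when it cannot run programs.

```c
#include <stddef.h>

/* 1 if the integer type `t` is signed, 0 otherwise. */
#define TYPE_IS_SIGNED(t) ((t)-1 < (t)1)

/* 1 if integer types `a` and `b` have the same size and signedness. */
#define SAME_SHAPE(a, b) \
    (sizeof(a) == sizeof(b) && TYPE_IS_SIGNED(a) == TYPE_IS_SIGNED(b))

/*
 * Fail the build when `cond` is false: an array of size -1 is an error
 * with any compiler, independently of warning flags such as -Wall.
 */
#define BUILD_ASSERT(name, cond) \
    typedef char build_assert_##name[(cond) ? 1 : -1]

/* Example checks, evaluated entirely at compile time. */
BUILD_ASSERT(int_matches_itself, SAME_SHAPE(int, int));
BUILD_ASSERT(size_t_is_unsigned, !TYPE_IS_SIGNED(size_t));
```

A configure script could compile one such file per candidate type and pick the first one that builds, without ever executing the result.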

It would be much easier if netcat supported Unix local sockets, but no implementation I have used seems to. My solution to this problem is to replace socat with something else, based on a scripting language such as Perl, so that it’s just as portable and at the same time less prone to problems like those socat is facing now. I asked a few people to see if they can write up a replacement; hopefully this will bring us a decent one so we can kill that.
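
The part of socat that is actually needed here is tiny. As a rough sketch (in C rather than a scripting language, and with invented function names), connecting to a Unix socket and writing one monitor command looks like the following. It assumes the VM was started with its monitor on a Unix socket, for instance with -monitor unix:PATH,server,nowait, and that the command to send is system_powerdown; the socket path is whatever the init script chose.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect to the Unix stream socket at `path`; returns a descriptor or -1. */
int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Write exactly `len` bytes, retrying on short writes; returns 0 or -1. */
static int write_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n <= 0)
            return -1;
        data += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one monitor command followed by a newline; returns 0 or -1. */
int send_monitor_command(int fd, const char *cmd)
{
    if (write_all(fd, cmd, strlen(cmd)) < 0)
        return -1;
    return write_all(fd, "\n", 1);
}
```

An init script helper would call connect_unix() on the monitor socket, then send_monitor_command(fd, "system_powerdown"), and finally wait for the qemu process to exit.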

So if you’re interested in having a vm init script that works with Gentoo without having to deal with stuff like libvirt and so on, then you should probably find a way to coordinate all together and get a socat replacement done.

How _not_ to fix glibc 2.10 function collisions

Following my previous how not to fix post I’d like to explain also how not to fix the other kind of glibc 2.10 failure: function collisions.

With the new version of glibc, support for the latest POSIX C library specification is added; this means that, among other things, a few functions that previously were only available as GNU extensions are now standardised and, thus, visible by default unless requesting a strictly older POSIX version.

The functions that were previously available as GNU extensions were usually hidden unless the _GNU_SOURCE feature selection macro was defined; in autotools, that meant using the AC_SYSTEM_EXTENSIONS macro to request them explicitly. Now these are visible by default; this wouldn’t be a problem if some packages hadn’t decided either to reimplement them, or to give their own, unrelated functions those very same names.

Most commonly the colliding function is getline(); the reason is that the name is really too generic, and I would probably curse the POSIX committees for accepting it with the same name in the C library; I already cursed GNU for adding it with that name to glibc. There are tons of functions called getline(), with the most diverse interfaces, that try to get lines from any kind of media. The solution for these is to rename them so that the collision is avoided.
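
As an illustration, here is the kind of project-local helper that typically collides. The function and its pkg_ prefix are hypothetical, not taken from any specific package: its interface has nothing to do with the POSIX getline(char **, size_t *, FILE *), so once <stdio.h> declares the latter by default the old name no longer compiles, and giving the helper a prefixed name is enough to fix the build.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical project helper that used to be called getline():
 * read one line into a caller-provided buffer and strip the newline.
 * Returns the line length, or -1 on end of file or error.
 */
int pkg_getline(char *buf, int size, FILE *fp)
{
    size_t len;

    if (fgets(buf, size, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';

    return (int)len;
}
```

Since the interface differs from the standard one, there is nothing to share with the C library here; only the name has to change.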

More interesting, instead, is the software that, wanting to use something like strndup(), decides to create its own version, because some systems lack that function. In this case, renaming the functions, like I’ve seen one user propose today, is crazy. The system already provides the function; use that!

This can be done quite easily with autotools-based packages (and can be applied to other build systems, like cmake, that work on the basis of understanding what the system provides):

# in configure.ac
AC_SYSTEM_EXTENSIONS
AC_CHECK_FUNCS([strndup])

/* in a private header */
#include "config.h"
#include <stddef.h> /* for size_t */

#ifndef HAVE_STRNDUP
char *strndup(const char *str, size_t len);
#endif

/* in an implementation file */
#ifndef HAVE_STRNDUP
char *strndup(const char *str, size_t len)
{
  [...]
}
#endif

When building on any glibc (2.7+ at least, I’d say), this code will use the system-provided function, without adding further duplicate, useless code; when building on systems where the function is not (yet) available, like FreeBSD 7, then the custom functions will be used.
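
For completeness, the elided function body could look like the following sketch. It is named fallback_strndup here only so that it can be compiled and tried next to a C library that already has strndup(); inside the package it would carry the real name and sit within the #ifndef HAVE_STRNDUP guard.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a replacement for strndup(): duplicate at most `len` bytes of
 * `str` into freshly allocated, NUL-terminated memory.
 * Returns NULL if the allocation fails.
 */
char *fallback_strndup(const char *str, size_t len)
{
    size_t n = 0;
    char *copy;

    /* Find the length to copy without reading past `len` bytes. */
    while (n < len && str[n] != '\0')
        n++;

    copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, str, n);
    copy[n] = '\0';
    return copy;
}
```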

Of course it takes slightly more time than renaming the function, but we’re here to fix stuff in the right way, aren’t we?

The end of the mono-debugger saga

So after I started inspecting and found the problem last night, I finally have a tentative patch that makes mdb work fine.

Indeed, I simply implemented some extended debuglink file support into the Bfd managed wrapper, which finds sections and symbols in the debuglinked file whenever they are not found in the original file. This solves my problem, although it might not be complete yet, since I have written it in 20 minutes. I’ve attached the version for trunk on my bug report and I’ll add my backport to 2.4.2 to my overlay today. After a bit of testing, I hope to get it in main tree too.

Speaking of testing, the mono-debugger ebuild had a test restriction, with no bug referenced; I’m quite sure that the tests that do fail are the ones that should have told us that mono-debugger wouldn’t have worked on the default Gentoo install at all. I’ll probably have to add some logic to warn the user about split-debug setups (please note that our default of stripping files of debug information does not strip the symbol table of libpthread.so, otherwise gdb wouldn’t work at all either; that default lets mdb work fine too, so it’s only a problem with split-debug).

After the debugger finally started to work, I also found another problem: mono itself does not seem to load libraries requested by DllImport through the standard dlopen() interface, but it looks for them in particular directories; which don’t include all the possible directories at all. This became a problem because the current default version of libedit in Gentoo does not have a soname, and it caused mono to find a libedit.so that was not a library at all (but rather an ldscript). But that’s a problem for another day, and my solution is just to use a newer libedit version that works fine.

Now I’ll go back to my tinderbox, and in the next few days you’ll probably see a few more posts about different topics than Mono… even though I have a few patches to post there as well.

How to improve releases quality: working on PulseAudio 0.9.16

Just when I said that I was resuming my work as a PulseAudio maintainer in Gentoo, Lennart released a 0.9.16-test1 tarball. This was my cue to enter the scene upstream: the first attempt at packaging this in Gentoo failed, for a series of different reasons, some of which are internal (we don’t have the latest version of udev available yet; I hope we will by the time PulseAudio 0.9.16 final is released), but most are due to upstream changes that didn’t take into consideration some corner cases that Gentoo, as usual, gets to deal with.

So you won’t see the test1 (rc1) ebuild in the tree at all, you’ll probably have to wait for test2, and even that will require some work. For now I’ve fixed all the build- and run-time issues I’ve seen in the released tarball and git repository; plus I’ve been able to get it to properly build fully on both (Gentoo/)FreeBSD and OpenSolaris (with Prefix). I haven’t been able to experiment with actually having it playing yet, but it’ll come there at one point.

Unfortunately there are still a few shady details that I or someone else has to take care of. For instance, the tests still fail consistently: last time I tried them I got two failures on Yamato. One is related to IPv6 being enabled in the PulseAudio build but not in the kernel, resulting in the IP ACL test asserting out (I’ve now fixed it, by warning about the case and not treating it as a failure); the other is the mixing test, which fails for everybody because it doesn’t know anything about the 24-bit and 24-bit-in-32-bit sample types; I extended it to support 24-bit, but was unable to do anything about 24-bit-in-32-bit because I couldn’t grok it properly.

On non-Linux operating systems (FreeBSD and OpenSolaris), I had to work on a few more issues, like implicit declarations (there still is one in OpenSolaris) and shadowed names, and of course there is some slight porting to be done, which I have nowhere near finished yet: the shm (Shared Memory) support in FreeBSD is imperfect, and for neither operating system have I implemented the “get process name” function.

Okay, I’m not able to provide a 100% port to all the operating systems out there, but I still think I can do a bit to help out by making sure that PulseAudio won’t need to be extensively patched by all the porters out there. And until Lennart actually gets around to merging my patches, you can find all of them on Gitorious so you can test them.

Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015, which then shut down the service. This means you can’t access those patches anymore.

Back to origins

Thanks to the luck with QEmu this week I finally got a Gentoo/FreeBSD VM working again, so I can actually resume working on the one thing I joined Gentoo for, initially. The nice thing about this is that the project is, in itself, mostly an experimentation, which means it’s quite easygoing. But it also has some very interesting and useful results.

Every time people ask me why I think Gentoo/FreeBSD is useful for anything, I point out that by not using ports, we’re ensuring that the software builds out of the original sources, and if it doesn’t, we can provide the patches upstream, since we have to write them in a way that is compatible with other systems anyway. This hasn’t changed in the slightest in the two years I didn’t work on the project: ports maintainers still don’t seem to provide upstream with patches, and lots of software is quite broken.

Indeed, in the last couple of days I identified quite a few issues both in and outside of Gentoo: bsdtar from libarchive failed to work with the latest Portage version (thanks to Tim who provided me with a patch within the hour!), pambase was putting nologin in the wrong chain (fixed, and a new release pushed out), sandbox does not compile (still broken, still needs to be investigated), PulseAudio was totally borked upstream (I have now made it build but it still fails tests; I need to fix and port some areas, and if I had the time there is also the OSS driver to fix), libSM has a dependency on libuuid (which collides in FreeBSD, where the system already provides a different, incompatible interface; I submitted a patch to use the FreeBSD uuid interface when available), and more.

I cannot blame the Gentoo/FreeBSD team for this, because, well, it’s just Alexis right now I guess; I’m getting my hands dirty and making sure I can get the thing to work as it’s supposed to, and this is the important part, I guess. On the other hand, I wonder why it is that FreeBSD developers don’t seem to care about these kinds of problems at all. PulseAudio might not have the best OSS support, but that’s just because Lennart obviously doesn’t care about it (Fedora now also disabled it by default, good for them!), but if somebody were to actually mind PulseAudio (more than I can do, since I don’t have audio in my VM anyway), I don’t think it would be impossible for it to provide proper support for the FreeBSD OSS options.

At any rate, I guess I’m now back to my original plans as well, at least part time, hopefully it won’t be too bad on the long run. Going to try GCC 4.4 with the system packages, and the kernel, later on today. Or rather I’ll leave it to test the build since I’m actually supposed to be out of here to a friend’s house for some photo shoots (long story…).

Oh, by the way, if you haven’t noticed, I’m still making some changes to the blog; in particular, the tags and categories pages now show decent titles: I have made some changes to Typo that allow me to set the titles in a more human-readable way. If I can find time I’ll also be cleaning up the tags, since I have lots of tags with one post each, and there are some synonyms that I should really get rid of. To do the latter, though, I’m going to write a script that can merge the tags’ contents and then set up redirection, since I very much dislike breaking the links in my blog, as you may know already.

Oh well!

Virtualization updates

It seems that, one way or another, a common “column” on my blog is reserved for virtualisation issues. I blogged a lot about VirtualBox (before finally dissing it), and then I moved on to KVM and QEmu.

Last time I blogged about it, I was still unable to get NetBSD to detect any network card with KVM, while I had OpenSolaris, FreeBSD and Ubuntu working fine. I also had some problems with Gentoo/FreeBSD and the KVM video emulation. But since then, stuff changed, in particular, QEmu now supports KVM technologies natively (and it’s not yet updated to the latest version). Let’s see if this changed something.

Thanks to aperez I now know how to get NetBSD to identify the network card: disabling ACPI. Unfortunately disabling ACPI with KVM freezes the boot. I also want to use VDE for networking, since I already have Yamato configured as a router and file server for the whole network, but that seems to fail when using NetBSD with QEmu: while dhcpd receives the requests, the replies never reach NetBSD, and I’m stuck for now. I’m going to try again with the newer QEmu version. Also, out of all the cards I tried in QEmu, the Intel E1000 fails because it cannot find the EEPROM.

The Gentoo/FreeBSD video problem that stopped me from using vim during the configuration phase on the minimal CD does not happen when using QEmu; on the other hand since the SDL output is tremendously slow, I’m using the VNC support; quite nice if it wasn’t that Vinagre does not seem to support VNC over Unix sockets, which would make the whole configuration much nicer, without consuming precious network ports. I have to see if I just missed something, and if I didn’t, I should either request for it to be added, or write the support myself (even better). I guess that the underlying code supports the Unix socket since I expect the virt-manager to use that to communicate with the VM.

Speaking of which, I haven’t looked at virt-manager or anything similar in quite a while; I should see if they still insist on not giving me the choice of just using VDE for networking instead of dnsmasq and the like; for now the whole configuration is done manually with a series of aliases in my ~/.shrc file, with (manually) sequential MAC addresses hardcoded, as well as VNC ports, LVM volumes (used for the virtual disks, which seem to be quite a bit faster than using a file over VFS), and hostnames (in /etc/hosts, except for Ubuntu, which has Avahi working).

I have to admit, though, that I have some doubts about the performance of QEmu/KVM versus the usual KVM; at the very least it’s taking quite a long time to unpack the tarball with the stage3 of Gentoo/FreeBSD 7.1. I hope I/O is the bottleneck here.

Speaking of I/O as a bottleneck, I was finally able to get a gigabit switch for the office; the next step is to buy many metres of cable so I can actually wire up my bedroom with the office, passing through a few other rooms of the house, so that I can have a fast enough network for all the computers in their standard setup (and use wireless only when strictly needed). I do have some doubts about this, though, since I really want to move out.

In the meantime, Enterprise is soon going to be re-used as a backup box; I just need to find an easy way to send a WOL packet, wait for the box to come up, back up everything, and shut down, once a week. I have the last unused 500GB disk in that box so it should be easy. But I’d like to have an mtree of the data that has been backed up, which I’m still unsure how to get.
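
For the first step, a Wake-on-LAN “magic packet” is simple enough to build by hand: six 0xFF bytes followed by the target’s MAC address repeated sixteen times, usually sent as a UDP broadcast. The following is only a sketch under those assumptions; the function names are invented, port 9 is the conventional choice rather than a requirement, and the MAC and broadcast address have to come from the actual network.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define WOL_PACKET_LEN 102 /* 6 sync bytes + 16 copies of a 6-byte MAC */

/* Fill `packet` with the magic packet for the given MAC address. */
void build_magic_packet(const unsigned char mac[6],
                        unsigned char packet[WOL_PACKET_LEN])
{
    int i;

    memset(packet, 0xFF, 6);
    for (i = 0; i < 16; i++)
        memcpy(packet + 6 + i * 6, mac, 6);
}

/* Broadcast the magic packet over UDP; returns 0 on success, -1 on error. */
int send_magic_packet(const unsigned char mac[6], const char *broadcast_ip)
{
    unsigned char packet[WOL_PACKET_LEN];
    struct sockaddr_in dest;
    ssize_t sent;
    int on = 1;
    int fd;

    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(9); /* conventional "discard" port */
    if (inet_pton(AF_INET, broadcast_ip, &dest.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Sending to a broadcast address requires SO_BROADCAST. */
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    build_magic_packet(mac, packet);
    sent = sendto(fd, packet, sizeof(packet), 0,
                  (struct sockaddr *)&dest, sizeof(dest));
    close(fd);

    return sent == (ssize_t)sizeof(packet) ? 0 : -1;
}
```

The weekly job would send the packet, poll until the box answers, run the backup and then ask the box to shut down.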

Shared Object Version

Picking up where I left with my post about ABI I’d like to provide some insights about the “soversion”, or shared object version, that is part of the “soname” (the canonical name of a shared object/library), and its relationship with the -version-info option in libtool.

First of all, the “soname” is the string listed in the DT_SONAME tag of the .dynamic section of an ELF shared object. It represents the canonical name the library should be called with, and it’s used to create the DT_NEEDED entries for the shared objects and dynamic executables depending on it, as well as the canonical name used when opening the library through dlopen() (without the full path).

Usually, the soname is composed of the library’s basename (libfoo.so) followed by a reduced shared object version, but the extent to which it is reduced (or not) depends on the standard rules for the operating system and a few other notes. What I’m going to talk about today is that last part, the shared object version, which is probably the most important part of the soname.

First of all, the “soversion” corresponds to neither the package version nor the -version-info parameter (although it is calculated starting from the latter); using either directly would be a big mistake, unless you expect to be able to keep a perfect ABI based on your package versioning, in which case you might want to try using the package version, but that’s quite a difficult thing to do.

The part of this version that is embedded in the soname is the version of the ABI, and has to change when the ABI is changed following the rules I showed previously. If this was kept the same between versions and the ABI was broken, software would subtly crash because of the changes in the ABI. By changing the ABI version, and thus the soname, you make the loader refuse to start the program with a different library than the one it was developed for; of course it does not make the software magically work, but it will at least stop it from crashing further along the road.

By default on Linux and Solaris, at least with libtool, a single component is used in the soname as the ABI version. Projects that follow this rule manually and set their soversion to the same value as their package version would be providing a single ABI for each major version of their software; I have rarely seen anything like that work out well. Ruby uses a mix of this, by defining two components as the soversion, so that eventually you could have libruby.so.1.8 and libruby.so.1.9 (on the other hand, we rename them to libruby18 and libruby19 so that they don’t collide for other reasons, but that’s beside the point). This works as long as they don’t have to change, for any reason, the ABI of a minor release of Ruby; when that happens, something will certainly break.

The -version-info of libtool is explicitly distinct from the package version, as well as from the actual soversion, and is used to provide consistent library versioning among releases through three components: current, age and revision; they represent the information in the form of the API/ABI supported and dropped. Understanding the separation takes quite some time, but it can be summarised in three simple steps:

  • if you don’t change the interface at all just increase the “interface revision” value;
  • if you make backward-compatible changes (like adding interfaces), increase the “current interface” value and the “older interface age” value, reset “interface revision” to zero;
  • if you make backward-incompatible changes, breaking ABI (removing interfaces for instance), increase the “current interface” value and reset both “older interface age” and “interface revision” to zero.

Depending on the operating system, this will create a soname change either on backward-incompatible changes (Linux, Solaris and Gentoo/FreeBSD), or with any type of change to the interface (vanilla FreeBSD).
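
The three rules above, together with the way libtool derives the soname’s single component on Linux (current minus age), can be sketched in a few lines. The type and function names below are made up for illustration; on the libtool command line the value is written as current:revision:age.

```c
#include <stddef.h>
#include <stdio.h>

/* The three -version-info components. */
struct version_info {
    unsigned current;
    unsigned revision;
    unsigned age;
};

enum interface_change {
    CHANGE_NONE,        /* implementation changed, interface untouched */
    CHANGE_COMPATIBLE,  /* interfaces added, none removed or changed */
    CHANGE_INCOMPATIBLE /* interfaces removed or changed */
};

/* Apply the update rule for one release and return the new values. */
struct version_info next_version_info(struct version_info v,
                                      enum interface_change change)
{
    switch (change) {
    case CHANGE_NONE:
        v.revision++;
        break;
    case CHANGE_COMPATIBLE:
        v.current++;
        v.age++;
        v.revision = 0;
        break;
    case CHANGE_INCOMPATIBLE:
        v.current++;
        v.age = 0;
        v.revision = 0;
        break;
    }
    return v;
}

/* On Linux, libtool uses current - age as the soname's version component. */
unsigned linux_so_major(struct version_info v)
{
    return v.current - v.age;
}

/* Format the Linux soname, e.g. "libfoo.so.2"; returns snprintf()'s result. */
int linux_soname(char *buf, size_t size, const char *basename,
                 struct version_info v)
{
    return snprintf(buf, size, "%s.so.%u", basename, linux_so_major(v));
}
```

Starting from 3:2:1, a compatible change gives 4:0:2 and the soname component stays at 2; a following incompatible change gives 5:0:0 and the component jumps to 5.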

Again, the idea is that each time you might have a backward-incompatible change you get a different soname, so that the loader can’t mix and match different interfaces. When you don’t guarantee any ABI stability between versions, usually for internal libraries, like GNU binutils does for libbfd, you just put the package version in the library’s basename rather than in the soversion, and set the soversion to all zeros, so you get stuff like libbfd-2.20.so.0.0.0. This way you’re sure that, whether you change interfaces or not, an upgrade of your package won’t break others’ software. Of course it should also be enough for people to understand that the library should not be used at all, since it’s not guaranteed to be stable.

The next step will be to describe the symbol versioning technique, which reduces the number of backward-incompatible changes by keeping the same ABI available until it really has to go.

Debian, Gentoo, FreeBSD, GNU/kFreeBSD

To shed some light and clear up the confusion that seems to have taken hold of quite a few people who came to ask me what I think about Debian adding GNU/kFreeBSD to the main archive, I’d like to point out, once again, that Gentoo/FreeBSD has never been the same class of project as Debian’s GNU/kFreeBSD port. Interestingly enough, I already said this more than three years ago.

Debian’s GNU/kFreeBSD uses the FreeBSD kernel but keeps the GNU userland, which means the GNU C Library (glibc), the GNU utilities (coreutils) and so on; on the other hand, Gentoo/FreeBSD uses both the vanilla FreeBSD kernel and a mostly vanilla userland. By “mostly” I mean that some parts of the standard FreeBSD userland are replaced with either compatible, selectable or updated packages. For instance, instead of shipping sendmail or the ISC dhcp packages as part of the base system, Gentoo/FreeBSD leaves them to be installed as extra packages, just like you’d do with Gentoo. And you can choose whichever cron software you’d like instead of using the single default provided by the system.

But if a piece of software is designed to build on FreeBSD, it usually builds just as well on Gentoo/FreeBSD; there is rarely any trouble, and most of the time the trouble is with different GCC versions. On the other hand, GNU/kFreeBSD requires most of the system-dependent code to be ported; xine has already undergone this at least a couple of times, for instance.

I am sincerely glad to see that Debian finally came to the point of accepting GNU/kFreeBSD into main; on the other hand, I have no big interest in it other than as a proof of concept; there are things that are not currently supported by glibc even on Linux, like SCTP, which on FreeBSD are provided by the standard C library; I’m not sure if they are going to port the Linux SCTP library to kFreeBSD or if they decided to implement the interface inside glibc. If the latter is the case, though, I’d be glad because it would finally mean that the code wouldn’t be left as stale.

So please, don’t mix in Gentoo/FreeBSD with Debian’s GNU/kFreeBSD. And don’t even try to call it Gentoo GNU/FreeBSD like the Wikipedia people tried to do.

Virtual machine, real problems

Since I bought Yamato I have been trying my best to make use of the AMD-V support in the Opterons; this included continuing the fight with VirtualBox to get it to work properly with Solaris and Fedora, then trying RedHat’s virt-manager and now, after the failure the other day, QEmu 0.10 (at Luca’s insistence).

The summary of my current opinions is something like the following:

  • VirtualBox is a nice concept but the limitations in the “Open Source Edition” are quite nasty, plus it has huge problems (at least, in the OSE version) with networking under Solaris (which is mind boggling for me since both products are developed by Sun), making it unusable for almost anything in my book; replacing the previously used Linux tun/tap code with its own network modules wasn’t very nice because it reminded me of VMware, and it didn’t solve much in my case;
  • RedHat’s virt-manager is a nice idea but it has limitations that are (quite understandably from one point of view) tied with the hardcoding of RedHat style systems; I say quite understandably because I’m not even dreaming to ask RedHat to support other operating systems before they feel their code is absolutely prime-time ready; on the other hand it would be nice if there was better support for Gentoo, maybe with an active branch for it;
  • I still don’t like the kqemu approach, so I’m hoping for support for Linux’s own KVM interface in the next kernel release (2.6.29), which should be feasible; on the other hand, setting up QEmu (or kvm) manually is quite a task the first time.

So while I’m using VirtualBox to virtualise a Windows XP install (which, alas, I have to use for some work tasks and to offer Windows support to my “customers”) I decided to try QEmu for a FreeBSD (vanilla) virtual system; I needed a vanilla FreeBSD to try a couple of things out, so that was a good choice to start. I was actually impressed by the sheer speed of FreeBSD install in the virtual system even without kqemu or KVM, it indeed took less than on my old test systems. I don’t know if the I/O difference between QEmu and VirtualBox was because VirtualBox uses more complex virtual disk images (with recovery data I expect), or because I set QEmu to work straight on a block device (lvm logical volume); I had, though, to curse a bit to get networking working.

An aside on networking: since what I wanted was to be able to interface the virtual machines with the rest of my network transparently, I decided to give net-misc/vde a try; unfortunately getting the thing working with that has been more troublesome than expected. For one, if you don’t set up the TAP device explicitly with OpenRC, vde will try to do so for you, but on my system it put udev in a state that continuously brought the interface up and down, which was quite unusable. Secondly, I had some problems with dhcpd: even if I set the DHCPD_IFACE variable in /etc/conf.d/dhcpd, the init script does not produce proper service dependencies, so I have to set RC_NEED explicitly. In both cases the answer would be “dynamic” dependencies for the scripts, calculating the needed network services based on the settings in the conf.d files. I guess I should open bugs for those.

Once I finally had the networking support working properly, I set up SSH, connected and started the usual task of basic customisation. The first step for me is always to get zsh as my shell. Not bash, because I don’t like bash as a project, and I find zsh more useful too. But once it started building m4, and in particular testing strstr() for time linearity, the virtual machine froze solid; qemu started taking 100% CPU constantly, and even after half an hour it never moved from there. I aborted the VM and tried again, hoping it was just a glitch, but it’s perfectly reproducible. I don’t know what the problem is with that.

So I decided to try installing Solaris: I created a new logical volume, started qemu up again and… it froze solid during boot from the 2008.11 DVD.

In truth, I’m disappointed because the FreeBSD install really looked promising: fast, nice, not overloading more than a single core (I have eight, I can spare one or two for constantly-running VMs), it also worked fine when running as unprivileged user (my user) after giving it access to the kqemu device and the block device with the virtual disk; it didn’t work as nice with the tun/tap support in qemu itself in this setup since it required root to access the tap device, but at least with vde it reduced the amount of code running unprivileged.

On the other hand, since the KVM and QEmu configuration is basically identical (besides the fact that they emulate different network cards), I just tried kvm again, using the manual configuration I used for QEmu and vde for networking (networking configuration was what made me hope to use virt-manager last time, but now I know I don’t need it); it does seem faster, and it also passed the strstr() test that had frozen the VM before. So I guess the winner this round is KVM, and I’ll wait for the next Linux release to test the QEmu+Linux KVM support.

Post Scriptum: even KVM is unable to start the OpenSolaris LiveDVD though, so I wonder if it’s a problem with Solaris itself; I’d like to try the free as in soda version of Solaris 10, but the “Sun Download Manager” does not seem to work with IcedTea6 and downloading that 4GB image with Firefox is masochistic.