The virt-manager pain

So I’m still trying to come up with a decent way to parse those logs. Actually I lost a whole batch of them because I forgot to rename them before processing, but that made me realize that the best thing I can do is process the logs outside of the tinderbox itself. But this is a topic for another time.

Another thing I’m trying to set up is a virtual machine to test the new x32 ABI that Mike has made available in Gentoo. This is important for the tinderbox as well as for FATE (which is already being run for a standard Gentoo Hardened setup on that very same hardware). Unfortunately I can’t use this via LXC yet, simply because the kernel currently running does not support x32 executables.

This means that I have to use KVM and go with full virtualisation. Thanks to Tim, I’ve got a modified SysRescueCD ISO that should let me take care of the install. Unfortunately, this is still not easy to deal with, for a number of reasons.

The first is that virt-manager is just slow and painful, like some kind of slow and painful death. The whole idea of using a frontend that connects through SSH is that you don’t want to “feel” the lag… but virt-manager makes you feel the lag even more than a command-line SSH connection. I’m under the impression that the guys who work on this kind of stuff only ever tried it on a local connection, and never from the other side of the world. I mean, I understand you might have concurrency issues, but do you really have to make me wait two minutes to switch from CPU settings to Memory settings to Disk settings when editing the VM?

The second issue is that even though I was able to set up a testing VM for x32… qemu doesn’t like the additional instruction sets (SIMD) that Bulldozer comes with; something within the C library causes every x32 binary to be killed with SIGILL (Illegal Instruction). The problem is likely in some of the indirect-binding (IFUNC) functions being used; my guess, based on the NEWS file of the glibc 2.15 release, is strcasecmp(), which has been optimised through the use of SSE4.2 and AVX (both of which are available on that server). I have a ¾-written, half-drawn post about this kind of optimisation in my queue; I’ll see if I can post it over the weekend.
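Just to give an idea of the kind of check involved, here is a rough sketch (the exact flag names and the availability of -cpu host depend on the qemu/qemu-kvm version, and the disk path is made up): compare the SIMD flags the guest sees with the host’s, and try passing the host CPU straight through.

```sh
# Run on the host and then inside the guest: which SIMD extensions are advertised?
grep -o -w -e sse4_2 -e avx /proc/cpuinfo | sort -u

# Hypothetical invocation: expose the host (Bulldozer) CPU to the guest instead of
# the default model; -cpu host only works when KVM acceleration is in use.
qemu-kvm -cpu host -m 2048 -drive file=/dev/vg/x32test,if=virtio
```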

The end result is that I spent the better part of three hours on virt-manager before accepting that the way to go is to update the host’s kernel and just run the usual container. Just so you know, the final step that “creates” the VM (which is not the LVM allocation step!) took over half an hour. This is silly; what the heck was it doing during that time?

Oh and yes, two years afterwards virt-manager still keeps defunct ssh processes around because it never reaps them (check the comments).

Right now I’m trying to get this to work with LXC, but I’m not having much luck with it either; and yes, I did update the init script to handle x32 containers correctly, which didn’t work before… it might have some problems if you’re going to use this on SPARC, because I’m not handling those properly yet, but that is (again) a topic for another time.

I know you missed them: virtualisation ranting goes on!

While I was writing init scripts for qemu, I was prodded again by Luca to look at libvirt instead of reinventing the wheel. You probably remember me ranting about the whole libvirt and virt-manager suite quite some time ago, as it really wasn’t my cup of tea. But then I gave it another try.

*On a very puny note: what’s up with the lib- prefix? libvirt, libguestfs… they don’t look even remotely like libraries to me. Sure, there is a libvirt library in libvirt, but then shouldn’t the daemon simply be called virtd?*

The first problem I found is that the ebuild still tries to force dnsmasq and iptables on me if the network USE flag is enabled; it turns out that neither is mandatory, so I’ll have to ask Doug to either drop them or add another USE flag for them, since I’m sure they are a pain in the ass for other people besides me. I know quite a few people have ranted about dnsmasq in particular.

Having sidestepped that problem, I first tried, again, to use the virt-manager graphical interface to build a new VM. My target this time was to try re-installing OpenSUSE, this time using the virtio disk interface.

A word of note about qemu vs. qemu-kvm: at first I was definitely upset by the fact that the two cannot be present on the same system; this is particularly nasty considering that it takes a little longer for the qemu-kvm code to be bumped when a new qemu is released. On the other hand, after finding out that, yeah, qemu allows you to use virtio for disk devices, but no, it doesn’t allow you to boot from them, I decided that upstream is simply going crazy. Reimar, maybe you should send your patches directly to qemu-kvm; they would probably be considered there, I guess.
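To put the qemu vs. qemu-kvm difference in command-line terms, this is roughly the situation as I understand it (a sketch with a made-up LVM path; the exact option spelling varies between versions):

```sh
# Plain qemu: the disk can be attached over virtio, but the BIOS cannot boot from it.
qemu -m 1024 -drive file=/dev/vg/opensuse,if=virtio

# qemu-kvm ships an extra boot ROM (extboot) and accepts boot=on, which is what
# actually makes booting from a virtio disk possible.
qemu-kvm -m 1024 -drive file=/dev/vg/opensuse,if=virtio,boot=on
```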

The result of the wizard was definitely not good; the main problem was that the selection for the already-present hard disk image silently failed, so I had to input the LVM path myself, which at the time felt like a minor problem (although another strange thing was that it could see just one out of the two volume groups I have in the system); but the result was … definitely not what I was looking for.

The first problem was that the selection dialog I thought was not working was working all right… just on the wrong field, so it replaced the path to the ISO image to use for installing with that of the disk again (which, as you might guess, does not work that well). The second problem was that even though I explicitly said I wanted a Linux version with support for virtio devices, it didn’t configure the VM to use virtio at all.

Okay, time to edit the configuration file by hand; I could still use virt-manager as a replacement for vinagre to access the VNC connections (over a Unix path instead of TCP/IP to localhost), so that would be enough for me. Unfortunately, the configuration file declares itself to be XML. If you know me, you know I’m not one of those guys who run away screaming as soon as XML is involved; even though I dislike it as a configuration format, it probably makes quite a bit of sense in this case: I found out myself, while trying to make the init script above usable, that the configuration for qemu is quite complex. The big bad turn-off for me is that *it’s not XML, it’s aXML (almost XML)*!

With the name aXML I refer to all those uses of XML that barely use the syntax and none of the features. In this particular case, the whole configuration file, while documented for humans, lacks an XML declaration as well as any kind of doctype or namespace that would tell a piece of software like, say, nxml what the heck it is dealing with. And more to the point, I could find no Relax-NG or other kind of schema for the configuration file; with one of those, I could turn Emacs into a powerful configuration file editor: it would know how to validate the syntax and allow completion of elements. Lacking one, it’s quite a task for a human to deal with.

Just to make things harder, the configuration file, which, I understand, has to represent the very complex parameters that the qemu command line accepts, is not really simplified at all. For instance, if you configure a disk, you have to choose its type between block and file (which is normal operation even for things like iSCSI); unfortunately, to configure the path where the device or file is found, you don’t simply get a <source>somepath</source> element: you need to provide <source dev="/path" /> or <source file="/path" />. Yes, you have to change the attribute name depending on the type you have chosen! And no, virsh does not help you by telling you that you had an invalid attribute or left one empty; you have to guess by looking at the logs. It doesn’t even tell you that the path to the ISO image you gave is wrong.
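To make the point concrete, here is a minimal sketch of the two disk stanzas involved (paths and target names are made up; the point is only that the type attribute has to agree with the attribute name used on <source>):

```xml
<!-- LVM volume passed in as a block device: type="block" pairs with dev= -->
<disk type="block" device="disk">
  <source dev="/dev/vg_guests/opensuse"/>
  <target dev="vda" bus="virtio"/>
</disk>

<!-- Installation ISO as a plain file: type="file" pairs with file= -->
<disk type="file" device="cdrom">
  <source file="/srv/iso/install.iso"/>
  <target dev="hdc" bus="ide"/>
</disk>
```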

But okay, after reconfiguring the XML file so that the path is correct, that the network card and disks use virtio, and all that stuff, as soon as you start the machine you can see a nice -no-kvm in the qemu command line. What’s that? Simple: virt-manager didn’t notice that my qemu is really qemu-kvm. Change the configuration to use KVM and, surprise surprise, libvirtd crashes! Okay, to be fair, it’s qemu that crashes first and libvirtd follows it, but the whole point is that if qemu is the hypervisor, libvirtd should be the supervisor, and should not crash if the hypervisor it launched doesn’t seem to work.
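For reference, the switch in question is just the type attribute on the root element of the domain definition; a sketch (domain name made up, everything else elided):

```xml
<!-- What got generated: plain emulation, hence the -no-kvm on the command line -->
<domain type="qemu">
  <name>opensuse-test</name>
  <!-- devices, memory, etc. -->
</domain>

<!-- What a qemu-kvm host should be using instead -->
<domain type="kvm">
  <name>opensuse-test</name>
  <!-- devices, memory, etc. -->
</domain>
```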

And it gets even funnier: if I launch the same qemu command as root, it starts up properly; without network, but properly. Way to go, libvirt; way to go. Sigh.

Free virtualisation – not working yet

If you’ve been following my blog for a while you probably remember how much I fought with VirtualBox once it was released to get it to work, so that I could use OpenSolaris. Nowadays, even with some quirks, VirtualBox Open Source Edition is fairly usable, and I’m using it not only for OpenSolaris but also for a Fedora system (which I use for comparing issues with Gentoo), a Windows install (that I use for my job), and a Slackware install that I use for kernel hacking.

Obviously, the problem is that the free version of VirtualBox comes with some disadvantages, like not being able to forward USB devices, having a limited set of hardware to virtualise, and so on. This is not much of a problem for my use, but of course it would have been nicer if they had just open-sourced the whole lot. I guess the most obnoxious problem with VirtualBox (OSE at least, not sure about the proprietary version) is the inability to use a true block device as a virtual disk; instead you have to deal with its custom image format, which is really slow at times and needs to pass through the VFS.

For these reasons Luca has suggested many times that I try out kvm instead, but I have to say that one nice thing about VirtualBox is its quite easy-to-use interface, which allows me to set up new virtual machines in just a few clicks. And since nowadays it also supports VT-x and similar extensions, it’s not so bad at all.

But anyway, I wanted to try kvm, and tonight I finally decided to install it, together with the virt-manager frontend; although there are lots of hopes for this, it’s not yet good enough, and it really isn’t usable for me at all. I guess I might actually get to hack at it, but maybe it’s still a bit too soon for that.

One thing I really dislike about the newer versions of VirtualBox is that it went the VMware route and decided to create its own kernel module to handle networking. They say it’s for performance reasons, but I’d expect the main reason is that it allows them to have a single interface between Linux, Windows, Solaris and so on. The TUN/TAP interface is Linux-specific, so supporting that together with whatever they have been doing on the other operating systems is likely a cost for them. Although I can understand their reasoning, that doesn’t mean I like it at all. More kernel code means more code that can crash your system, especially when not using the Open Source Edition.

Interestingly enough, RedHat’s Virtual Machine Manager is instead doing its best to avoid creating new network-management software, and uses very common pieces of software: dnsmasq as DHCP server, the Linux kernel’s bridging support, and so on. This is very good, but it also poses a few problems. First of all, my system already runs a DHCP server; why do I have to install another? But it’s not just that: instead of creating userspace network-bridging software, like VirtualBox does, it tries to use what the kernel already provides, in the form of the bridging support and iptables to forward requests and create NAT zones. This is all fine and dandy, as it reduces feature duplication in software, but it obviously requires more options to be enabled in kernels that might not otherwise have them enabled at all.
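For the record, what the default libvirt network boils down to is roughly the following (a hedged sketch: 192.168.122.0/24 and virbr0 are just the usual defaults, and the rules libvirt actually generates are more elaborate):

```sh
# Kernel bridge the guests get attached to
brctl addbr virbr0
ip addr add 192.168.122.1/24 dev virbr0
ip link set virbr0 up

# dnsmasq hands out leases on that bridge
dnsmasq --interface=virbr0 --dhcp-range=192.168.122.2,192.168.122.254

# iptables provides the NAT towards the rest of the world
iptables -t nat -A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
```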

As it happens, my system does have bridging support enabled, but not iptables nor the masquerading targets and the like. Up to now I have never used them, so they had no reason to be there. I also sincerely can’t understand why it needs them if I don’t want NAT, but it doesn’t seem to allow me to set up the network myself. That would mean being able to just tell it to create a new interface and add it to a bridge I manage myself, leaving details like DHCP to me, and thus not requiring iptables at all.
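What I have in mind instead is nothing more exotic than the following (a sketch assuming a bridge I already manage, the old tunctl tool, and made-up interface, user and disk names):

```sh
# Bridge managed by the host's own network configuration
brctl addbr br0
brctl addif br0 eth0

# Persistent tap device owned by my user, attached to that bridge; DHCP is then
# served by the DHCP server already running on the network, no dnsmasq needed.
tunctl -u myuser -t tap0
brctl addif br0 tap0
ip link set tap0 up

# kvm (or qemu-kvm) just uses the pre-existing tap, no iptables involved
kvm -m 1024 -net nic,model=virtio \
    -net tap,ifname=tap0,script=no,downscript=no \
    -drive file=/dev/vg/guest,if=virtio
```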

Also, even though there is a way to configure an LVM-based storage pool, it does not seem to let me directly choose one of the volumes of that pool when it asks me what to use as a virtual disk. Which is kinda silly, to me.

But these are just minor annoyances; there are much bigger problems. If the modules for kvm (kvm and kvm-amd on my system) are not loaded when virt-manager is started, it obviously lacks a way to choose KVM as the hypervisor. That is fine, but it also fails to see that there is no qemu on the system, and tries to use a default path that is not only specific to RedHat but also simply doesn’t exist here (/usr/lib64/xen/bin/qemu-dm, on x86-64, just checking the uname!), returning an error that that particular command couldn’t be found. At the very least it should probably have said that it couldn’t find qemu, rather than naming that particular path. It would also have been nice to still allow choosing KVM but then report that the device was missing (and suggest loading the modules; VirtualBox does something like that already).
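For anyone hitting the same wall, the checks involved are easy enough to do by hand before starting virt-manager (kvm-amd is for an AMD box like mine; Intel users would load kvm-intel instead):

```sh
# Load the KVM modules so the KVM hypervisor becomes selectable
modprobe kvm
modprobe kvm-amd        # kvm-intel on Intel hardware

# The device node that has to exist for hardware virtualisation to work
ls -l /dev/kvm

# And check that an emulator binary is actually installed and in PATH
command -v qemu-kvm || command -v kvm || command -v qemu
```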

To that you have to add that I haven’t been able to finish the wizard to create a new virtual machine yet. This is because it does not check for permission to create the virtual machine before proposing that you create one, so it lets you spend time filling in the settings in the wizard and then fails, telling you that you don’t have permission to access the read/write socket. Which by default is accessible only by root.
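The workaround I would expect, short of running the whole thing as root, is on the libvirtd side; a sketch, assuming the standard socket path and that the installed libvirtd.conf supports the unix_sock_group option (group and user names are made up):

```sh
# Who is currently allowed to talk to the read/write socket?
ls -l /var/run/libvirt/libvirt-sock

# In /etc/libvirt/libvirtd.conf, something along these lines opens the
# read/write socket to a dedicated group:
#   unix_sock_group = "libvirt"
#   unix_sock_rw_perms = "0770"

# Then add the user to that group and restart the daemon
gpasswd -a myuser libvirt
/etc/init.d/libvirtd restart
```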

Even though it’s obvious from the 0.x version number that the software is not yet production-level, there are also a few notes on the way the interfaces are designed. It’s a good idea to use Python to write the interface, since it allows much faster testing of the code and so on, and speed is not a critical issue here, as the work is carried out by C code in the background; but every error is reported with a Python traceback, one of the most obnoxious things you can show to users. In particular, for the permission problem I just described, the message is a generic error: “Unable to complete install: ‘unable to connect to ‘/var/run/libvirt/libvirt-sock’: Permission denied’”; what is the user supposed to make of that? Of course virtualisation is not something a first-time Linux user is going to play with, but still, even a power user might be scared by errors that appear this way with a traceback attached (which most users would probably, and rationally, read as “the program crashed”, like bug-buddy messages, rather than “something wasn’t as the software expected”, which is what actually happened).

On a very puny level, instead, I have to note that just pressing Enter to proceed to the next page of the wizard fails, and so does using Esc to close the windows. Which is very unlike any other commonly-used software.

So, to cut the post short, what is the outcome of my test? Well, I didn’t get to test the actual runtime performance of KVM against VirtualBox, so I cannot say much about that technology. I can say that there is a long road ahead for the Virtual Machine Manager software to become a frontend half as good as VirtualBox’s or VMware’s. This does not mean that the software is badly written, at all. The design of the software is not bad at all; of course it’s very focused on RedHat and less on other distributions, which explains a lot of the problems I’ve noticed, but in general I think it’s going the right way. A more advanced setup mode for advanced users would certainly be welcome as far as I’m concerned, as well as ISO image management like the one VirtualBox has (even better if you could assign an ISO image to a specific operating system, so that when I choose “Linux”, “Fedora” it would list the various ISOs for Fedora 8, 9 and 10, and then have an “Other” button to pick a different ISO to install, if desired).

I’ll keep waiting to see how it evolves.