Virtualisation WTF once again.

To test some more RTSP clients I’ve been working on getting more virtual machines available on my system. To do so I first extended the available space by connecting one more half-a-terabyte hard drive (removing the DVD burner from Yamato), and then went back to working on a proper init script for KVM/Qemu (as Pavel had already asked me to, providing me with an example).

Speaking of which, if somebody were to send a USB or FireWire DVD burner my way I’d probably be quite happy; while I have three other DVD burners around – iMac, MacBook Pro and Compaq laptop – having one on Yamato turned out to be useful from time to time; not necessary, so wasting a SATA port on it was not really a good idea after all, but still useful.

I started writing a simple script before leaving for my vacation and extended it a bit more yesterday. But in line with the usual virtualisation woes, the results aren’t particularly positive:

  • FreeBSD 8 pre-releases no longer seem to kernel panic when run in qemu (the last beta I tried did, the latest RC available does not); on the other hand they do seem to have problems with the default network card (it works if started after boot but not at boot); it works fine with e1000;
  • NetBSD is still a desperate case: with qemu (and VDE) no network card seems to work; e1000 is not even recognised, while the others end up timing out, silently or not; this is without ACPI enabled: if I do enable ACPI, no network card seems to be detected at all; with KVM, it freezes during boot, with or without ACPI;
  • Pavel already suggested a method using socat and the qemu monitor socket to shut down the VM cleanly; the shutdown request causes the qemu or kvm instance to send the ACPI signal (if configured!), and the guest then shuts down cleanly… the problem is that the method requires socat, which is quite broken (even in the 2-beta branch).

Let me explain what the problem with socat is: its build system tries to identify the size of various POD types that are used by the code; to do so it uses some autoconf trickery and the -Werror switch, and relies on a comparison between pointers to two POD types working when the types have the same size, even if they are different types. Guess what? That’s no longer the case. A warning sign was already present: the code started failing some time ago when -Wall was added to the flags, so the ebuild strips it. Does that tell you something?

I looked into sanitizing the test; from what I can see, the proper solution would be to use run-tests rather than build-tests; but even if that’s possible, it’s quite intrusive and it breaks cross-compilation. So I went to look at why the thing really needed to find the equivalents… and the result is that the code is definitely messy. It’s designed to work on pre-standard systems, and to stay compatible with so many different operating systems that fixing up the build system is going to require quite a bit of code hacking as well.
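The difference between the two kinds of test can be illustrated outside autoconf. A build-test has to infer the answer from whether a snippet compiles, while a run-test simply asks for the sizes and compares them. The sketch below does the latter with Python’s ctypes; it is only an analogy, not socat’s actual configure logic, and the candidate list and the function name `same_size_types` are made up for the example.

```python
import ctypes

# Hypothetical candidate list: the basic unsigned types to compare against.
CANDIDATES = {
    "unsigned short": ctypes.c_ushort,
    "unsigned int": ctypes.c_uint,
    "unsigned long": ctypes.c_ulong,
    "unsigned long long": ctypes.c_ulonglong,
}


def same_size_types(target):
    """Return the names of the candidate types whose size matches target."""
    size = ctypes.sizeof(target)
    return [name for name, ctype in CANDIDATES.items()
            if ctypes.sizeof(ctype) == size]


# Which basic types are as wide as size_t on the machine running this?
print(same_size_types(ctypes.c_size_t))
```

Like any run-test, this reports the sizes of the machine it runs on rather than those of the target, which is exactly why the approach gets in the way of cross-compilation.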

It would be much easier if netcat supported Unix local sockets, but no implementation I have used seems to. My solution to this problem is to replace socat with something else, based on a scripting language such as Perl, so that it’s just as portable and at the same time less prone to problems like those socat is facing now. I asked a few people whether they could write up a replacement; hopefully this will bring us a decent one so we can kill socat.
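As a rough idea of how small such a replacement could be, here is a minimal sketch in Python rather than Perl. It assumes the VM was started with its human monitor on a Unix socket (for example with qemu’s `-monitor unix:PATH,server,nowait`) and that `system_powerdown` is the monitor command used for the ACPI shutdown request; the function name and the timeout are made up for the example.

```python
import socket


def send_monitor_command(socket_path, command, timeout=5.0):
    """Send one command line to a monitor listening on a Unix socket.

    Returns whatever the other side printed back before it closed the
    connection or before the timeout expired.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # The human monitor is line oriented, so terminate with a newline.
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            # The monitor normally keeps the connection open; stop waiting.
            pass
        return b"".join(chunks).decode("ascii", errors="replace")
```

An init script’s stop function could then call something like `send_monitor_command("/var/run/kvm/myvm.monitor", "system_powerdown")`, where the socket path is hypothetical, and wait for the qemu process to exit.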

So if you’re interested in having a VM init script that works with Gentoo without having to deal with stuff like libvirt and so on, then you should probably find a way to coordinate with each other and get a socat replacement done.

18 thoughts on “Virtualisation WTF once again.”

  1. Do we really need such init.d scripts? I’m using libvirt quite successfully, and if you need socat (and possibly other packages as well) you could also install libvirt instead.

  2. Last I checked, libvirt wanted to force its behaviour and its dependencies on me… I could have accepted that until they forced me not to use bridged networking and to rely on dnsmasq and NAT instead, both of which I didn’t want to use. And that’s without starting on the problems tied to the fact that libvirt and virt-manager end up being… The init script by itself is quite easy to deal with; a socat-like script to send requests via Unix sockets would be a very lightweight dependency; the rest of the configuration would be up to the user with standard conf.d files; I sincerely like that approach much better. While I can see where libvirt is needed, for enterprise-grade virtualisation solutions, I definitely *cannot* see it on my system: it’s an overcomplex, over-engineered solution for me. It takes less effort for me to write an init script than it takes to get libvirt (without a GUI) to behave like I want it to… which says a lot.

  3. net-analyzer/netcat6 can connect to a UNIX domain socket, and OpenBSD’s netcat (which doesn’t seem to be in Portage) can both listen on and connect to a UNIX domain socket.

  4. Ah! Thanks ephemient! That seems to do the trick just fine and requires no new code to be written :D

  5. The biggest virtualization WTF for me is that qemu >=0.10.0 instantly takes my entire system down when used with the kqemu module (qemu –enable-kqemu), no matter what I try to run under it. I remember that I tried to help somebody who could not get FreeBSD to recognize the network card when running it under a recent VirtualBox, but we never figured it out, after all :/

  6. @Diego: Lucky me, I read your blog before starting to write the script. Yay for netcat6… @Dev-zero: I use kvm on a small scale on production servers. Libvirt was my first try. After days of networking not working I just gave up on it and started using my own scripts. In the end, Diego picked up kvm to integrate it sanely into Gentoo. I will be quite happy to use his work in production, without the abstraction bloat and enforced policies that come with libvirt…

  7. @Pavel sorry I forgot to send you an email :/ I was looking for you last night, then I got distracted by something else (I’ll write about it tomorrow most likely)… @Michael doesn’t seem to help either; I disabled both from the qemu side and from the NetBSD kernel side (at boot) but it still gets “re0: watchdog timeout” as soon as I get to the login prompt (with the rtl8139, which seems to be the driver that works better).

  8. Likely unrelated, but OpenBSD has that watchdog timeout issue too, though only after installation; the installer works fine. There the solution is this: at the boot prompt, enter “bsd -c”, then “disable mpbios”, then “quit”. Make that permanent by doing the same steps in the prompt from “config -ef /bsd” after booting. Is that what you tried?

  9. I used selection 4 at the boot prompt, which disables both ACPI and SMP; no luck. But one thing I noticed is that the timer does not work properly: instead of counting in seconds it seems to count in quarters of a second… that might explain why the watchdog kills the network card.

  10. On NetBSD ne2k_pci works much better than rtl8139. Using the latter I get re0 timeouts too and the network is almost unusable.

  11. Ah good to know Michael! Now that seems to be doing it! ne2k_pci, disabled SMP, disabled ACPI and *finally* it seems to work! :)

  12. Pcnet can be used as well, IIRC. Btw, I see the issue with amd64 hosts only. Furthermore, if kqemu is enabled the above hackery will not help at all – a deadlock happens after the “root file system type: ” message.

  13. I forgot to say, I use the i82557b one, because I know it best. Unfortunately I do not know if that will work for others, since I have been mostly unable to get my patches integrated upstream, even when the patches were for stuff that crashed qemu, or when the implementation obviously contradicts the specification and the patch fixes networking for some OS.

  14. With qemu+kvm (rather than kqemu) it seems to work fine with ACPI and SMP disabled. Reimar, does that mean patches to qemu or NetBSD? If to qemu, I’m pretty sure Luca will be glad to receive them! :D

  15. Patches for qemu’s eepro100 driver. A rare few have been applied, but I still have 11 properly split ones pending and some messy ones. If someone is interested, I can mail them around on request; maybe at some point I’ll try to resend them to qemu-devel or the “maintainer”, but my motivation for that is really low.

  16. To make NetBSD work properly on kvm/qemu with ACPI enabled, ACPI_SCANPCI should be disabled in the kernel config.
