So it’s that time of the year again when you have to run a few BIOS updates; and you know I love updating BIOS. In this case the victim is the very Excelsior that I bought to run the Gentoo tinderbox, and which I’m hosting to my expenses at Hurricane Electric in Fremont. This is important to say, because it means I have no direct access to the hardware, and nobody else can help me there.
First, I decided to get rid of the low-hanging fruit: the IPMI sideboard. I couldn’t find release notes for the update, but it went from version 2.05 to 3.16, and it was released at the end of June 2014, so it’s pretty recent, and one would expect quite a bit of bug fixes, given that I had trouble running its iKVM client two years ago already, and I tried to get more information about its protocol without success.
For this part, SuperMicro is strangely helpful: the downloaded archive (once I was able to actually download it, it took me a few tries, the connection would break midway through the download) is three times the size of the actual firmware, because it comes with Windows, DOS and, helpfully, Linux update utilities in the same file. Which would be endearing if it wasn’t that the Linux utilities require the tools to access the BMC raw, which is not fun to do with grsec present, and if I’m not mistaken it also conflicts with the in-kernel IPMI drivers. Also, the IPMI remote interface can handle the firmware update “quite well” by itself.
I say “quite well” because it seems like ATEN (the IPMI vendor for SuperMicro and another long list of server manufacturers) still has not mastered the ability to keep configuration consistent across firmware upgrades. You know, the kind of things that most of our home routers or cable TV boxes manage nearly all the time. And all my home computers’ BIOSes. So instead it requires you to reset the full configuration when you upgrade the firmware of the device. Which is fine because you can use bmc-config
from userspace to dump the configuration (minus the passwords) and reset it back. If the configuration it downloads is actually accepted as valid, which in my case it wasn’t. But I only needed to set an admin password and IP address and routing, so…
The second step is to update the BIOS of the server itself. And here is where the fun part starts. You have two options to download: a Windows executable or a ZIP archive. I downloaded both. The Windows executable is supposed to build a floppy to upgrade the BIOS. Not a floppy image, a floppy. It does indeed not work in Wine, but I tried! I settled for the ZIP file.
So the iKVM I just upgraded supports “virtual media” — which means you can give it files on your computer that it’ll use instead of floppy disks or CD-Rom. This is very neat, but it comes with some downsides. At first I wanted to give it a floppy disk image via HTTP, and use the Serial-over-SSH to enter the boot menu and do stuff. This has worked like a charm for me before, but it comes with limitations: the HTTP-based virtual media configuration for this iKVM only supports 1.44MB images for floppy disks, and uses SMB to access shared ISO files for the CD-Rom. It makes sense, you can’t expect the iKVM board to have enough RAM to store a whole 650MB image for the CD-Rom, now, can you?
The answer is of course to build a BIOS upgrade floppy disk, which is after all what their own Windows tool would have done (albeit on a real disk rather than an image — do people really still use floppy disks on Opteron servers?) Too bad that there is a small problem with that plan: the BIOS itself is over 2MB uncompressed. It does not fit in a floppy disk unformatted, let alone in the formatted disk, with the update utility, the kernel and the command interpreter. So no way to do that easily. The astutes of you after reading the blog post will probably figure out there was a shortcut here, but I’ll comment on that later.
So I decided to go back to the iKVM client, the Java WebStart one. Now luckily the new release of it works fine with IcedTea, unlike the previous one. Unfortunately it still is not working out of the box. Indeed it was only by chance that I found someone else who faced and solved that problem by adding a couple of lines to the XML file that defines the JavaWS environment.
The configuration dialog for the client allows providing a local ISO file as well as configuring the address of a SMB share with it: it implements its own transmission protocol, and the floppy option is now “floppy/USB”, so I thought maybe I could just give it my 1GB FreeDOS-based BIOS disk … nope, no luck. It still is limited to 1.44MB it seems.
So I had to get a working ISO and do the upgrade. It should be pretty straightforward, right? I even found a blog post showing how to do so, based on Gentoo, as it says to emerge isomaster
! Great! Unfortunately, FreeDOS download page now provides you with a great way to download … installers. They no longer provide links to download live CDs. It makes total sense, right? Who doesn’t think every day “I should stop procrastinating and finally install FreeDOS on my laptop!”?
After some fiddling around I found a tiny non-descript directory in their FTP mirrors that includes fdfull.iso
which is the actual LiveCD. So I download that, copy the BIOS and the updater into the CD and set it in the virtual media settings for the iKVM, and then boot. The CD does not boot unattended. If you leave it unattended it’ll ask you what you want to do and default to boot from the HDD. If you do remember to type 1 and press enter at the SysLinux prompt, it’ll then default to install (again, why did I bother installing Gentoo on the NUC when I could have used FreeDOS like the cool kids?).
Instead, you can choose to boot FreeDOS from the LiveCD with no support and no drivers, with just the CD drivers, with the CD drivers and HIMEM
, with the CD drivers, HIMEM
and EMM386
. I decided I wanted to play it safe since it’s a BIOS update, and booted with just the CD drivers (I needed the BIOS files after all). Turns out that while this is an option at the boot menu, it will never work, as the CDROM driver needs to be loaded high, MSCDEX.EXE
style. So reboot once again with HIMEM
(and EMM386
because rebooting into that CD is painful — among other things, the iKVM client loses focus every time the framebuffer size changes. And it changes. all. the freaking. time. during boot up), and try to access the CD… unfortunately the driver does not support USB CD-Rom (like the one the iKVM provides me with) but just IDE ones, so no access to the ISO for me. Nope.
But I’m not totally unprepared for this kind of situations. As a failsafe device I left in one of the USB slots a random flashdrive I got at Percona Live reflashed with SysRescueCD — it was one of those because I had gotten a couple while over there, and I forgot to bring one from home. Since SysRescueCD uses VFAT by default, and the BIOS shows the USB devices transparently to DOS, it was visible as C:
.
Here is the shortcut from above: I should have just used a random FreeDOS boot floppy image, and just copied the BIOS onto that particular thumbdrive, back before I started using Java. It would have been much simpler. But I forgot about that thumbdrive and I was hoping I would be able to start bootdisk. Now I know better.
Of course that meant I had to finish booting the system in Linux proper, mount the USB flashdrive, scp
over the BIOS update, and reboot. It took much longer than I wanted. But still much less than all the time trying to get FreeDOS into booting the right thing.
By the way, even if I were to find a way to access the CD with the bios updater, it wouldn’t have helped. I did not notice that, but the BIOS updater script FLASH.BAT
wants to be smart, and it renames AFUDOS.SMC
to AFUDOS.EXE
, runs it, then renames it back. Which means that it’s not usable from read-only media. Oh yeah and it requires a parameter, but if you don’t provide it, it’ll tell you that the command is unknown (because it still passes a bunch of flags to the flash utility), and tells you to reboot.
Now this is only the problem with getting the BIOS updated. What happened after would probably deserve a separate post, but I’ll try to be short and write that down here. After update, of course I got the usual “CMOS checksum error, blah blah, Press F1 to enter Setup.”. So I went and reconfigured the few parameters that I wanted different from the defaults, for instance put the SATA controller into AHCI mode (IDE compatibility, seriously?) and disabled the IDE controller itself, plus enabled SVM which on the defaults is still disabled.
The system booted fine, but no network cards were to be found. Not a driver problem, they would not appear on the PCI bus. To be precise, now that I compared this with the actual running setup, the two PLX PCI-E switches were not to be found on the bus, the network cards just happen to be connected to them. I tried playing with all the configuration options for the network cards, for PCI, for PnP and ACPI. Nope. Tried a full remote power cycle, nope. I was already considering the costs of adding a riser card and a multiport card to the server (wouldn’t have helped, as I noted the PLX were gone), when Janne suggested doing a reset to defaults anyway. It worked. The settings were basically the same I left before (looks like “optimal defaults” do include SVM), just changed SATA back to AHCI (yes, even in optimal settings, it defaults to IDE compatibility).
It worked. But then only three out of five drives were recognized (one of the SSDs was disconnected when I was there in April and I hadn’t had time to go back there in June to put it back). One more reboot into BIOS to disable the IDE controller altogether and the rest of the HDDs appeared, together with their LVM group.
Yes, I know, SuperMicro suggests you to only do BIOS updates when they require you to. On the other hand, I was hoping for an SVM/IOMMU fix which is not there still, as KVM made it complain, so I wanted to try. I was used to Tyan, that Yamato still uses, that basically told you “don’t bother us with support requests unless you updated your BIOS” — a policy that I find much more pleasant, especially given other experience with other hardware (such as Intel’s NUC requiring a BIOS update before it can use Intel’s WiFi cards….).