Remotely upgrading the BIOS on a SuperMicro server

So it’s that time of the year again when you have to run a few BIOS updates; and you know I love updating BIOS. In this case the victim is the very Excelsior that I bought to run the Gentoo tinderbox, and which I’m hosting at my own expense at Hurricane Electric in Fremont. This is important to say, because it means I have no direct access to the hardware, and nobody else can help me there.

First, I decided to get rid of the low-hanging fruit: the IPMI sideboard. I couldn’t find release notes for the update, but it went from version 2.05 to 3.16, and it was released at the end of June 2014, so it’s pretty recent, and one would expect quite a few bug fixes, given that I already had trouble running its iKVM client two years ago, and had tried to get more information about its protocol without success.

For this part, SuperMicro is strangely helpful: the downloaded archive (once I was able to actually download it: it took me a few tries, as the connection kept breaking midway through) is three times the size of the actual firmware, because it bundles Windows, DOS and, helpfully, Linux update utilities in the same file. Which would be endearing if it weren’t for the fact that the Linux utilities require raw access to the BMC, which is not fun to arrange with grsec present, and if I’m not mistaken it also conflicts with the in-kernel IPMI drivers. Also, the IPMI remote interface can handle the firmware update “quite well” by itself.

I say “quite well” because it seems like ATEN (the IPMI vendor for SuperMicro and a long list of other server manufacturers) still has not mastered the ability to keep configuration consistent across firmware upgrades. You know, the kind of thing that most of our home routers and cable TV boxes manage nearly all the time. And all my home computers’ BIOSes. Instead it requires you to reset the full configuration when you upgrade the device’s firmware. Which would be fine, because you can use bmc-config from userspace to dump the configuration (minus the passwords) and push it back afterwards. That is, if the configuration it dumps is actually accepted back as valid, which in my case it wasn’t. But I only needed to set an admin password, an IP address and routing, so…
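
For reference, the dump-and-restore dance with FreeIPMI goes roughly like this; the file name is just an example, and for out-of-band use you would add the usual -h/-u/-p options:

# dump the BMC configuration (passwords excluded) before the firmware update
bmc-config --checkout --filename=excelsior-bmc.conf

# ... flash the new firmware, then try to push the configuration back;
# this is the step that failed for me, as the new firmware rejected part
# of the dumped configuration
bmc-config --commit --filename=excelsior-bmc.conf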

The second step is to update the BIOS of the server itself. And here is where the fun starts. You have two options to download: a Windows executable or a ZIP archive. I downloaded both. The Windows executable is supposed to build a floppy to upgrade the BIOS. Not a floppy image, an actual floppy. It does not work in Wine, of course, but I tried! So I settled for the ZIP file.

So the iKVM I had just upgraded supports “virtual media”, which means you can give it files from your computer that it will present to the server as floppy disks or CD-ROMs. This is very neat, but it comes with some downsides. At first I wanted to give it a floppy disk image via HTTP, and use Serial-over-SSH to enter the boot menu and do the rest. This has worked like a charm for me before, but it comes with limitations: the HTTP-based virtual media configuration for this iKVM only supports 1.44MB images for floppy disks, and uses SMB to access shared ISO files for the CD-ROM. It makes sense: you can’t expect the iKVM board to have enough RAM to store a whole 650MB CD-ROM image, now, can you?

The answer is of course to build a BIOS upgrade floppy disk, which is after all what their own Windows tool would have done (albeit on a real disk rather than an image; do people really still use floppy disks on Opteron servers?). Too bad there is a small problem with that plan: the BIOS itself is over 2MB uncompressed. It does not fit on an unformatted floppy disk, let alone on a formatted one together with the update utility, the kernel and the command interpreter. So no easy way to do that. The astute among you, after reading the rest of this post, will probably figure out there was a shortcut here, but I’ll comment on that later.
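
Just to give an idea of how far that plan gets (a sketch; BIOS.ROM is a placeholder name for the 2MB-plus BIOS image, while FLASH.BAT and AFUDOS.SMC are the names from the actual update package), mtools runs out of space as soon as the BIOS image goes in:

# a 1.44MB image is 1440 blocks of 1KB; mkdosfs -C creates the file itself
mkdosfs -C biosflash.img 1440

# even before making the image bootable, the flash utility plus the
# BIOS image simply do not fit
mcopy -i biosflash.img FLASH.BAT AFUDOS.SMC BIOS.ROM ::/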

So I went back to the iKVM client, the Java WebStart one. Luckily the new release works fine with IcedTea, unlike the previous one; unfortunately, it still doesn’t work out of the box. It was only by chance that I found someone else who had faced and solved that problem, by adding a couple of lines to the XML file that defines the JavaWS environment.

The configuration dialog of the client allows providing a local ISO file, as well as configuring the address of an SMB share: it implements its own transmission protocol, and the floppy option is now “floppy/USB”, so I thought maybe I could just give it my 1GB FreeDOS-based BIOS disk… nope, no luck. It still seems to be limited to 1.44MB.

So I had to get a working ISO and do the upgrade from there. It should be pretty straightforward, right? I even found a blog post showing how to do so, based on Gentoo, as it tells you to emerge isomaster! Great! Unfortunately, the FreeDOS download page now only offers you a great way to download… installers. They no longer link to live CDs. It makes total sense, right? Who doesn’t think every day “I should stop procrastinating and finally install FreeDOS on my laptop!”?

After some fiddling around I found a tiny, nondescript directory on their FTP mirrors that includes fdfull.iso, which is the actual LiveCD. So I downloaded that, copied the BIOS and the updater onto the CD, set it in the virtual media settings of the iKVM, and booted. The CD does not boot unattended: if you leave it alone, it asks you what you want to do and then defaults to booting from the HDD. If you do remember to type 1 and press Enter at the SysLinux prompt, it will then default to installing (again, why did I bother installing Gentoo on the NUC when I could have used FreeDOS like the cool kids?).
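
For the record, remastering the ISO is the easy part; something along these lines works (a sketch: the isolinux/ paths are what I’d expect on a SysLinux-based disc like this one, so check them against the real image, and the bios-update directory name is a placeholder):

# unpack the LiveCD, drop the BIOS package next to it, rebuild a bootable image
mkdir fdcd
bsdtar -xf fdfull.iso -C fdcd
cp -r bios-update/ fdcd/bios/
mkisofs -o fdbios.iso -J -R \
        -b isolinux/isolinux.bin -c isolinux/boot.cat \
        -no-emul-boot -boot-load-size 4 -boot-info-table \
        fdcd/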

Instead, you can choose to boot FreeDOS from the LiveCD with no support and no drivers, with just the CD drivers, with the CD drivers and HIMEM, or with the CD drivers, HIMEM and EMM386. Since it’s a BIOS update I wanted to play it safe, and booted with just the CD drivers (I needed the BIOS files, after all). Turns out that while this is an option in the boot menu, it will never work, as the CD-ROM driver needs to be loaded high, MSCDEX.EXE style. So I rebooted once again with HIMEM (and EMM386, because rebooting into that CD is painful; among other things, the iKVM client loses focus every time the framebuffer size changes. And it changes. all. the freaking. time. during boot up), and tried to access the CD… unfortunately the driver does not support USB CD-ROM drives (like the one the iKVM provides) but only IDE ones, so no access to the ISO for me. Nope.

But I’m not totally unprepared for this kind of situation. As a failsafe, I had left in one of the USB slots a random flash drive I got at Percona Live, reflashed with SysRescueCD (it was one of those because I had gotten a couple while over there, and had forgotten to bring one from home). Since SysRescueCD uses VFAT by default, and the BIOS shows USB devices transparently to DOS, it was visible as C:.

Here is the shortcut from above: I should have just used a random FreeDOS boot floppy image and copied the BIOS onto that particular thumbdrive, back before I even started fighting with the Java client. It would have been much simpler. But I had forgotten about that thumbdrive, and I was hoping I would be able to get a boot disk going through the virtual media alone. Now I know better.

Of course that meant I had to finish booting the system into Linux proper, mount the USB flash drive, scp over the BIOS update, and reboot. It took much longer than I wanted, but still much less than all the time spent trying to get FreeDOS to boot the right thing.

By the way, even if I had found a way to access the CD with the BIOS updater on it, it wouldn’t have helped. I hadn’t noticed it, but the updater script FLASH.BAT wants to be smart: it renames AFUDOS.SMC to AFUDOS.EXE, runs it, then renames it back. Which means it’s not usable from read-only media. Oh, and it requires a parameter; if you don’t provide one, it tells you that the command is unknown (because it still passes a bunch of flags to the flash utility), and then tells you to reboot.

Now, this was only the trouble of getting the BIOS updated. What happened afterwards would probably deserve a separate post, but I’ll try to be short and write it down here. After the update I of course got the usual “CMOS checksum error, blah blah, Press F1 to enter Setup.” So I went and reconfigured the few parameters that I wanted different from the defaults, for instance putting the SATA controller into AHCI mode (IDE compatibility, seriously?) and disabling the IDE controller itself, plus enabling SVM, which is still disabled in the defaults.

The system booted fine, but no network cards were to be found. Not a driver problem: they would not appear on the PCI bus at all. To be precise, now that I have compared this with the actual running setup, the two PLX PCI-E switches were not to be found on the bus, and the network cards just happen to be connected to them. I tried playing with all the configuration options for the network cards, for PCI, for PnP and ACPI. Nope. Tried a full remote power cycle, nope. I was already considering the cost of adding a riser card and a multiport card to the server (it wouldn’t have helped, as I noted the PLX switches were gone), when Janne suggested doing a reset to defaults anyway. It worked. The settings were basically the same as the ones I had left before (looks like the “optimal defaults” do include SVM); I just had to switch SATA back to AHCI (yes, even the optimal settings default to IDE compatibility).

It worked, but then only three out of five drives were recognized (one of the SSDs had been disconnected when I was there in April, and I hadn’t had time to go back in June to plug it back in). One more reboot into the BIOS to disable the IDE controller altogether, and the rest of the HDDs appeared, together with their LVM volume group.

Yes, I know, SuperMicro suggests you only do BIOS updates when they tell you to. On the other hand, I was hoping for an SVM/IOMMU fix, since KVM has been complaining about it, so I wanted to try; the fix is still not there. I was used to Tyan, which Yamato still uses, which basically told you “don’t bother us with support requests unless you’ve updated your BIOS”: a policy I find much more pleasant, especially given my experience with other hardware (such as Intel’s NUC requiring a BIOS update before it can use Intel’s own WiFi cards….).

Why Puppet?

It seems like the only comment everybody had on my previous post was to ask me why I haven’t used $this, $that and ${the kitchen sink}. Or, to be precise, they asked about cfengine, Chef and bcfg2. I have to say I don’t really like being forced into justifying myself, but at this point I couldn’t just avoid answering, or I would keep getting the same requests over and over again.

So first of all, why a configuration management system at all? I have three production vservers at IOS (one hosts this blog, another hosts xine’s services, and the third belongs to a customer of mine). I have a standby backup server at OVH. And then there’s Excelsior, which has four “spacedocks” (containers that I use for building binpkgs for the IOS servers), three tinderboxes (though usually only two are running), a couple of “testing” containers (for x32 and musl), and the container I actually use to maintain all of this.

That’s a lot of systems, and while they are very similar to each other, they are not identical. To begin with, they are in three different countries. And they use three different CPUs. And this is without counting the Raspberry Pi I set up with a weather station for a friend of mine. The result is that trying to maintain all those systems manually is folly, even though I have already reduced the number of hosts, since the print shop customer – the one I wrote about so often – moved on and found someone else to pick up their sysadmin tasks (luckily for both of us, since it was a huge time sink).

But the reason why I focused almost exclusively on Puppet is easy to understand: people I know have been using it for a while. Even though this might sound stupid, I do follow my friends when I have to figure out what to use. The moment I have no idea how to do something, it’s easier to ask a friend than to go through the support chain of the upstream project. Gentoo infra people are using and working on Puppet, so that’s a heavy factor for me. I don’t know why they chose Puppet, but at this point I really don’t care.

But there is another thing, a lesson I learned with Munin: I need to judge the implementation language. The reason is simple: I will find bugs, for sure. I have this knack for finding bugs in the stuff I use… which is the main reason why I got interested in open source development in the first place: I could then fix the bugs I found! But to do so I have to understand what’s written. And even though learning Perl was easy, understanding Munin’s code… was, and is, tricky. I was still able to get some amount of work done. Puppet being written in Ruby is a positive note.

I know, Chef is also written in Ruby. But I do have a reason for not wanting to deal with Chef: its maintainer in Gentoo. Half the bugs I find have to do with the way things are packaged, which is the reason why I became a developer in the first place. This means, though, that I have to be able to collaborate with the other developers involved, and sometimes that’s just not possible. Sometimes it’s due to upstream developers, but in the case of Chef the problem is the Gentoo developer, who is definitely not somebody I want to work with, since his “fiddling” with the Ruby ebuilds for Chef has messed up a lot of the work that the Ruby team, me included, kept pouring into improving the quality of the Ruby packages.

So basically, these are the reasons why I decided to start using Puppet and writing Puppet modules.

Managing configuration

So I’ve finally bitten the bullet and decided to look into installing and setting up Puppet to manage the configuration of my servers. The reason is to be found in my transfer to Dublin: I expect I won’t have the same amount of time I had before, and that means that any streamlining in the administration of my servers is a net improvement.

In particular, just the other day I spent a lot of time fighting just to set up SSL properly on the servers, and I kept scp’ing files around — it was obvious I wasn’t doing it right.

It goes deeper than that, though: since Puppet is obviously trying to get me to standardize the configurations of the different servers, I’ve ended up uncovering a number of situations where the configuration of different servers was, well, different. Most of the time without a real reason. For instance, the configured Munin plugins didn’t match, even those that are not specific to a service: of the three vservers, one uses PostgreSQL, another uses MySQL, and the third, being the standby backup for the other two, has both.

Certainly there’s a conflict between your average Gentoo Linux way to do things and the way Puppet expects things to be done. Indeed, the latter requires you to make configurations very similar, while the former tends to make you install each system like its own snowflake — but if you are even partially sane, you would know that to manage more than one Gentoo system, you’ll have to at least standardize some configurations.

The other big problem with using Puppet on Gentoo is a near-showstopper lack of modules that support our systems. While Theo and Adrien are maintaining a very nice Portage module, there is nothing that lets us set up the OpenRC oldnet-style network configuration, for instance. For other services, the support is often written with only CentOS or Debian in mind, and the only way to get them to work on Gentoo is to fix the module.

To solve this problem, I started submitting pull requests to modules such as timezone and ntp so that they work on Gentoo. It’s usually relatively easy to do, but it can get tricky when the CentOS and Gentoo ways of setting something up are radically different. By the way, the ntp module is sweet, because I can finally no longer forget that we have two places to set the NTP server pools.
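
As an illustration (not the exact manifest I use, and the pool names here are just placeholders), with the puppetlabs ntp module a one-off test run looks something like this:

# install the module on the puppet master (or on the node, for a masterless test)
puppet module install puppetlabs-ntp

# one place to declare the pools; in a real setup this goes in the node's manifest
puppet apply -e "class { 'ntp': servers => [ '0.pool.ntp.org', '1.pool.ntp.org' ] }"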

I also decided to create a module to hold whatever is Gentoo-specific, although this is not yet the kind of stuff you want to rely upon forever; to set it up properly it would have to be done through a real parsed file. On the other hand it lets me set up all my servers’ networks, so it should be okay. Another module lets me set environment variables on the different systems.

You can probably expect me to publish a few more Puppet modules – and to edit even more – in the next few weeks, while I transition as much configuration as I can from custom files to Puppet. In particular, though that’s worthy of a separate blog post, I’ll have to work hard to get a nice, easy, and dependable Munin module.

Apple’s Biggest Screwup — My favourite contender

Many people have written about Apple’s screwups in the past; recently, the iPad Mini seems to be the major focus for just about anybody, and I can see why. While I won’t argue that I know their worst secret, I’m definitely going to show you one contender that could actually be fit to be called Apple’s Biggest Screwup: OS X Server.

Okay, those who read my blog daily already know that because I blogged this last night:

OS X Server combines UNIX’s friendliness, with Windows’s remote management capabilities, Solaris’s hardware support, and AIX software availability.

So let’s try to dissect this.

UNIX friendliness. This can be argued both positively and negatively: we all know that UNIX in general is not very friendly (I’m using the trademarked name because OS X actually is a certified UNIX, built in part on FreeBSD), but it’s friendlier to have a UNIX server than a Windows one. If you want to argue it negatively, you don’t get the Windows-style point-and-click tools for every possible service. If you want to argue it positively, you’re still running solid (to a point) software such as Apache, BIND, and so on.

Windows’s remote management capabilities. This is an extremely interesting point. While, as I just said, OS X Server provides you with BIND as its DNS server, you’re not supposed to edit the files by hand, but to leave it to Apple’s answer to the Microsoft Management Console: ServerAdmin. Unfortunately, doing so remotely is hard.

Yes, because even though it’s supposed to be usable from a remote host, it requires that you run the same version on both sides, which is impractical if your server is running 10.6 and your only client at hand has been updated to 10.8. So this option has to be dropped entirely in most cases: you don’t want to keep updating your server to the latest OS, but you do for your client, especially if you’re doing development on said client. Whoops.

So can you launch it through an SSH session? Of course not. Despite all the people complaining about X11, the X protocol and SSH X11 forwarding are a godsend for remote management: if you have something like a very old version of libvirt and friends, or some other tool that can only be run in a graphical environment, all you need is another X server and an SSH client and you’re done. Not so here.

Okay, so what can you do? Well, the first option would be to do it locally on the box, but that’s not possible, so the second best would be to use one of the many remote desktop techniques: OS X Server comes with Apple’s Remote Desktop server by default. While this uses the standard VNC port 5900… it does not seem to work with a standard VNC client such as KRDC. You really need Apple’s Remote Desktop client, which is a paid-for proprietary app. Of course you can set up one of many third-party apps to connect to it, but if you didn’t think about that when installing the server, you’re basically stuck.
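
For what it’s worth, if you still have SSH access to the box, Apple documents a command-line way to let standard VNC clients in, through the Remote Management agent’s kickstart tool. A sketch, with the flags as per Apple’s documentation and the password obviously a placeholder (I haven’t verified this on 10.6 specifically):

# enable Remote Management and allow legacy VNC clients with a VNC password
sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Resources/kickstart \
    -activate -configure -access -on \
    -clientopts -setvnclegacy -vnclegacy yes \
    -clientopts -setvncpw -vncpw s3cr3t \
    -restart -agent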

And I’m pretty sure this is not limited to the DNS server: Apache and the other services will probably have the same issues.

Solaris’s hardware support. This one should be easy: if you’ve ever tried to run Solaris on real hardware, rather than just virtualized (and even then…), you know that it’s extremely picky. Last time I tried, it wouldn’t run on a system with SATA drives, to give you an idea.

What hardware can OS X Server run on? Obviously, only Apple hardware. If you’re talking about a server, you have to remove all their laptops from the equation, obviously. If it’s a local server you could use an iMac, but the problem I’ve got is that it’s not used locally but at a co-location. The Xserve, which was the original host for OS X Server, is now gone forever, and that leaves us with only two choices: Mac Pro and Mac Mini. Which are the only ones sold with that version of OS X anyway.

The former hasn’t been updated in quite a long time. It’s also quite bulky to put in a co-location, even though I’ve seen enough messy racks to know that somebody could actually think about bringing one there. The latter actually just got an update that makes it sort of interesting, by giving you a two-HDD option…

But you still get a system with 2.5”, 5400 RPM disks at most, with no RAID, and that tells you to use external storage if you need anything different. And since this is a server edition, it comes with no mouse or keyboard; just adding those means adding another $120. Tell me again why anybody in their sane mind would use one of those as a server? And no, don’t remind me that I could have an answer on the tip of my tongue.

For those who might object that you can fit two Mac Minis in 1U – you really can’t, you need a tray and you end up using 2U most of the time anyway – you can easily use something like SuperMicro’s Twin, which fits two completely independent nodes in a single 1U chassis. And the price is not really that different.

The model I linked is quoted, from a quick search, at around eighteen hundred dollars ($1800); add $400 for four 1TB hard disks (WD Caviar Black; that’s their going price, as I’ve ordered eight of them since last April: four for Excelsior, four for work), and you get to $2200. Two Apple Mac Minis? $2234, plus the mouse and keyboard you’d need (the Twin system has IPMI support and remote KVM, so you don’t need them there).

AIX’s software availability. So yes, you can have MacPorts, or Gentoo Prefix, or Fink, or probably a number of other similar projects. The same is probably true for AIX. How much software is actually tested on OS X Server? Probably not much. While Gentoo Prefix and MacPorts cover most of the basic utilities you’d use on your UNIX workstation, I doubt that you’ll find the complete software coverage that you currently find for Linux, and that’s often enough a dealbreaker.

For example, I happen to have these two Apple servers to deal with (don’t ask!). How do I monitor them? Neither Munin nor NRPE is easy to set up on OS X, so for now they sit unmonitored, and I’m not sure if I’ll ever actually monitor them. I’d honestly replace them just for the sake of not having to deal with OS X Server anymore, but it’s not my call.

I think Apple did quite a feat, to make me think that our crappy HP servers are not the worst out there…

Revenge of the HP Updates

Just shy of three months ago I was fighting with updating the iLO firmware (IPMI and extras) and, as I recounted, even when you select downloads for RHEL (which is a supported operating system on those boxes), you’re given a Windows executable file, which you have to extract. But at least you can use the extracted file to update the IPMI firmware remotely.

Well, except for one small issue: the fans get stuck at 14 kRPM until you also update the BIOS. It wasn’t obvious how much of a problem that is until we got to the co-location last week and… “What on Earth is this noise?” “I think it’s our servers!” we screamed at each other at the back of the cabinet.

Since one of the servers also had some other hardware issues (one of the loops that keep the chipset’s heatsink in place gave way; I glued it back and applied a new layer of thermal paste after scraping off the old one), we ended up bringing it back to the office, where today, after repairing it and booting it up, it became obvious that we couldn’t leave it running with that kind of noise. So it was time to update the BIOS. Which is easier said than done.

Step one is finding the correct download; the first one I found turned out to be wrong, but it took me some time to understand that, because the BIOS update has to be done from DOS. And that brought me back to a very old post of mine (well, not that old, just a year and a half, now that I check) and its follow-up, which came with a downloadable 383KB compressed, 2GB uncompressed bootable FreeDOS image, since getting SysRescueCD’s FreeDOS to do anything other than boot and play its own demos had proved impossible.

So when I actually got to run the executable in the FreeDOS image… what I ended up with is an extremely stupid tool (which, I remind you, will not work on Windows XP, Vista or 7) to create a USB drive to update the BIOS… lol, whut?

The correct download is, once again, for Windows even when you select RHEL4, and it self-extracts into a multitude of files that include the BIOS itself some four different times. It provides some sort of network update, as well as “flat files” (which you can use with FreeDOS), a Windows updater, an ISO file, and a utility to build a USB stick to update the BIOS itself.

If you consider that this is for a server running Linux, you’ve now involved two more operating systems. And on the next trip to the co-lo we’ve got our work cut out for us, updating the BIOS and the IPMI firmware server by server (and hoping that the new firmware actually has a reliable SOL connection, among other things).

But to avoid being all too negative with HP, it’s still better than trying to do standard sysadmin work on an Apple OS X Server install on a Mac Mini. OS X Server combines UNIX’s friendliness, with Windows’s remote management capabilities, Solaris’s hardware support, and AIX software availability. But that’s a topic for another post.

Updating HP iLO 2.x

As I wrote yesterday, I’ve been doing system and network administration work here in LA as well, and I’ve set up Munin and Icinga to warn me when something requires maintenance.

Some of the first probes that Munin forwarded to Icinga we already knew about (in another post I wrote about how the CMOS battery ran out on two of the servers), but one was something that had bothered me before as well: one of the boxes only has one CPU on board, and it reports a value of 0 instead of N/A for the missing one.

So I decided to look into updating the firmware of the DL140 G3 to see if it would help us at all; the original firmware on the IPMI device was 2.10, while the latest one available is 2.21. Neither supports firmware updates through the web interface. The firmware download, even when selecting the Red Hat Enterprise Linux option, is a Windows EXE file (not a self-extracting archive, which you could extract from Linux, but their usual full-fledged setup software that extracts into C:\SWSetup). When you extract it, you’re presented with instructions on how to build a USB key which you can then use to update the firmware via FreeDOS…

You can guess I wasn’t amused.

After searching around a bit more I found out that there is a way to update this over the network. It’s described in HP’s advanced iLO usage guide, and seems to work fine, but it also requires another step to be taken in Windows (or FreeDOS): you have to use the ROMPAQ.EXE utility to decompress the compressed firmware image.

I wonder: why does HP provide you with two copies of the compressed firmware image, for a grand total of 3MB, instead of a single copy of the uncompressed one (2MB)? I suppose the origin of the compressed image is to be found in the 1.44MB floppy disk size limit, but nowadays you don’t use floppies… oh well.

After you have the uncompressed image, you have to set up a TFTP server… which luckily I already had lying around from when I updated the firmware of the APC power strips discussed in one of the posts linked above. So I just added the IPMI firmware image and moved on to the next step.

The next step consists of connecting to the box via telnet and issuing two commands: cd map1/firmware1 followed by load -source //$serverip/$filename -oemhpfiletype csr … the file is downloaded via TFTP and the BMC rebooted. Afterwards you have to clear out FreeIPMI’s SDR cache, as ipmi-sensors wouldn’t work otherwise.
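
Putting the whole thing together, the sequence looks roughly like this (host addresses, paths and the firmware file name are examples; on Gentoo the TFTP server comes from net-ftp/tftp-hpa):

# serve the uncompressed firmware image over TFTP from the workstation
cp firmware.bin /var/tftp/
/etc/init.d/in.tftpd start

# then, from a telnet session to the management processor
telnet 192.0.2.20
cd map1/firmware1
load -source //192.0.2.10/firmware.bin -oemhpfiletype csr

# once the BMC has rebooted, flush FreeIPMI's SDR cache so that
# ipmi-sensors picks up the new sensor records
ipmi-sensors --flush-cache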

This did fix the critical notification I was receiving… to a point. First of all, the fan speeds still have bogus thresholds (and at this point I’m not sure whether it’s a bug in FreeIPMI or in the firmware), as they report the upper limits instead of the lower ones. Second of all, the way it fixed the misreported CPU thermal sensor is by… not reporting any temperature from either thermal sensor! Now both CPU temperatures are gone and only the ambient temperature is available. D’oh!

Another funky issue is that I’m still fighting to get Munin to tell Icinga that “everything’s okay”: the way Munin calls send_nsca is tied to the limits, so if no limits are present, it seems like it simply doesn’t report anything at all. This is something else I have to fix this week.

Now back to doing the firmware updates on the remaining boxes…

Update: it turns out HP’s updates are worse than the original firmware in some ways. Not only are the CPU thermal diodes no longer reported, but the voltages lost their thresholds altogether! The end result is that it now says everything is a-okay, even though the 3V battery is reported at 0.04V! Which basically means that I have to set my own limits on things, but at least it should work as intended afterwards.
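
Setting those limits by hand means adding something like this to munin.conf on the master; the host name and field names are hypothetical (they depend on the plugin in use), while the plugin.field.warning/critical syntax is Munin’s own:

[dl140-1.example.com]
    address 192.0.2.21
    # sane range for the 3V CMOS battery, so a 0.04V reading actually trips a critical
    ipmi_volts.3V_Battery.warning  2.9:3.4
    ipmi_volts.3V_Battery.critical 2.7:3.6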

Oh, and the DL160 G6? First of all, this time the firmware update has a web interface… to tell it which file to request from which TFTP server. Too bad that all the firmware updates I can run on my systems require the bootcode to be updated as well, which means we’ll have to schedule some maintenance time when I come back from VDDs.

Sysadmining tips from a non-sysadmin

I definitely am not a sysadmin: although as a Gentoo developer I have to have some general knowledge of sysadmining, my main work is development, and that’s where most of my experience lies. On the other hand, I have picked up some skills by maintaining my two VPSes (the one where this blog is hosted, and the one hosting xine’s Bugzilla as well as its site).

Some of these tricks are related to the difficulties I have reported with using Gentoo as a guest operating system in virtual servers, but a few are not related at all. Let me try to pass on some of the tricks I picked up.

The first trick is to use metalog for logging; while syslog-ng has some extra features missing from metalog (like network logging support), for a single server the latter is much, much easier to set up and deal with. But I find the default configuration a bit difficult to deal with. My first step is then to replace the catch-all everything entry with a therest entry, by doing something along these lines:

Postgresql :
  program_regex = "^postmaster"
  program_regex = "^postgres"
  logdir   = "/var/log/postgres"
  break    = 1

Apache :
  program_regex = "^httpd"
  logdir   = "/var/log/http"
  break    = 1

The rest of important stuff :
  facility = "*"
  minimum  = 6
  logdir   = "/var/log/therest"

See that break statement? The whole point of it is to not fall through to the entries below it in the file; at the end, the therest block will log everything that did not match a previous entry. My reason for splitting things this way is that I can easily check the logs for cron or PostgreSQL, and at the same time check whether there is something I’m not expecting.

While using metalog drops the requirement for logrotate for the system logs, it doesn’t stop it from being needed for other logs: Quassel doesn’t log to syslog, nor does Portage, and the Apache access logs are better handled without going through syslog, so they can be passed through AWStats later. Note: having Portage log to syslog is something I might make good use of; it would break qlop, but it might be worth it for some setups, like my two VPSes. But even with this limitation, metalog makes it much easier to deal with the basic logs.

The next step to simplify management, for me, has been switching from Paul Vixie’s cron to fcron. The main reason is that fcron feels “modern” compared with Vixie’s, and it has a few very useful features that make it much easier to deal with: erroronlymail sends you mail about a cron job only if its exit status is non-zero (failure), rather than every time there is output; random makes it possible to avoid running heavy jobs always at the same time (which also makes the system a bit more secure, as an attacker cannot guess that at a given time the system will be under extra load!); and the lavg options allow you to skip a series of jobs if the system is busy doing something else.
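
A sketch of what an fcrontab entry using those features might look like (edit it with fcrontab -e; the script path is a placeholder, and the exact option syntax is worth double-checking against fcrontab(5)):

# mail only on failure, randomize the start time, and skip the run if the
# load average is too high; the ! line sets options for the lines below it
!erroronlymail(true),random(true),lavg(1,2,2)

# run the backup once a night, somewhere between 02:00 and 04:59
%nightly * 2-4 /usr/local/sbin/backup-to-ovh.sh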

Oh, and another important note for those of you using PostgreSQL: I learnt the hard way the other day that the default logging setup of the PostgreSQL server is to write a postmaster.log file inside the PostgreSQL data directory. This file does not really need to be rotated, as PostgreSQL seems to take care of that itself; on the other hand, it makes much more sense to leave the task to the software best suited for it: the logger! To fix this up you have to edit the /var/lib/postgresql/8.4/data/postgresql.conf file (you may have to change 8.4 to the version of PostgreSQL you’re running), and add the following line:

log_destination = 'syslog'
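
Two related settings, both standard postgresql.conf parameters, control which facility and program name the entries are tagged with; the values below are the ones that match the metalog blocks from earlier:

syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'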

Thanks to metalog’s buffering and its other features, this should make it much easier on the I/O of the system, especially when the load is very high. Which sometimes happened to be the case when my blog got hammered.

Okay, probably most of this is nothing new to seasoned sysadmins, but small hints from a non-sysadmin can be useful to other non-sysadmins who find themselves doing the same job.

Remote debugging with GDB; part 1: Gentoo

In the last couple of days I had to focus my attention on just one issue, a work issue which I’m not going to discuss here since it wouldn’t make much sense either way. But this brought me to think about one particular point: remote debugging, and how to make it easier with Gentoo.

Remote debugging is a method commonly used when you have to debug issues on small, embedded systems, where running a full-blown debugger is out of the question. With GDB, it consists of running a tiny gdbserver process on the target system, which is controlled by a master gdb instance via either a serial port or a TCP connection. The advantages don’t stop at not needing to run the full-blown debugger on the target: the target system can also carry stripped programs with no debug information, keeping the debug information entirely on the system where the master gdb instance runs.

This is all fine and dandy for embedded systems, but it doesn’t stop there: you can easily use the same technique to debug issues on remote servers, where you might not want to upload all the debug symbols for the software causing trouble. It then becomes a nice Swiss Army knife for system administrators.
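
The basic workflow, with program names, paths and the port obviously being placeholders, goes something like this:

# on the target (the remote server or embedded board): run the stripped
# binary under gdbserver, listening on a TCP port
gdbserver :2345 /usr/bin/myservice

# on the host, which has the unstripped binary (or the split debug files)
gdb /usr/bin/myservice
(gdb) set sysroot /srv/sysroots/myserver    # a local copy of the target's libraries
(gdb) target remote myserver.example.com:2345
(gdb) continue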

There are a few issues you have to work through when you use gdbserver, though, some of which I think should be improved in Gentoo itself. First of all, to debug a remote system that runs a different architecture than yours, you need a cross-debugger; luckily, crossdev can build that for you. Then you need the actual cross-built gdbserver. Unfortunately, even though the server is small and self-sufficient, it is currently only built by sys-devel/gdb, which is not so nice for embedded systems; we’d need a minimal or server-only USE flag for that package, or even better a standalone dev-util/gdbserver package, so that it could be cross-built and installed without building the full-blown debugger, which is of no use there at all.

Then there is the problem of debug information. In Gentoo we already provide some useful support for that through the splitdebug feature, which takes all the debug information from an ELF file, executable or library, and splits it out into a .debug file that only contains the information useful during debugging. This split does not really help much on a desktop system, since the debug information wouldn’t be loaded anyway; my reasoning for keeping it separated is to make sure I can drop it all at once if I’m very short on space, without breaking my system. On the other hand, it is very useful to have around for embedded systems, for instance, although it currently is a bit clumsy to use.

Right now one common way to properly archive debug information while stripping it in production is to use the buildpkg feature together with splitdebug, and to set up an INSTALL_MASK for /usr/lib/debug when building the root. The alternative is to simply remove that directory before preparing the tarball of the rootfs, or something like that. This works decently: the binary packages carry the full debug information, and you’d just need to reinstall the package you need debug information for, without the INSTALL_MASK. Unfortunately this ends up replacing the rest of the package’s files too, which is not so nice because it might change the timestamps on the filesystem, as well as wasting time, and eventually flash too.
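
In make.conf terms, the setup described above boils down to something like this (a sketch; both variables are standard Portage ones):

# on the build chroot: keep split debug info inside the binary packages...
FEATURES="splitdebug buildpkg"

# ...but leave it out of the root filesystem that actually gets deployed;
# reinstalling a package without the mask brings its debug files back
INSTALL_MASK="/usr/lib/debug"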

This also does not play entirely nicely with remote server administration. The server where this blog is hosted is a Gentoo vserver guest; it was installed starting from a standard amd64 stage, then I built a local chroot starting from the same stage, setting it up exactly as I wanted it to be; finally, I synced over the Portage configuration files, the tree, and the binary packages built from all of it, and installed them. The remote copy of the package archive is bigger than the set of packages actually installed, since it also contains the packages that are just build dependencies, but that overhead I can ignore without too many problems. On the other hand, if I were to package all the debug information as well, and just avoid installing it with INSTALL_MASK, the overhead wouldn’t be as ignorable.

My ideal solution for this would involve making Portage more aware of the splitdebug feature, and actually splitting it out at the package level too, similarly to what RPM does with -dbg packages. By creating a -debug or -dbg binpkg alongside each package that would otherwise have /usr/lib/debug in it, and giving the user an option on whether to install the sub-package, it would be possible to know whether to merge the debug information onto the root filesystem or not, without using INSTALL_MASK. Additionally, having a common suffix for these packages would allow me to simply skip them when syncing to the remote server, removing the overhead.

Dreaming a bit more, it could be possible to design automatic generation of multiple sub-packages, to resemble a bit what binary distributions like Debian and Red Hat have been doing all these years: splitting documentation into a -doc package, the headers and static libraries into -dev, and so on. Then it would just require giving the user the ability to choose which subpackages to install by default, plus a per-package override. A normal desktop Gentoo system would probably want everything installed by default, but if you’re deploying Gentoo-based systems, you’d probably just have a chroot on a build machine that does the work, and the systems would then get only the subset they need (with or without documentation). Maybe it’s not going to be easy to implement, and I’m sure it’s going to be controversial, but I think it might be worth looking into. Implementing it in a non-disruptive way (with respect to the average user and developer workflow) is probably what will make it feasible.

Tomorrow, hopefully, I’ll be writing some more distribution-agnostic instructions on how to remotely debug applications using GDB.

Running git-daemon under an unprivileged user

Some more notes about the setup I’m following for the vserver (where this blog and xine’s bug tracker run); the last post along these lines was, I think, the one about awstats. You can see the running thread: running stuff as a user rather than as root, especially when that user doesn’t need access to any particular resource.

So this time my target is git-daemon; I already set it up some time ago to run as nobody (by just changing the start-stop-daemon parameters), but there is a problem with that: you either give world access to the git repositories, or you give access to nobody, a user which might be used by other services too. Also, there has been a lot of development in git-daemon over the past months, to the point that it now really looks like a proper daemon, including privilege dropping, which makes it very nice with respect to security issues.

So here we go with setting up git-daemon to run as an unprivileged user; please note first of all that I’ll also apply, along the way, the fix for bug 238351 (git-daemon reported as crashed by OpenRC), since I’m doing the same thing locally and it looks nicer to do it all at once rather than in multiple passes.

I’m also going to assume a slightly different setup than the one I’m using myself, as it’s the one I’d like to convert soon enough: multi-user setup, with multiple system users (not gitosis-based) being able to publish their GIT repositories. This means that together with a git-daemon user, we’re going to have a git-users group for the users who are able to publish (and thus write to) git repositories:

# useradd -m -d /var/lib/git -s /sbin/nologin git-daemon
# groupadd git-users
# chmod 770 /var/lib/git
# setfacl -d -m u:git-daemon:rx /var/lib/git
# setfacl -m g:git-users:rwx /var/lib/git
# setfacl -d -m g:git-users:- /var/lib/git

This way, the users in the git-users group may create subdirectories (and files) in /var/lib/git, but only git-daemon will be able to read them (edit: I modified the commands above so that other git-users don’t have, by default, access to each other’s subdirectories; each user can then extend access further if they want). Since most likely you want to provide gitweb (or equivalent) access to the repositories, you might also want to allow your webserver to access the tree; in my case, gitweb runs as the lighttpd user (I haven’t gotten around to splitting the various CGIs yet), so I add this:

# setfacl -m u:lighttpd:rx /var/lib/git
# setfacl -d -m u:lighttpd:rx /var/lib/git

This way lighttpd will have access to /var/lib/git and by default to all the subdirectories created.

Now it’s time to set up the init script correctly, since the one that dev-util/git ships at the moment will execute git-daemon only as root. We’re going to change both the start and stop functions, and the init script in general. The first step is to add a PIDFILE variable outside the functions; we’re going to call it /var/run/git-daemon.pid. The proper fix for the bug linked above allows an override for PIDFILE, but that is not important, to me at least, and should stay only in the official script. Of course /var/run is not writable by the git-daemon user, but this is not a problem: since git-daemon drops privileges by itself, it will create the pid file beforehand, while still running as root. In my original idea, I was thinking of using start-stop-daemon directly to switch to the git-daemon user, so that it wouldn’t run as root at all, but since the wrapper is mostly a hack when it comes to running as a different user, I decided (for now) to let git-daemon handle its own privilege dropping. (It would be nice if some security people assessed which is the best way to handle this, to make it an official policy to follow.) In that case, a subdirectory of /var/run would have been needed, since it would have to be writable by an unprivileged user.

After adding the PIDFILE variable, you’ll have to change the invocation of start-stop-daemon: remove the --background option from ssd and replace it with the --detach option to git-daemon itself, and also add to the latter the --pid-file, --user and --group options; you also need to add the --pidfile (look here: no dash!) option to both invocations of ssd (not to the git-daemon parameters!). At that point, just restart git-daemon and you’re done. The final script looks like this:

#!/sbin/runscript
# Copyright 1999-2008 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/dev-util/git/files/git-daemon.initd,v 1.3 2008/09/16 17:52:15 robbat2 Exp $

PIDFILE=/var/run/git-daemon.pid

depend() {
        need net
        use logger
}

start() {
        ebegin "Starting git-daemon"
                start-stop-daemon --start \
                --pidfile ${PIDFILE} \
                --exec /usr/bin/git -- daemon ${GITDAEMON_OPTS} \
                --detach --pid-file=${PIDFILE} \
                --user=git-daemon --group=nogroup
        eend $?
}

stop() {
        ebegin "Stopping git-daemon"
                start-stop-daemon --stop --quiet --name git-daemon \
                --pidfile ${PIDFILE}
        eend $?
}

I also dropped the --quiet option from ssd in the start function, since I had to look for some error at startup a while ago (I also have the --syslog option configured through the conf.d file).

Although most of these changes could be handled by conf.d alone, I think it’d be nice if the default script shipped with dev-util/git included not only the fix for the bug but also the privilege dropping: running git-daemon as root is, in my opinion, craziness.