Virtual rewiring, part two: the EC

In the previous post I explained what I want: to be able to use the caps lock key for Fn, at least for the arrow keys to achieve the page up/down, home and end keys (navigation keys).

After that post, I was provided a block schematic of my laptop identifying the EC in the system as an ITE IT8572. This is a bit unfortunate, because ITE is not known for sharing their datasheets easily, but at least I know that the EC is based on the Intel 8051 (also known as MCS-51), with a 64KiB flash ROM.

Speaking of the ROM, it’s possible to extract the EC firmware from the ASUS-provided update files using (unmodified) UEFITool. Within the capsule, the EC firmware is the first padding entry (the non-empty one); extract it with the tool and you have the actual ROM image file. That part is easy.

I was also pointed at Moravia Microsystems’ MCU 8051 IDE which is a fully-functional IDE for developing for 8051 MCUs. I submitted an ebuild for this while at 33C3, so that you can just emerge mcu8051ide to have a copy installed. It supports some optional runtime dependencies that I have not actually made optional yet. This IDE supports the conversion of binary files to Intel HEX (why on Earth Intel HEX is still considered a good idea, I’m not sure) and the disassembly of the binaries, and comes with its own (Tcl/Tk) assembler.

Unfortunately, this has not brought me quite as close as might be expected, knowing I have the firmware, a disassembler and an assembler. The reason is not quite obvious either.

The first problem is that the IDE is unable to actually re-assemble the code it produces. Since disassembly (unlike decompilation) should be a lossless procedure, that was the first thing I tried, and it failed. There appear to be at least two big problems: the first is that the IDE does not have a configuration for a 64KiB ROM 8051 (even though that is the theoretical maximum size of the ROM for that device), and the other is that, since it does not have a way to define which parts of the ROM are data and which ones are code, it disassembles the data in the ROM as instructions that are not actually valid for the base 8051 instruction set.

So, I decided to look into other options; unfortunately I found only a DJGPP-era disassembler – which produces what looks like a valid assembly file, but can’t be re-assembled – and an apparently promising Python-based one that failed to even execute due to a Python syntax error.

I have thus started working on writing my own, because why not, it’s fun, and it wouldn’t be the first time I go parsing instructions manually — though the last time, I was in high school and I wrote a very dumb 8086 emulator to try my homework out without having to wait in the queue at the lab for the horrible Rube Goldberg Machine we were using. This was some 15 years ago by now.

But back to present: to be able to write a proper disassembler that does not suffer the problems I noted above, I need to make sure I have a test that checks that re-assembling the disassembled code produces the same binary ROM as the source. Luckily, there is an obvious way to do so incrementally: you just emit every single byte of the ROM as a literal byte value. It’s not too difficult.
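The round-trip check described above could look something like the following. This is only a minimal sketch: the function name `first_mismatch` is made up for illustration, and it assumes the original ROM and the re-assembled output are both already available as raw bytes.

```python
def first_mismatch(original: bytes, rebuilt: bytes):
    """Return the offset of the first differing byte, or None if identical.

    If one image is a strict prefix of the other, the offset returned is
    the length of the shorter image.
    """
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None


if __name__ == "__main__":
    rom = bytes([0x02, 0x00, 0x2E])
    # A correct round trip reports no mismatch at all.
    print(first_mismatch(rom, rom))
    # A broken one reports where to start looking in the hexdump.
    print(first_mismatch(rom, bytes([0x02, 0x01, 0x2E])))
```

Reporting an offset rather than a plain yes/no makes it easier to find the matching row in a hexdump of the two files.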

Except, which syntax do you use for that? The disassembler didn’t use any literal bytes (instead emitted extended instructions for bytes that would not otherwise be mapped in the base ISA), so I spent some time googling for 8051 syntax, and I found a few decent pointers but nothing quite right. From what I can tell, the SDCC assembler should accept the same syntax as Alan Baldwin’s assembler suite except for some of the more sophisticated instructions, as SDCC forked an earlier version of the same software. Even just opening the website should make it clear we’re talking serious vintage code here!

This syntax is also significantly different from the syntax used by MCU 8051 IDE, though. Admittedly, I was hoping to use the SDCC assembler for this (Baldwin’s is not quite obvious to build at first, as it effectively only provides .bat files for that) since that can be more easily scripted. The IDE is a Tcl/Tk full environment, and its assembler is very slow from what I can tell. Unfortunately, I have yet to find a way for the SDCC-provided assembler to produce any binary file. It’s all hidden behind flags and multi-level object files, sigh!

So I decided to at least make a file that assembles with the IDE. According to this page, the syntax should be quite simple:

LABEL: DB 2EH

The DB pseudo-instruction defines a literal byte or bytes. And that sounds exactly like what I need! So I just made my skeleton disassembler emit every byte with this syntax, and… it fails to compile. It looks like the IDE assembler only supports DB with decimal numbers, which makes them harder to read and match to the hexdump -C output I’ve been using to compare the binaries. Even after fixing that, things still did not build right, but I have yet to look deeper into it.
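For illustration, the "emit every byte as a literal" stage can be sketched as below. This is not the actual skeleton disassembler, just a guess at its simplest form; `emit_db_lines` is a hypothetical name, and the optional trailing `;` comment carrying the offset and hex value is an assumption about what the assembler accepts, which is why it is off by default.

```python
from typing import List


def emit_db_lines(rom: bytes, with_hex_comment: bool = False) -> List[str]:
    """Emit one DB line per ROM byte, using decimal literals only."""
    lines = []
    for offset, value in enumerate(rom):
        line = f"DB {value}"
        if with_hex_comment:
            # Assumption: the assembler treats ';' as a comment start.
            line += f" ; {offset:04X}: {value:02X}H"
        lines.append(line)
    return lines


if __name__ == "__main__":
    for line in emit_db_lines(bytes([0x2E, 0x00, 0xFF]), with_hex_comment=True):
        print(line)
```

The hex comment is there only to make the output easier to match against hexdump -C, given that the literals themselves have to be decimal.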

Given that I’m at 33C3, and there was a talk about radare2 already (although I have not seen it yet, I’ll watch it at home), I decided to try using that, as it also already supports 8051, at least in theory. I say in theory because:

% radare2 -a 8051 ec212.bin
[0x00000000]> pd
*** invalid %N$ use detected ***
zsh: abort      radare2 -a 8051 ec212.bin

This is a known problem which is still unfixed, and that has been de-prioritized already, so if I want it fixed, I’ll have to fix it myself.

At this point, I have not much to work with. I started a very skeleton version of a disassembler, so I can start building the parsing I need. I have not done the paperwork yet to release it but I hope to do so soon, and develop it in the open as usual. I will also have to do some paperwork to submit a few fixes for MCU 8051 IDE, to support at least the basics of the ITE controller I have, guessed from the firmware itself, rather than with the datasheet, as I have no access to that as of yet.

If anybody knows anything I don’t and can point me to useful documentation, I’d really be happy to hear it.

Motherboard review: ASUS vs MSI

You may remember last year I bought a gamestation to play games at home (and that means running Windows on it). Last month, I had to do a relatively big change: replace the motherboard altogether. And since I have now managed to compare two motherboards of about the same generation, I thought I could give a bit of a comparative review of the two.

My original motherboard was an ASUS X99-S (which right now has an absolutely crazy price!) which I coupled with an Intel 5930K (which is not sold anymore). The motherboard on paper is great, SATA3, m.2 and so on, and it may actually be good if it’s not a broken one, but mine clearly was.

The first glitch I noticed, but did not pay enough attention to, was related to the USB 3 ports. While all the ports worked fine, I never managed to install the ASMedia drivers, even though the ASMedia controller was meant to be backing some of the ports, and SysRescCD was actually seeing them fine. This bothered me for a while when I had performance issues on one of my devices, but otherwise it seemed ok.

With the second problem, it was tricky to pin down whether it had always been there or whether an update caused it. When I bought the Gamestation, memory was expensive, so I only got 32GB of it. A few months later, I had some spare pocket money (well, I got some bonuses that I wanted to exchange for some gratification) and bought 32GB more. Stupidly, I don’t remember whether I checked that it worked fine; I just trusted it. A few months later, while trying to do some big processing in Lightroom, I came to notice that Windows only saw half of the RAM. I thought it was a bad bank or something like that, but with any combination of shuffling the RAM around, Windows would only see 32GB of it, even though CPU-Z would see all eight banks populated.

At that point, Nikolaj suggested it could be an ME problem, so I went on and re-flashed the BIOS from scratch with an SPI flash adapter, but that didn’t help. Re-seating the CPU also didn’t help. I was appalled, but it was not enough to replace the board just yet, so I put the extra RAM to the side and soldiered on. I was wrong.

Last November, literally the day after my birthday, I came back home from a trip and wanted to download some dozens of GB of pictures I took… and my computer wouldn’t boot. The bootcode showed the system blocked in a CSM (Compatibility Support Module) failure. Trying all the permutations of things to change helped nothing, so it was either the motherboard or the CPU. I took a bet on the motherboard given the previous history, and ordered an MSI X99 SLI Plus while I was in the US, where it was significantly cheaper than in Europe.

My hunch was right and indeed, the new motherboard solved the problem. The specs of the two are about the same; there is actually the same ASMedia USB controller, though this time the drivers install correctly. All the RAM is actually seen by the system now, and of course the computer boots. But this is just the very superficial look at it. There is something else.

Both ASUS and MSI provide software utilities for overclocking, as is expected for motherboards designed for the Haswell-E family of processors. But the approach the two take is significantly different. ASUS appears to encode most of the logic in the software itself, with their “DIP5” core, while MSI appears to keep it in the firmware (that also appears to make the boot process a bit slower).

ASUS’s utility pack is called “AISuite”, and the major version is tied to the board’s generation: version 3 for the X99 motherboards. While there has been at least one update since the time I bought the board, the last version released for the suite was on 2015-07-28. In addition to the overclocking UI, the suite includes a handful of other board-specific tools: one to set the bulk transfer mode sizes (to provide higher performance on USB3 non-UAS devices; not needed on Linux, as the kernel does the right thing by default), one to allow faster charging of iPhone devices, and so on and so forth. Some of this is actually quite useful: the faster USB transfer, for instance, although it also has the side effect of stopping the WD SmartWare tools from recognizing the drive, and so breaks your backups if you decided to use WD’s own tool rather than Microsoft’s.

On the other hand, an update for the DIP5 core was released on 2016-06-29, to support the new CPUs (their 2011-3 socket is full-pin, which allowed them to support a further generation of CPUs with only firmware updates). This is effectively an update for the various drivers needed for the underlying overclocking system, as well as a complete overhaul of the Suite UI, which is likely due to actually applying a newer-generation Suite to the motherboard.

Unfortunately, the new Suite UI does not come with a new set of add-ons for charger, USB, etc. This would be okay, except the add-ons ABI changed: the moment you open the Suite app, you have to press Enter so many times, as it tries to fetch icon files that do not exist. Copying the old PNG files into the new path makes it stop throwing up these errors, but the UI clearly shows the wrong icons.

Oh and by the way, starting AISuite with a different motherboard causes Windows 10 to blue-screen. I know because after booting my gamestation with the new motherboard I was welcomed by the blue screen of death and I had a sinking feeling of dismay, expecting the CPU to be broken instead (turns out no, it was all AISuite’s fault).

What about MSI’s app then? Well, their approach appears to be significantly different: first of all the overclocking app only has the overclocking function — they rely on ASMedia’s own tooling and drivers for the USB bulk transfer reconfiguration, and provide an optional tool for the charging options. In the spirit of not reimplementing stuff, they also don’t require any new Windows driver for this, requiring you to install the Intel ME drivers instead… which was fun because the copy I had installed from before the motherboard replacement was newer than the one MSI provides on their website.

And this makes the MSI utility more interesting: its last update was on 2016-12-06. Since they use the exact same package for all their boards, and it includes no board-specific features and no drivers, updating it is significantly simpler for them.

The end result is that I’m fairly happy. MSI does not have the tons of crapware that ASUS appears to provide for their boards. They do come with a “Live Update” tool, which I wouldn’t trust, even though I have not tested it. Too many of those apps have forgotten to implement HTTPS, certificate validation or pinning, making them extremely risky to run, which is unfortunate.

An aside: when you replace the motherboard of your computer, most systems that use computer authorization will consider it a new computer. That includes Microsoft’s own Windows 10 license handling, as the Windows 10 license is tied to an EFI variable, from what I remember.

Of all those systems, Microsoft’s was the easiest to deal with, though. The system booted as unactivated, and they do try to point you towards buying a new license, burying the right interface behind “Troubleshooting”, but once you say “I changed hardware recently”, it allows you to just replace the previous computer authorization with the current one.

Both Google Play Music and iTunes require authorizing an additional computer, and that is a problem if you are close to the limit (because then you may have to unauthorize them all and then re-authorize them). Stupid DRMs.

Virtually rewiring laptop keyboards

You may remember I had problems with my laptop a few months ago, because it refused to boot until I unplugged the CMOS battery. This by the way happened again, to the point that I need to remember to buy a new CMOS battery next time I’m in the States (the European prices are insane, and I’ll be back reasonably soon). This is the start of a story about the same laptop, but it has nothing to do with the CMOS in this case.

I have recently replaced my work laptop, swapping the MacBook Pro I was using for an HP Chromebook. If you’re curious about my reasons, they boil down to traveling too much, and the MBP being too heavy. I briefly considered an Air, but given the direction they are going in, the Chromebook works better for my work needs.

If you didn’t know, Chromebooks don’t come (by default) with a Caps Lock key. Maybe it’s a public service, making it more difficult to shout on the Internet, maybe it’s because whoever designed the keyboards was nostalgic for the control key in place of the caps lock, I’m not sure. Instead of moving the control key, they introduced a new search button, which triggers the search box as well as functioning as a “Fn” modifier, to access features such as page up/down, home and end. I liked the approach and it’s actually fairly handy. Unfortunately it means that now I have a third way (in addition to the Asus and the Dell keyboards) to access these functionalities, which makes my muscle memory suffer badly. It also meant I kept typing all-caps on my Asus laptop when I tried using the modifier (and failed) and that was pissing me off.

On Apple USB and Bluetooth keyboards there is a Fn button, but it’s handled entirely in software. Indeed if you have one such keyboard, particularly the 60% version (those without numpad and separate isles for movement keys), and you want to use it on Linux you need to enable a kernel module to implement the correct emulation. I know that because it bit me when they first introduced it, as I was using a full-size Apple keyboard instead, and the numlock emulation was making me unable to type.

This is give or take the way it works on the Chromebook, mostly out of necessity of sharing the Fn modifier with the Search button. And it allows you to change which key is Search/Fn in software, which is handy. Why can’t I do that with my Asus laptop? Well, I can disable the Caps Lock at least, and replace it with Control like so many people do already, after all I use Emacs and they tell me it’s much better to use Emacs that way (I don’t know about it, I tried it briefly, but my muscle memory works better with the pinky-control). But that’s not exactly what I want.

I could try remapping Ctrl+arrows to behave the same way as Fn+arrows but that’s not quite what I want either because then I lose the skip-ahead/forwards that I want from Ctrl+arrows. So I need to come up with alternatives. Much as I wish this was going to be a step-by-step procedure to fix this, it’s not, and it’s instead a musing of what may or may not work.

The first option would be to implement the Fn in software, at the kernel, X11 or libinput level. This could actually be interesting as a way to make the Fn behaviour of Apple keyboards generic enough. I don’t really know where to start with that one, because between systemd, libinput and Wayland the flow of the input layers changed so much that I’m completely lost.
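Wherever it ends up living, the core of a software Fn is a small state machine over key events. The following is a rough sketch of just that logic, detached from any real input stack: the `FnLayer` class and its `handle` method are invented for this example, the key names are borrowed from the Linux input event code names, and the mapping (arrows to page up/down, home and end) is the one I want from the Caps Lock key.

```python
FN_KEY = "KEY_CAPSLOCK"

# What each key turns into while the Fn modifier is held.
FN_LAYER = {
    "KEY_UP": "KEY_PAGEUP",
    "KEY_DOWN": "KEY_PAGEDOWN",
    "KEY_LEFT": "KEY_HOME",
    "KEY_RIGHT": "KEY_END",
}


class FnLayer:
    def __init__(self):
        self.fn_held = False
        # Physical key -> key that was reported when it went down.
        self.active = {}

    def handle(self, key, pressed):
        """Translate one event; return (key, pressed) to forward, or None."""
        if key == FN_KEY:
            self.fn_held = pressed
            return None  # the modifier itself is never forwarded
        if pressed:
            out = FN_LAYER.get(key, key) if self.fn_held else key
            self.active[key] = out
            return (out, True)
        # On release, report the same key that was reported on press,
        # even if Fn was let go in the meantime.
        return (self.active.pop(key, key), False)


if __name__ == "__main__":
    layer = FnLayer()
    events = [("KEY_CAPSLOCK", True), ("KEY_UP", True),
              ("KEY_CAPSLOCK", False), ("KEY_UP", False)]
    for key, pressed in events:
        print(key, pressed, "->", layer.handle(key, pressed))
```

Remembering which key was reported on press matters: without it, releasing Fn before the arrow key would leave the translated key stuck down.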

The other option is more daring and possibly more interesting: rewiring the laptop keyboard by changing what the keys actually send over the PS/2 bus. As Hector suggested over Twitter, the keyboard is handled as part of the Embedded Controller (EC) firmware, and modifying a laptop’s EC is not unheard of, although a quick search doesn’t turn up anyone doing so on an Asus laptop to change the keyboard scancodes.

Does it mean I can do it? Does it mean I will? I’m not sure yet. Part of the problem is that playing around with an EC is the kind of thing that can easily brick your laptop, and this is currently my only Linux environment in which I do actual work. I could try to re-target my HTPC to be a workstation, and then hack on this laptop like it’s disposable, but the truth is that I spend enough time in the air that I really want to have a laptop, at least as a secondary system.

The first problem is figuring out how to run the update. The first step would be figuring out where the EC firmware is. In Matthew’s posts, he found a promising area of the update file within the image, based off a size and the (known) EC firmware version. In my case I don’t have that luck, since the only version I can see from the Linux host is the BIOS revision, which is 219. On the other hand, if I look at the Asus download page, versions 212 and 216 explicitly mention an EC firmware update, so if I have to guess which area of the firmware image is the EC firmware itself, it would at least be easy to verify whether my guess is right.

But it might be easier. UEFITool supports reading these update files, as they are AMI Aptio capsules, and it should then be possible to extract a listing of object trees and checksums that tells you what actually changed between two versions. Unfortunately that would only tell you what and not how, but it’s a starting point. Also unfortunately, the documentation of the tool itself already points out that many AMI features are not implemented because of the author’s NDA. Of course, the moment you look for the Aptio capsule format you find a post by Nikolaj about the AFU utility.
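As a rough idea of what comparing two versions could look like, here is a sketch that assumes the contents of each capsule have already been unpacked into a directory tree by some other tool (that extraction step is not shown, and how it is done is left open). The function names `tree_checksums` and `changed_entries` are invented for the example.

```python
import hashlib
from pathlib import Path


def tree_checksums(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    sums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            sums[path.relative_to(root).as_posix()] = digest
    return sums


def changed_entries(old_root, new_root):
    """List entries added, removed or modified between two extracted trees."""
    old = tree_checksums(old_root)
    new = tree_checksums(new_root)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Calling `changed_entries` on the trees extracted from two update files would then give the list of entries worth looking at, which is the "what" and still not the "how".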

This may be a throw-in post just to give a random idea, or it may be followed up with more details, and maybe some code to get the list of changed files in the capsule, but I have not started on this yet and I’m not sure I will. The tools are out there, and it would be an interesting game to play; the problem, nowadays, is mostly the time.

Of the two options, implementing the second Fn key (without changing the one that is there) is obviously the one that has the most potential to be useful: if it can be made generic enough, it can be used on any keyboard, laptop or not, and might allow simplifying the Fn key handling in the Apple keyboards, by moving it away from an Apple-specific driver. So if someone has ideas of where this should fit nowadays, I’m happy to hear about those.

Saving a non-booting Asus UX31A laptop

I have just come back from a long(ish) trip through UK and US, and decided it’s time for me to go back to some simple OSS tasks, while I finally convince myself to talk about the doubts I’m having lately.

To start on that, I tried to turn on my laptop, the Asus UX31A I got three and a half years ago. It didn’t turn on. This happened before, so I just left it to charge and tried again. No luck.

Googling around I found a number of people with all kinds of problems with it, and one of them is something getting stuck at the firmware level. Given how I had found a random problem with PCIE settings in my previous laptop, which would make it reboot every time I turned it off, but only if the power was still plugged in, I was not completely surprised. Unfortunately following the advice I read (take off the battery and power over AC) didn’t help.

I knew it was not the (otherwise common) problem with the power plug, because when I plugged the cable in, the Yubikey Neo-n would turn on, which means power arrived to the board fine.

Then I remembered two things: one piece of advice was about the keyboard, and the keyboard itself has had problems before (the control key sometimes would stop working for half an hour at a time). Indeed, once I re-seated the keyboard’s ribbon cable, it turned on again, yay!

But here’s the other problem: the laptop would turn on, light up the caps-lock LED, and stay there. Even letting the main battery run out would not be enough to return it to working condition. What to do? Well, I had a hunch, and it turned out to be right.

One of the things that I tried before was to remove the CMOS battery — either I kept it out not long enough to properly clear, or something else went wrong, but it turned out that removing the CMOS battery allowed the system to start up — but that would mean no RTC, which is not great, if you start the laptop without an Internet connection.

The way I solved it was as follows:

  • disconnect the CMOS battery;
  • start up the laptop;
  • enter “BIOS” (EFI) setup;
  • make any needed change (such as time);
  • “Save and exit”;
  • let the laptop boot up;
  • connect the CMOS battery.

Yes, this does involve running the laptop without the lower plate for a while, so be careful about it, but on the other hand, it did save my laptop from being stomped on the ground out of sheer rage.

Hardware review: Asus WL-330NUL

Some people probably still remember that I used to have an absolute fear of flying and planes altogether. To the point that I have avoided going to the on-site interview of the company I’m now (years later) working for, because it would have taken place in California and I got scared. While I still do not like to travel, I’ve been traveling quite a bit in the past few years, not only back and forth between Venice and Los Angeles, but also within Europe and within other cities in the USA both last year and this.

In particular, TripIt is telling me I’m going to be away from home at least 41 days this year (and this is without including trips that are not scheduled yet, such as a visit back in Italy, and another trip to the United States in November). And most of them are not for personal reasons (although some are, luckily). With all of this going on, I’ve started looking at any reasonably cheap option for me to reduce the pains of traveling.

One of these options came to me through a few colleagues, who presented me the Asus WL-330NUL — a tiny wireless router, the almost exact size of the Ethernet adapter that was bundled with my laptop, that provides you with your own, personal WiFi network, routed to another, less-private network, either wireless or wired. An absolute must if you spend a considerable amount of time in hotels.

First of all, the device itself is tiny, as I said it’s almost the exact size of my Ethernet adapter and it can replace it 100%. Indeed, the device has four interfaces (although not the proper term): USB (gadget), Ethernet and two wireless radios; the USB connection is used both for host connectivity and for power: if you connect the router to your computer via USB, it’ll present itself as a cdc_ether device, which Linux supports full well as if it was a standard Ethernet port — if possible, it’s better supported than some of the USB Ethernet adapters out there in the wild.

Once your computer sees the connection via Ethernet, the device itself can be configured to either use a wired or wireless upstream connection — if you choose to use a wired network, which is what I do, as I’ll explain in a moment, then this by itself is going to be already a replacement of the ethernet adapter; indeed at first the device will configure itself to be a simple bridge between USB and Ethernet, although that’s not what I use it for.

Once you configured the wired or wireless upstream connection, you can focus on setting up your own private WiFi network: the second radio can broadcast your own SSID and handle your own 802.11n network, protected with WPA for instance. Since you have a stable SSID/key combination, once you turn the device on, all your gadgets will connect to that network, without requiring manual, device-by-device, configuration.

Even better, since you’re now behind a router, as far as the hotel or other provider is concerned, you have a single device: you consume a single IP and a single connection. For networks where you have to login separately for each device every 24 hours (or even every reconnection), this also means you only have to do it from one device, where it’s handy, and everything else will follow.

As I said above, my suggested approach is to always use the wired network if the hotel makes it available (most of the non-economy hotels do). The reason why I’m saying this is that it’s easy to misread the security implications of a device like this. While it is true that it can create your own private WiFi to then route to the hotel wireless, when you do so you add nothing to security, even if your WiFi is WPA2. The reason is simple: the public wireless network from the hotel is still completely unencrypted, so anybody eavesdropping can see what you’re doing, unless you’re using encrypted websites and even then part of your traffic can be inspected, such as which websites you’re consulting. If, on the other hand, you use the wired network, while not totally secure (the hotel and the provider can still see the non-encrypted connections), you’re still stopping a good bunch of people from gathering your data.

Finally, there is one more feature that is important if you travel a lot among hotels of respectable size: all of them use multiple access points for their WiFi networks, even though they broadcast the same SSID (and sometimes they don’t); these access points do not allow you to roam data across them, so if you have two devices, say a Nexus 7 and a Chromecast that you bring with you, they may not be able to talk to each other without a device like this, as they may end up on different APs, and unable to “see” each other on the network, or at least not consistently enough to stream from one to the other. Since with this device you can just connect all the gadgets to the same network and access point, your problem is then solved.

I’ve been using the device for ten days now in two hotels and two airports, and it’s definitely handy. I can’t complain about the range either: I’m now in Pittsburgh’s Bakery Square at the SpringHill Suites and my phone connected fine to it across the square in the Coffee Tree Roaster shop. Oh yeah and my room faces away from the square too.

Also, the power supply (by Asus!) that I bought last year (the original US one that I got with it just died on me, so I bought a different one) comes with a USB charging port by itself, which means I can just use WiFi from my laptop even with a single power socket, freeing up the USB port (I only have two and one I use for my smartcard reader). I guess I could probably run this off my Anker battery but I have not tried that yet, as I somehow doubt that the airlines would be okay with me broadcasting my own WiFi on their planes. In any case, this is now part of my essential tools.

Why can’t I get easy hardware

When I bought my Latitude I complained that it seemed to me more and more like a mistake, until the kernel started shipping with the correct (and fixed) drivers, so the things that originally didn’t work right (the SD card reader, the shutdown process, the touchpad, …) started working quite nicely. As of September 2011 (one year and a quarter after I bought it), between Linux and firmware updates from Dell and Broadcom, the laptop worked almost completely; the only part still missing is the fingerprint reader, which I really don’t care that much about.

Recently, you probably have seen my UEFI post where I complained that I couldn’t install Sabayon on the new Zenbook (which is where I’m writing from, right now, on Gentoo). Well, that wasn’t the only problem I got with this laptop, and I should really start reporting issues to the kernel itself, but in the mean time let me write down some notes here.

First off, the keyboard backlight is nice and all, but I don’t need it (I learnt to touch-type when I was eight), so it would just be a waste of battery. While the keys are reported correctly, and upower supports setting the backlight, at least the stable version of KDE doesn’t seem to support the backlight setting. I should ask my KDE friends if they can point me in the right direction. Another interesting point is that while the backlight is turned on at boot, it’s off after suspension, which is probably a bug in the kernel, but it’s working fine for me.

Speaking about things not turning back on after suspension, the WLAN LED on the keyboard is not turning on, at resume. And related to that, the rfkill key doesn’t seem to work that well either. It’s not a big deal but it’s a bit bothersome, especially since I would like to turn off the bluetooth adapter only (and since that’s supposedly hardware-controlled, it should get me some more battery life).

The monitor’s backlight is even more troublesome. The first problem is deciding who should be handling it: it’s either the ACPI video driver (by default), the ASUS WMI driver, or the Intel driver. Of the three, the only one that makes it work is the Intel driver, and I’m not even sure if that’s actually controlling the backlight or just the tint on the screen, even though, when set to zero, it turns the screen OFF, rather than just displaying black. It does make it bearable though.
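To see which drivers have actually registered a backlight interface, it is enough to walk the sysfs backlight class. The snippet below is a small sketch of that, assuming the usual layout where each entry under /sys/class/backlight has a `brightness` and a `max_brightness` file; the function name `list_backlights` is made up, and the root directory is a parameter only so the function can be pointed somewhere else for testing.

```python
from pathlib import Path


def list_backlights(sysfs_root="/sys/class/backlight"):
    """Return {interface name: (current, maximum)} for each backlight found."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result
    for entry in sorted(root.iterdir()):
        try:
            current = int((entry / "brightness").read_text().strip())
            maximum = int((entry / "max_brightness").read_text().strip())
        except (OSError, ValueError):
            # Skip entries that are unreadable or not shaped as expected.
            continue
        result[entry.name] = (current, maximum)
    return result


if __name__ == "__main__":
    for name, (current, maximum) in list_backlights().items():
        print(f"{name}: {current}/{maximum}")
```

With more than one interface listed, it is at least clear which names are competing for the same panel.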

The brightness keys on the keyboard don’t work, by the way, nor does the one that should turn on and off the light sensor — the latter, isn’t even recognized as a key by the asus-wmi driver, and I can’t be sure of the correct device ID that I should use to turn on/off said light sensor. After I hacked the driver to not expose either the ACPI or the WMI brightness interfaces, I’m able to set the brightness from KDE at least — but it does not seem to stick, if I take it down, and after some time it starts and gets back to the maximum (when the power is connected, at least).

And finally, there is the matter of the SD card reader. Yesterday I went to use it, and I found out that… it didn’t work. Even though it’s a USB device, it’s not mass-storage: it’s a Realtek USB MMC device, which does not use the standard USB interface for MMC readers at all! After some googling around, I found that Realtek actually released a driver for that, and after some more digging I found out that said driver is currently (3.7) in the staging drivers’ tree as a virtual SCSI driver (with its own MMC stack), together with a PCI-E peer, which has already been rewritten for the next release (3.8) as three split drivers (an MFD base, an MMC driver, and a MemoryStick driver). I tried looking into porting the USB one as well, but it seems to be a lot of work, and Realtek (or rather, Realsil) seems to be already working on porting it to the real kernel, so it might be worth waiting.

To be fair, what put me off the idea of working on the SD card driver is that to get an idea of what’s going on I have to run 3.8, and at RC1 it panics as soon as I reconnect the power cable. So even though I would like to find enough time to work on some kernel code, this is unlikely to happen now. I guess I’ll spend the next three days working on Gentoo bugs, then I have a customer to take care of, so this is just going to be dropped off my list quite quickly.

UEFI booting

Last Friday (Black Friday, since I was in the US this year), I ended up buying myself an early birthday present: I finally got the ZenBook UX31A that I had been looking at since September, after seeing the older model being used by J-B of VLC fame. Today it arrived, and I decided to go the easy route: I had already prepared a DVD with Sabayon, and after updating the “BIOS” from Windows (since you never know), I wiped it out and installed the new OS on it. Which couldn’t be booted.

Now before you run around screaming “conspiracy”, I ask you to watch Jo’s video (Jo did you really not have anything with a better capture? My old Nikon P50 had a better iris!) and notice that Secure Boot over there works just fine. Other than that, this “ultrabook” is not using SecureBoot because it’s not certified for Windows 8 anyway.

The problem is not that it requires Secure Boot or anything like that; much more simply, it has no legacy boot, which is what I’m using on the other laptop (the Latitude E6510), since my first attempt at using EFI for booting failed badly. Anyway, this simply meant that I had to figure out how to get this to boot.

What I knew from the previous attempt is this:

  • grub2 supports UEFI in both 32- and 64-bit mode, which is good, as both my systems run 64-bit EFI anyway;
  • grub2 requires efibootmgr to set up the boot environment;
  • efibootmgr needs access to the EFI variables, so it requires a kernel with support for EFI variables;
  • but there is no way to access those variables when using legacy boot.

This chicken-and-egg problem is what blew it for me last time; I did try before the kernel added EFI Stub support, anyway. So what did I do this time? Well, since Sabayon did not work out of the box, I decided to scratch it and went with good old-fashioned Gentoo. And as usual, to install it I started from SysRescueCD, which to this day, as far as I can tell, still does not support booting as EFI either. It’s a good thing then that Asus actually supports legacy boot… from USB drives as well as CDs.
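Whether a running Linux system was started in EFI mode (and so whether efibootmgr stands a chance) can be told from the presence of the /sys/firmware/efi directory. A small sketch of that check; the function name and the `sysfs_root` parameter are only for illustration.

```python
import os

def booted_in_efi(sysfs_root="/sys"):
    """True if the kernel exposes EFI runtime data, i.e. it was booted via EFI."""
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))
```

From a legacy-booted SysRescueCD this returns False, which is exactly the situation described above.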

So I boot from SysRescueCD and partition the SSD in three parts: a 200MB vfat EFI partition; a root-and-everything partition; and a /home partition. Note that I don’t split either /boot or /usr, so I’m usually quite easy to please in the boot process. The EFI partition I mount as /mnt/gentoo/boot/efi, and inside it I create an EFI directory (it’s actually case-insensitive but I prefer keeping it uppercase anyway).

Now it’s time to configure and build the kernel — make sure to enable the EFI Stub support. Pre-configure the boot parameters in the kernel, make sure to not use any module for stuff you need during boot. This way you don’t have to care about an initrd at all. Build and install the kernel. Then copy /boot/vmlinuz-* as /boot/efi/EFI/kernel.efi — make sure to give it a .efi suffix otherwise it won’t work — the name you use now is not really important as you’ll only be using it once.
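The copy step can be sketched as follows, assuming the kernel was installed as /boot/vmlinuz-* and the EFI partition is mounted at /boot/efi as described above. The function name, its parameters and the choice of picking the most recently modified image are my own assumptions.

```python
import glob
import os
import shutil

def install_stub_kernel(boot_dir="/boot", esp_dir="/boot/efi", name="kernel.efi"):
    """Copy the newest vmlinuz-* image into <esp_dir>/EFI/<name>."""
    candidates = glob.glob(os.path.join(boot_dir, "vmlinuz-*"))
    if not candidates:
        raise FileNotFoundError("no vmlinuz-* image found in " + boot_dir)
    # Assumption: the most recently modified image is the one just built.
    newest = max(candidates, key=os.path.getmtime)
    target_dir = os.path.join(esp_dir, "EFI")
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, name)
    # vfat has no Unix permissions, so copy the contents only.
    shutil.copyfile(newest, target)
    return target
```

The only part that really matters is the .efi suffix on the destination name.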

Now you need an EFI shell. The ZenBook requires the shell to be available somewhere, but I know that at least the device we use at work has some basic support for an internal shell in its firmware. The other Gentoo wiki has a link to where to download the file; just put it in the root of the SysRescueCD USB stick. Then you can select to boot it from the “BIOS” configuration screen, which is what you’ll get after a reboot.

At this point, you just need to execute the kernel stub: FS1:\EFI\kernel.efi will be enough for it to start. After the boot completes, you’re in an EFI-capable kernel, booted in EFI mode. The only thing left to do is grub-install --efi-directory=/boot/efi. And… you’re done!

When you reboot, grub2 will start in EFI mode, boot your kernel, and be done with it. Pretty painless, isn’t it?

What the heck is up with hardware drivers download?

Today I’m fixing up yet another streamlined Windows XP CD for a friend of mine (it’s an original Windows as usual).

I have already wondered about some stuff with Windows drivers, but today it seems like stuff became even more hellish.

First, VIA stopped providing drivers on Via Arena and now provides them on their own site, and most importantly the download area does not work with Firefox, so I had to use Internet Explorer to download them. Way to go, VIA!

Second, when I go to the Asus website to download the driver for the motherboard, I’m given a captcha to complete. To download a frigging driver!

What the heck!

Driver hell — when will it stop?

To get some extra pocket money to spend in the everyday maintenance of my systems, I also ended up working on maintenance of Windows computers on a daily basis; it’s not extraordinarily bad, and it usually doesn’t take me more than a day for a single computer even if it’s the first time I see it (once I’ve seen it once, I already know what to expect).

Unfortunately, it’s not always feasible to convert people to Linux yet, although I think I might start soon enough with at least a few people whose only use of a computer is to “browse websites, send email, watch a movie from time to time”. To make the task easier I obviously set up systems with Firefox and Thunderbird, VLC and OpenOffice, so that at least some programs can be found on the “new” systems when they migrate.

Unfortunately, it seems like Windows, especially Windows XP, which a lot of my customers have OEM licenses for, has become a driver hell just like it was in the old days. And vendors don’t seem to make that much easier. Most vendors providing complete systems tend not to care about their users enough to provide downloads for the drivers (they just tell you to use their recovery partition; guess what? that stuff often doesn’t work extremely well, if at all, and in one instance it was even mounted as a drive on the normal OS… which meant it was infected too!), and the component manufacturers have websites that it would be a euphemism to call complex:

  • ATI/AMD website is a mess to navigate; while they do (or did) chipsets too, you cannot really find a “chipset drivers” section; if you have an older version of a motherboard that is supported by legacy drivers you’ve got to navigate at least four pages before you can find out!
  • Asus website is a mess of JavaScript; whenever you ask to download something, even a BIOS update, you have to tell them the operating system you’re looking for; the window is centered on the screen and does not work on cellphones, and of course there was a time when I could have used a cellphone just fine if it wasn’t for that (given that Asus boards can usually update the BIOS from USB sticks); never mind that half the time you’re given the same stuff whatever operating system you select;
  • Intel website is also a labyrinth; to download a driver you’ve got to search for the right class of software, then pick one in particular, at which point it often proposes two options; then you have to agree to the license and click download again… which does not download the thing but rather redirects you to a page that calls a JavaScript function to download the file; that JavaScript sometimes does not work at all, so they provide the usual “if the file does not download, click here”, but rather than being a direct link, that is also a JavaScript function; checking the function, it lists a clear bouncer link (which you could download with wget, too!), but with a little more presence of mind you can notice that the link is provided as a GET parameter to the (dynamic, at this point) page on Intel’s server; much easier to copy that out and drop the rest, I’d say;
  • Realtek’s website sometimes does not work properly; on the other hand they give you direct FTP links, so once you know the FTP server you can find the drivers just fine while avoiding the website; it would have been nicer to split it up by driver type so that the listing wouldn’t take a few minutes, but I have to say it is the system that works best, even if FTP does make me feel like we’re back in the early ’90s;
  • almost all download sites tend to have pretty slow connections, or capped connections; I can understand Asus, Gigabyte and Realtek that have their main server in Taiwan or so it would seem, but what about Intel? Luckily at least ATI and nVidia (that have the biggest driver packs) have very fast servers.
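The trick with Intel’s download page, pulling the real link out of the page’s GET parameters, can be sketched like this. I don’t remember what the parameter is called, so the sketch simply returns every query value that looks like a URL; the function name and the example address in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def embedded_urls(page_url):
    """Return all query-string values of page_url that are themselves URLs."""
    query = parse_qs(urlsplit(page_url).query)  # also undoes percent-encoding
    found = []
    for values in query.values():
        for value in values:
            if value.startswith(("http://", "https://", "ftp://")):
                found.append(value)
    return found
```

Given something like `http://example.com/confirm?lang=eng&url=http%3A%2F%2Fdownloads.example.com%2Fdriver.exe`, it hands back the decoded direct link, which can then be passed to wget.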

Then there are other problems, like trying to understand that “ATI Technologies, Inc. SBx00 Azalia” is actually the name reported by lspci for a Realtek Azalia codec that needs the HDA drivers from Realtek; or trying to guess the driver version, or the driver’s name, from the downloaded files, which often enough don’t have any kind of naming or versioning scheme. Again, ATI (for quite a long time) and nVidia (recently) solved this in a pretty nice way: they use their logo for the install executable; this does not make it very manageable under Linux though, given that nautilus doesn’t show the PE icon yet (maybe I can modify it to load the PE file and extract the icon?).

Let’s just hope that Microsoft’s moves with Vista and Windows 7 will be a springboard for Linux for the masses; I sincerely count more on Microsoft’s changes than on Google OS since, as I’ve noted, Vista already gave us something useful for Linux.

You shall not suspend

This puts the word «end» to my various attempts to get suspend-to-RAM working on Enterprise.

Last night I tried the openSUSE wiki page that Slobodan D. Sredojevic linked in his documentation, and while trying to suspend the system in single user mode, without X running or anything, I finally found what causes the freeze a few seconds after resume: a machine check exception (MCE).

The output from the kernel is already quite clear: the problem is in hardware. mcelog (which is quite useless considering the MCEs don’t get logged after resumes, so if you didn’t copy the output manually you’re screwed) reports it as CPU 0 0 data cache STATUS 0 MCGSTATUS 0, which seems to point to a data cache corruption in the CPU… duh!

I also tried updating the BIOS, just to make sure, but there is no change (and it scared me because it changed the Promise controller setup from standard to RAID, and of course it wasn’t able to find any RAID array defined). Seems like my hardware doesn’t support suspend-to-RAM, for some reason.

Sigh.

Update: I have now also tried checking (and fixing) my ACPI DSDT, but that didn’t help. I’m not sure if I should submit my fixes to ASUS and ask them if they have any clue, but this motherboard (A8V Deluxe) is three years old now, and I’m not sure if they care about it at all anymore.