In the previous post I explained what I want: to be able to use the Caps Lock key as Fn, at least together with the arrow keys, to get Page Up/Down, Home and End (the navigation keys).
After that post, I was provided with a block schematic of my laptop, identifying the EC in the system as an ITE IT8572. This is a bit unfortunate, because ITE is not known for sharing their datasheets easily, but at least I know that the EC is based on the Intel 8051 (also known as MCS-51), with a 64KiB flash ROM.
Speaking of the ROM, it’s possible to extract the EC firmware from the ASUS-provided update files using (unmodified) UEFITool. Within the capsule, the EC firmware is the first padding entry (the non-empty one); extract it with the tool and you have the actual ROM image file. That’s easy.
I was also pointed at Moravia Microsystems’ MCU 8051 IDE, which is a fully-functional IDE for developing for 8051 MCUs. I submitted an ebuild for this while at 33C3, so that you can just emerge mcu8051ide to have a copy installed. It supports some optional runtime dependencies that I have not actually made optional yet. This IDE supports conversion of binary files to Intel HEX (why on Earth Intel HEX is still considered a good idea I’m not sure) and disassembly of the binaries, and comes with its own (Tcl/Tk) assembler.
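For reference, the Intel HEX container is simple enough to sketch in a few lines of Python. This is only an illustration of the record format, not what MCU 8051 IDE does internally; the function name to_intel_hex and the 16-byte record size are assumptions picked for the example.

```python
def to_intel_hex(data: bytes, record_size: int = 16) -> str:
    """Encode a ROM image of at most 64 KiB as Intel HEX data records."""
    if len(data) > 0x10000:
        raise ValueError("plain 16-bit addressing covers at most 64 KiB")
    lines = []
    for offset in range(0, len(data), record_size):
        chunk = data[offset:offset + record_size]
        # Record body: byte count, 16-bit address, record type 00 (data), payload.
        body = bytes([len(chunk), offset >> 8, offset & 0xFF, 0x00]) + chunk
        # Checksum is the two's complement of the sum of all body bytes.
        checksum = (-sum(body)) & 0xFF
        lines.append(":" + (body + bytes([checksum])).hex().upper())
    # End-of-file record (type 01, no payload).
    lines.append(":00000001FF")
    return "\n".join(lines) + "\n"
```

Since a 64KiB image fits in 16-bit addresses, no extended address records are needed for an EC ROM of this size.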
Unfortunately, this has not brought me as close as you might expect, given that I have the firmware, a disassembler and an assembler. The reason is not quite obvious either.
The first problem is that the IDE is unable to actually re-assemble the code it produces. Since disassembly (unlike decompilation) should be a lossless procedure, that was the first thing I tried, and it failed. There appear to be at least two big problems: the first is that the IDE does not have a configuration for a 64KiB ROM 8051 (even though that is the theoretical maximum size of the ROM for that device), and the other is that, since it does not have a way to define which parts of the ROM are data and which ones are code, it disassembles the data in the ROM as instructions that are not actually valid for the base 8051 instruction set.
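One way a disassembler could keep track of which parts of the ROM are data is an explicit region map that is consulted before decoding each offset. The sketch below is hypothetical: the RegionMap class name is made up, and the offsets in the usage note do not come from the actual firmware.

```python
from bisect import bisect_right


class RegionMap:
    """Track which half-open [start, end) ranges of the ROM hold data."""

    def __init__(self, data_regions):
        # Assumes the ranges do not overlap; sort them by start offset.
        self._regions = sorted(data_regions)
        self._starts = [start for start, _ in self._regions]

    def is_data(self, offset: int) -> bool:
        # Find the last region starting at or before the offset.
        index = bisect_right(self._starts, offset) - 1
        return index >= 0 and offset < self._regions[index][1]
```

With RegionMap([(0x100, 0x180)]), is_data(0x17F) is true and is_data(0x180) is false, so a disassembler could emit literal bytes for the former and decode instructions from the latter.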
So, I decided to look into other options; unfortunately I found only a DJGPP-era disassembler (which produces what looks like a valid assembly file, but one that can’t be re-assembled) and an apparently promising Python-based one that failed to even execute due to a Python syntax error.
I have thus started working on writing my own, because why not, it’s fun, and it wouldn’t be the first time I go parsing instructions manually — though the last time, I was in high school and I wrote a very dumb 8086 emulator to try my homework out without having to wait in the queue at the lab for the horrible Rube Goldberg Machine we were using. This was some 15 years ago by now.
But back to the present: to be able to write a proper disassembler that does not suffer from the problems I noted above, I need a test that checks that re-assembling the disassembled code produces the same binary ROM as the source. Luckily, there is an obvious way to do so incrementally: you just emit every single byte of the ROM as a literal byte value. It’s not too difficult.
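The comparison half of such a round-trip test can be sketched as follows. This is a minimal example; the helper name first_mismatch is made up, and how the rebuilt image is produced (which assembler, which flags) is deliberately left out.

```python
def first_mismatch(original: bytes, rebuilt: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (expected, actual) in enumerate(zip(original, rebuilt)):
        if expected != actual:
            return offset
    # All shared bytes match; a length difference is still a mismatch.
    if len(original) != len(rebuilt):
        return min(len(original), len(rebuilt))
    return None
```

Reporting the offset rather than a plain yes/no makes it easy to jump to the same position in a hexdump of either file.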
Except, which syntax do you use for that? The disassembler didn’t use any literal bytes (it instead emitted extended instructions for bytes that would not otherwise be mapped in the base ISA), so I spent some time googling for 8051 syntax, and I found a few decent pointers but nothing quite right. From what I can tell, the SDCC assembler should accept the same syntax as Alan Baldwin’s assembler suite except for some of the more sophisticated instructions, as SDCC forked an earlier version of the same software. Even just opening the website should make it clear we’re talking serious vintage code here!
This syntax is also significantly different from the syntax used by MCU 8051 IDE, though. Admittedly, I was hoping to use the SDCC assembler for this (Baldwin’s is not quite obvious to build at first, as it effectively only provides .bat files for that) since that can be more easily scripted. The IDE is a full Tcl/Tk environment, and its assembler is very slow from what I can tell. Unfortunately, I have yet to find a way for the SDCC-provided assembler to produce any binary file. It’s all hidden behind flags and multi-level object files, sigh!
So I decided to at least make a file that assembles with the IDE. According to this page, the syntax should be quite simple:
LABEL: DB 2EH
The DB pseudo-instruction defines a literal byte or bytes. And that sounds exactly like what I need! So I just made my skeleton disassembler emit every byte with this syntax, and… it fails to assemble. It looks like the IDE assembler only supports DB with decimal numbers, which makes them harder to read and match to the hexdump -C output I’ve been using to compare the binaries. Fixing that still did not make things build right, but I have yet to look deeper into it.
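A byte-per-line emitter along these lines could look like the sketch below. It writes the value in decimal for DB and repeats the offset and the hex value in a trailing comment, so each line can still be matched against hexdump -C. The function name, the four-space indentation and the use of ; for comments are assumptions, and the output has not been verified to assemble with the IDE.

```python
def emit_db_lines(rom: bytes):
    """Yield one DB line per ROM byte: decimal operand, hex in the comment."""
    for offset, value in enumerate(rom):
        # Decimal for the assembler, offset and hex for the human reader.
        yield f"    DB {value:d} ; {offset:04X}: {value:02X}"
```

For example, "\n".join(emit_db_lines(rom)) gives the full listing as a single string, ready to be written to a file.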
Given that I’m at 33C3, and there was a talk about radare2 already (although I have not seen it yet, I’ll watch it at home), I decided to try using that, as it also already supports 8051, at least in theory. I say in theory because:
% radare2 -a 8051 ec212.bin
[0x00000000]> pd
*** invalid %N$ use detected ***
zsh: abort radare2 -a 8051 ec212.bin
This is a known problem which is still unfixed, and that has been de-prioritized already, so if I want it fixed, I’ll have to fix it myself.
At this point, I do not have much to work with. I started a very bare-bones version of a disassembler, so I can start building the parsing I need. I have not done the paperwork yet to release it, but I hope to do so soon and develop it in the open as usual. I will also have to do some paperwork to submit a few fixes for MCU 8051 IDE, to support at least the basics of the ITE controller I have, guessed from the firmware itself rather than from the datasheet, as I have no access to that as of yet.
If anybody knows anything I don’t and can point me to useful documentation, I’d really be happy to hear it.