Birch Books: 8051 and Yak Shaving

I have previously discussed my choice of splitting the actuator board, pointing out I’ll probably try designing an alternative controller board using something like the Adafruit Feather M4 and writing the firmware with CircuitPython. Part of the reason for that is that it’s just easier, but part of it is because 8051 is an annoying platform to work with.

There are a few different compilers for this platform, but as far as I know, the only open-source and maintained one is SDCC, the Small Device C Compiler. I hadn’t used it in forever, but I was very happy to see a new release this year, including work-in-progress C2X support and (mostly complete) C11 support, so I was in high spirits when I started working on this.

A Worrying Demonstration

I started from a demo that was supposed to be written explicitly for the STC89. The first thing I noted was that the code does not actually match the documentation on the same page: it references a _sdcc_external_startup() function that is not actually defined. On the other hand, it does not seem to be required. There are other issues with the code, and for something that is designed to work with the STC89, it seems overly complicated. Let me try to dissect the problems.

First of all, the source code is manually declaring the “Special Function Registers” (SFR) for the device. In this case I don’t really understand the point, since all of the declared registers are part of the base 8051 architecture, and would already be declared by any of the model-specific header files that SDCC provides. While the STC89 does have a number of registers that are not found otherwise, none of those are used here. In my code I ended up including at89x52.h, which is meant for the Atmel (now Microchip) AT89 series, and is the closest header I found for the STC89. I have since filed a patch with a header written based on other headers and the datasheet.

Side note: the datasheet is impressively detailed. It includes everything you may want to know, including the full ISA description, and a number of example cases.

Once you have the proper header definitions, you can also avoid a lot of binary flag logic: the most important registers on the 8051 chips are bit-addressable, so you don’t need to remember how many bits to shift to set the correct flag to enable interrupts. And in case you worry that using the bit-addressed registers would be slower: no. As long as you’re changing fewer than three bits on a register at a time, setting them with the bit-addressed variant is the same or faster. In the case of this demo, the original code uses two orl instructions, each taking 2 cycles, to set three bits in total; using the setb instruction, it only takes 3 cycles.
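
To make the cycle comparison concrete, here is a small host-side sketch of the arithmetic. It assumes the classic 8051 timings (2 machine cycles for orl direct,#immediate and 1 for setb bit). The function names are made up for this illustration, and the register names in the comments are only examples, not necessarily what the original demo touches.

```c
#include <assert.h>

/* Assumed classic 8051 timings, in machine cycles. Check your part's datasheet. */
enum {
	CYCLES_ORL_DIRECT_IMM = 2, /* e.g. IE |= 0x82; is typically one orl */
	CYCLES_SETB_BIT = 1        /* e.g. EA = 1;     is typically one setb */
};

/* Cost of setting flags with one orl per register touched. */
unsigned cost_with_orl(unsigned registers_touched)
{
	return registers_touched * CYCLES_ORL_DIRECT_IMM;
}

/* Cost of setting flags one bit at a time with setb. */
unsigned cost_with_setb(unsigned bits_set)
{
	return bits_set * CYCLES_SETB_BIT;
}

/* The case from the demo: three bits spread over two registers. */
void check_demo_case(void)
{
	assert(cost_with_orl(2) == 4);
	assert(cost_with_setb(3) == 3);
}
```

The break-even point is visible here too: two bits in the same register cost 2 cycles either way, and only from three bits in a single register onwards does the orl win.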

Once you use the correct header (either my contributed stc89c51rc.h, the at89x52.h, or even the very generic 8052.h), you have access to other older-than-thirty-years features that weren’t part of the original 8051, but were part of the subsequent 8052, from which both the STC89 and AT89 series derive. One of these features, as even Wikipedia knows, is a third 16-bit timer. This is important to the demo, since its whole stated purpose is “[set] up and using an accurate timer”.

Indeed, the code is fairly complicated, as it configures the timer both in main() and in the interrupt handler clockinc(). That is because Timer 0 is configured in “Mode 0”: the timer register is configured as 13-bit (with the word TH0, TL0), and its rollover causes an interrupt, but you need to reload the timer afterwards. The manual reload is needed because you need more than 8 bits to set the timer to fire at 1kHz (once every millisecond), and while Timer 0 supports “automatic reload”, it only does so with 8-bit reload values, since it uses TH0 for the reload value.

8052 derivatives support a third timer (Timer 2), which is 16-bit, rather than 8- or 13-bit. And it supports auto-reload at 16-bit through RCAP2H, RCAP2L. The only other complication is that, unlike Timer 0 and Timer 1, you need to manually “disarm” the interrupt flag (TF2), but that’s still a lot less code.
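
For reference, this is roughly how a 16-bit reload value for a 1kHz tick can be worked out. It is a host-side sketch, not part of the firmware: it assumes an 11.0592 MHz crystal and the classic 12 clocks per machine cycle, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SYSCLK_HZ 11059200UL           /* assumed crystal frequency */
#define CLOCKS_PER_MACHINE_CYCLE 12UL  /* assumed classic 12-clock core */

/* Value to load into a 16-bit up-counting timer so it overflows tick_hz times per second. */
uint16_t timer_reload(unsigned long tick_hz)
{
	unsigned long cycles_per_second = SYSCLK_HZ / CLOCKS_PER_MACHINE_CYCLE;
	/* Round to the nearest whole number of machine cycles per tick. */
	unsigned long counts = (cycles_per_second + tick_hz / 2) / tick_hz;

	assert(counts > 0 && counts <= 65536UL);
	return (uint16_t)(65536UL - counts);
}

/* High and low halves, as they would go into RCAP2H and RCAP2L. */
uint8_t reload_high(uint16_t reload) { return (uint8_t)(reload >> 8); }
uint8_t reload_low(uint16_t reload)  { return (uint8_t)(reload & 0xFF); }
```

With these assumptions there are 921600 machine cycles per second, so about 922 per millisecond. That is more than an 8-bit reload register can hold, which is exactly why the 8-bit auto-reload of Timer 0 is not enough here.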

I found the right way to solve this problem on Google Books, in a book that does not appear to have an ebook edition, and that does not seem to be in print at all. The end result is the following, modified demo.

// Source code under CC0 1.0
#include <stdbool.h>
#include <mcs51/8052.h>

volatile unsigned long int clocktime;
volatile bool clockupdate;

// Timer 2 overflow is interrupt 5 on the 8052.
void clockinc(void) __interrupt(5)
{
	TF2 = 0;  // disarm interrupt flag.
	clocktime++;
	clockupdate = true;
}

unsigned long int clock(void)
{
	unsigned long int ctmp;

	// The 32-bit read is not atomic on an 8-bit core:
	// retry if the interrupt fired while copying.
	do
	{
		clockupdate = false;
		ctmp = clocktime;
	} while (clockupdate);

	return(ctmp);
}

void main(void)
{
	// Configure timer for 11.0592 MHz default SYSCLK
	// 1000 ticks per second
	TH2 = (65536 - 922) >> 8;
	TL2 = (65536 - 922) & 0xFF;
	RCAP2H = (65536 - 922) >> 8;
	RCAP2L = (65536 - 922) & 0xFF;

	TF2 = 0;
	ET2 = 1;
	EA = 1;
	TR2 = 1; // Start timer

	for(;;)
		P3 = ~(clock() / 1000) & 0x03;
}

I can only expect that this demo was just written long enough ago that the author forgot to update it, because… the author is an SDCC developer, and refers to his own papers working on it at the bottom of the demo.

A Very Conservative Compiler

Speaking of the compiler itself, I had no idea of what a mess I would get myself into by using it. Turns out that despite the fact that this is de facto the only open-source embedded compiler people can use for the 8051, it is not a very good compiler.

I don’t say that to drag down the development team, who are probably trying to wrestle a very complex problem space (the 8051’s age makes its quirks understandable, but irritating, and the fact that there are more derivatives than there are people working on them is not making it any better), but rather because it is missing so much.

As Philipp describes it, SDCC “has a relative conservative architecture”. I would say that it’s a very conservative architecture, given that even some optimisations that, as far as I can tell, are completely safe are being skipped. For example, doing var % 2 (which I was using to alternate between two test patterns on my LEDs) was generating code calling into a function implementing integer modulo, despite being equivalent (for unsigned values) to var & 1, which is implemented in the basic instructions.

Similarly, the compiler does not optimise division by powers of two, which means that for anything that is not a build-time constant you’re better off using bitwise operations rather than divisions. It’s another thing that I addressed in the demo above, even though there it does not matter, as the value is constant at build time.
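
As a sketch of the manual rewrites this implies, the helpers below spell out the bitwise forms for unsigned values and check them against the plain operators on the host. The helper names are made up for this example; the equivalences only hold for unsigned operands.

```c
#include <assert.h>
#include <stdint.h>

/* Same result as var % 2 for an unsigned value, using only a bitwise AND. */
uint8_t mod2(uint16_t var)
{
	return (uint8_t)(var & 1u);
}

/* Same result as var / (1 << shift) for an unsigned value. */
uint16_t div_pow2(uint16_t var, uint8_t shift)
{
	assert(shift < 16);
	return (uint16_t)(var >> shift);
}

/* Same result as var % (1 << shift) for an unsigned value. */
uint16_t mod_pow2(uint16_t var, uint8_t shift)
{
	assert(shift < 16);
	return (uint16_t)(var & ((1u << shift) - 1u));
}

/* Compare the bitwise forms against the plain operators on a spread of inputs. */
void check_equivalences(void)
{
	uint32_t var;
	uint8_t shift;

	for (var = 0; var <= 0xFFFF; var += 257) {
		assert(mod2((uint16_t)var) == var % 2);
		for (shift = 0; shift < 16; shift++) {
			assert(div_pow2((uint16_t)var, shift) == var / (1u << shift));
			assert(mod_pow2((uint16_t)var, shift) == var % (1u << shift));
		}
	}
}
```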

Speaking of build-time constants — turns out that SDCC does not do constant propagation at all. Even when you define something static const, and never take its address, it’s emitted in the data section of the output program, rather than being replaced at build time where it’s used. Together with the lack of optimisation noted above, it meant I gave up on my idea of structuring the firmware in easily-swappable components — those would rely on the ability of the compiler to do optimisation passes such as constant propagation and inlining, but we’re talking about the lack of much lower level optimisation now.
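
One way around this, sketched below, is to fall back on things the C language itself defines as constant expressions: macros and enum constants. Those have to be resolved at build time by any conforming compiler, with no optimiser involved. The names are hypothetical, and this is a workaround sketch rather than what the firmware necessarily does.

```c
#include <assert.h>
#include <stdint.h>

/* A const object: the compiler is allowed to give it storage and read it at run time. */
static const uint8_t led_count_object = 8;

/* Integer constant expressions: substituted at build time by definition. */
#define LED_COUNT_MACRO 8
enum { LED_COUNT_ENUM = 8 };

/* A file-scope array size must be a constant expression, so this only builds
 * because the enum constant really is known at compile time. */
static uint8_t led_state[LED_COUNT_ENUM];

uint8_t led_state_size(void)
{
	return (uint8_t)sizeof led_state;
}

uint8_t last_led_index(void)
{
	return (uint8_t)(LED_COUNT_MACRO - 1);
}

uint8_t led_count_from_object(void)
{
	return led_count_object;
}
```

The price is losing the type that static const carries, which is part of why it feels like a step backwards.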

Originally, this blog post was also going to touch on the fact that the one library of 8051 interfaces I found hasn’t been touched in six years, still has a few leftover merge conflict markers, and does not even parse with modern SDCC. But then again, now that I know SDCC does not optimise even the most basic of operations, I don’t think using a library like that is a good idea: the IO module there is extremely complicated, considering that most ports’ I/O lines can be accessed with bit-addressed registers.

Now, as Andrea (Insomniac) pointed out, Philipp also has a document on using LLVM with SDCC, but the source code it references is more than five years old, and relies on the LLVM C backend, which means it’s generating C code for SDCC to continue compiling. I do wonder if it would make sense to have a proper LLVM target for 8051 code instead. It’s beyond the amount of work I want to put into this project, but last year they merged AVR support into LLVM, which already makes it possible to use (or at least try) Rust on 8-bit controllers. It would be interesting to see if 8051 cores could be used with something other than C (or manually written assembly).

You could wonder why I care this much about an MCU for a side project, one that is quite a bit older than me. The thing is, I don’t, really. I just seem to keep bumping into the 8051/2 in various places. I nearly wrote a disassembler for it to hack at my laptop’s keyboard layout a few years ago, and I still feel bad I didn’t complete that project.

The 8051 is still an extremely common micro in low-power applications, and the STC89 in particular is possibly the cheapest micro you can set up prototypes with at home: you can get 20 of them for less than 60p each from AliExpress, if you have the time to wait. I know, because I just ordered a lot, just to have them around if I decide to do more with them now that I sort-of understand them. The manufacturer appears to still make many variants of them, and I would be extremely surprised if you didn’t have a bunch of these throughout your home, in computers, dishwashers, washing machines, mice, and other devices that just need some cheap and cheerful logic controller without breaking the bank. Heck, I expect them to be used in glucometers, too!

With all these devices tied to closed-source, proprietary compilers, I would feel more comfortable if there were some active work on supporting a modern compiler platform in the open source world as well. From my point of view, it looks like the needs of the industrial users and those of the hobbyist community have diverged very much on this topic.

Sum It All Up

So for my art project I decided that even SDCC is good enough, but I wanted to make sure I would not end up with broken code (which appears to happen fairly often), so I ended up reading the generated assembly code to make sure it made sense. Despite not being particularly familiar with the 8051 ISA, thanks to the Wikipedia article and the detailed datasheet from STC, it wasn’t too hard to read through it.

While I was going through it, I also figured out how to rewrite parts of the C code to force SDCC to emit some decent code. For instance, instead of a branch that adds either 1 or 32 to a counter, I was better off having a temporary variable hold 1, changing it to 32 when needed, and then adding that variable. The fact that SDCC couldn’t optimise that made me sad, but again it’s understandable given the priorities.
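
In C terms, the rewrite looks roughly like the sketch below. The function and parameter names are invented for the example rather than taken from the firmware; the point is only that both shapes compute the same thing, while the second has a single addition after the conditional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original shape: two separate additions, one per branch. */
uint8_t advance_with_branch(uint8_t counter, bool big_step)
{
	if (big_step)
		counter += 32;
	else
		counter += 1;
	return counter;
}

/* Rewritten shape: pick the step first, then add once. */
uint8_t advance_with_step(uint8_t counter, bool big_step)
{
	uint8_t step = 1;

	if (big_step)
		step = 32;
	counter += step;
	return counter;
}

/* Host-side check that both shapes agree for every 8-bit counter value. */
void check_same_result(void)
{
	unsigned c;

	for (c = 0; c < 256; c++) {
		assert(advance_with_branch((uint8_t)c, false) == advance_with_step((uint8_t)c, false));
		assert(advance_with_branch((uint8_t)c, true) == advance_with_step((uint8_t)c, true));
	}
}
```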

Hopefully I have kept the source code still fairly readable. You can check the history to see all the various things I kept changing to make it more readable in assembly as well. Part of the changes meant changing some of my plans. In my first notes I wanted to run through 20 “hours” configurations in 60 minutes, but to optimise the code I decided that it’ll run 16 “hours” in just over 68 minutes. That way I could use a lot of powers of two and do away with annoying calculations.
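
Purely as an illustration of why powers of two help here: the constants below are assumptions picked because they land on the same total (16 “hours” of 256 one-second ticks each is 4096 seconds, or 68 minutes and 16 seconds), not necessarily the values the firmware uses. With numbers like these, finding the current “hour” is a shift and a mask, with no division at all.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 16 "hours" of 256 one-second ticks each. */
enum {
	HOURS_PER_RUN = 16,
	TICKS_PER_HOUR = 256
};

/* Total length of one run, in ticks (seconds under the assumption above). */
uint16_t ticks_per_run(void)
{
	return (uint16_t)(HOURS_PER_RUN * TICKS_PER_HOUR);
}

/* Which "hour" a given tick falls in: shift by 8 (divide by 256), wrap at 16. */
uint8_t hour_of_tick(uint16_t tick)
{
	return (uint8_t)((tick >> 8) & (HOURS_PER_RUN - 1));
}
```

With 20 “hours” in 3600 seconds, the same lookup would need a division by 180 instead.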

Birch Books MCU Selection

A couple of people have asked me why I started the art project down the path of using an 8051 MCU, which is a fairly old microcontroller (heck, I found out I looked at those chips back in 2006!), rather than using one of the more modern hacker/maker solutions such as Arduino. The answer I already gave in that post: I had it already here.

I bought a devkit for it hoping to be able to hack on the LED heart I bought as a surprise for my wife on Valentine’s day, which was centered around the same micro. Now, with hindsight, that was silly: the board was explicitly marked with an AT89S52 name, which is a much more common chip, and probably one for which I could have found a devkit/programmer in much shorter time, but it turned out to be a nice exercise nonetheless.

Indeed, I ended up having to learn a lot more about this chip and its programming, and to refresh my (terrible) electronics understanding. And while this has been breaking my brain at times, it also stretched it to learn something new. I guess I now know how my wife is feeling while learning Python coming from a humanities background.

I had another micro at home. Some time ago I wanted to figure out how to send a certain sequence of infrared commands to my TV via Google Assistant (it’s a long story, sometimes my TV doesn’t initialize the audio return channel correctly), and I ended up buying (but never using) an Adafruit Feather M4 and an AirLift FeatherWing. I soldered the terminals and made sure they worked, but only played with it briefly.

The Feather comes with CircuitPython, a firmware derived from MicroPython, which is actually fairly nice for writing simple logic for the microcontroller, and is very easy to deploy: you just need to copy the Python files onto the virtual USB flash drive that appears when you connect the board to the computer. It also includes a very nice interactive Python shell you can use to experiment without needing to commit to code (yet). And with the AirLift you also get support for controlling it remotely via WiFi, and setting up all kinds of request handling.

On the other hand, the 8051 is a fairly complicated tool. The ISA has not had any refresh since 1980 as far as I can tell, and that’s on purpose: binary and pin compatibility appears to be the main advantage of using 8051 derivative chips (or cores on FPGA). You’d think that with that having stayed the same we would have very advanced toolchains for it, but you’d be wrong. As far as I can tell, the only maintained open-source compiler for this is SDCC, and even that only just. You might have seen my rants about this on Twitter, and if not, fear not: I’ll write a post about it next week.

So why did I go for the 8051, which is significantly older, harder to write code for, harder to program (you either need a devboard, or make sure you provide the right ISP headers on the board), and with quite a few question marks on its availability?

Well, the Feather only has a few really general purpose I/O lines. While both the M4 and the ESP32 supposedly should have enough GPIO lines, the Feather is a specific configuration that commits a lot of lines to specific usage, such as an I²C/SPI bus to communicate with different Feathers. The usual answer to this is to include something like the MCP23017, which is an I/O expander that you drive via the I²C bus. But as it turns out, not only do I not have one of those at home, but even Adafruit appears to only sell it on an Expander Bonnet for the Raspberry Pi. I’m not sure why there’s no FeatherWing with it, despite the fact that they document how to use one with CircuitPython, and while I’m sure I could design one, or look for an unofficial one, it’s something I don’t want to get to right now.

On the other hand, 8051 and its clones come with a lot more GPIO lines, and most of those are uncommitted if you start from nothing. The DIP-40 packages have 32 lines, and if you don’t need to use external memory, you have at the very least 16 uncommitted lines. Of the other 16 lines, some are shared with other functions, including external hardware interrupts, serial port, and most in-system programming interfaces.

Now, theoretically it seems like the ESP32 chips also have quite a few GPIO lines, although I only counted 14 uncommitted lines on their QFN packages. I guess you can scavenge a few more lines by not using some of the features, but that might end up conflicting with the MicroPython interfaces anyway.

So yeah, I will probably eventually move to a different design that includes the MCP23017. Maybe I’ll end up designing a Feather Base (if not a proper FeatherWing) for it after all, to prototype with the already designed (and sent to fab) actuator board. But that’s a story for another time.