I have previously discussed my choice of splitting the actuator board, pointing out I’ll probably try designing an alternative controller board using something like the Adafruit Feather M4 and writing the firmware with CircuitPython. Part of the reason for that is that it’s just easier, but part of it is that the 8051 is an annoying platform to work with.
There are a few different compilers for this platform, but as far as I know, the only open-source and maintained one is SDCC, the Small Device C Compiler. I hadn’t used it in forever, but I was very happy to see a new release this year, including work-in-progress C2X support and (mostly complete) C11 support, so I was in high spirits when I started working on this.
A Worrying Demonstration
I started from a demo that was supposed to be written explicitly for the STC89. The first thing I noted was that the code does not actually match the documentation on the same page: it references a _sdcc_external_startup() function that is not actually defined. On the other hand, it does not seem to be required. There are other issues with the code, and for something that is designed to work with the STC89, it seems to be overly complicated. Let me try to dissect the problems.
First of all, the source code manually declares the “Special Function Registers” (SFRs) for the device. I don’t really understand the point here, since all of the declared registers are part of the base 8051 architecture, and would already be declared by any of the model-specific header files that SDCC provides. While the STC89 does have a number of registers that are not found elsewhere, none of those are used here. In my code I ended up including at89x52.h, which is meant for the Atmel (now Microchip) AT89 series and is the closest header I found for the STC89. I have since filed a patch with a header written based on other headers and the datasheet.
Side note: the datasheet is impressive in its level of detail. It includes everything you may want to know, including the full ISA description, and a number of example cases.
Once you have the proper header definitions, you can also avoid a lot of binary flag logic: the most important registers on the 8051 chips are bit-addressable, so you don’t need to remember how many bits you need to shift around to set the correct flag to enable interrupts. And while you may worry that using the bit-addressed register would be slower: no, as long as you’re changing fewer than three bits on a register at a time, setting them with the bit-addressed variant is the same or faster. In the case of this demo, the original code uses two orl instructions, each taking 2 cycles, to set three bits in total; using the setb instruction instead, it only takes 3 cycles.
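As a rough check of that cycle arithmetic, here is a small host-side sketch. The function names are mine, and the per-instruction costs (2 machine cycles for orl direct,#imm and 1 for setb bit) are the classic 8051 timings, which I am assuming apply to the STC89 as well.

```c
/* Sketch only: classic 8051 timings, assumed to hold on the STC89 too. */
enum {
    ORL_DIRECT_IMM_CYCLES = 2, /* orl direct,#imm */
    SETB_BIT_CYCLES = 1        /* setb bit */
};

/* Cost of OR-ing a mask into each of `registers` different registers. */
unsigned int orl_cycles(unsigned int registers)
{
    return registers * ORL_DIRECT_IMM_CYCLES;
}

/* Cost of setting `bits` individual flags through bit addressing. */
unsigned int setb_cycles(unsigned int bits)
{
    return bits * SETB_BIT_CYCLES;
}
```

For the demo, orl_cycles(2) is 4 while setb_cycles(3) is 3. On a single register the two approaches tie at two bits, which is where the “fewer than three” rule of thumb comes from.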
Once you use the correct header (either my contributed stc89c51rc.h, the at89x52.h, or even the very generic 8052.h), you have access to other older-than-thirty-years features that weren’t part of the original 8051, but were part of the subsequent 8052, from which both the STC89 and AT89 series derive. One of these features, as even Wikipedia knows, is a third 16-bit timer. This is important to the demo, since it’s effectively just an example of “[set] up and using an accurate timer”.
Indeed, the code is fairly complicated, as it configures the timer both in main() and in the interrupt handler clockinc(). That is because Timer 0 is configured in “Mode 0”: the timer register is 13 bits wide (the word formed by TH0 and TL0), its rollover causes an interrupt, but you need to reload the timer afterwards. The demo has to do this because it takes more than 8 bits to set the timer to fire at 1 kHz (once every millisecond), and while Timer 0 supports “automatic reload”, it only supports 8-bit reload values, since it uses TH0 for the reload value.
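To make the 8-bit limitation concrete, here is a host-side sketch of the arithmetic. It assumes the 11.0592 MHz clock used by the demo and the classic 12 clocks per machine cycle; the helper names are made up for illustration.

```c
#include <stdbool.h>

/* Assumptions: 11.0592 MHz crystal, 12 clocks per machine cycle. */
#define SYSCLK_HZ 11059200UL
#define CLOCKS_PER_MACHINE_CYCLE 12UL

/* Machine cycles between two interrupts at tick_hz, rounded to nearest. */
unsigned long cycles_per_tick(unsigned long tick_hz)
{
    unsigned long timer_hz = SYSCLK_HZ / CLOCKS_PER_MACHINE_CYCLE;
    return (timer_hz + tick_hz / 2) / tick_hz;
}

/* True if an up-counter that is `bits` wide can count that many cycles. */
bool fits_in_bits(unsigned long cycles, unsigned int bits)
{
    return cycles <= (1UL << bits);
}
```

cycles_per_tick(1000) comes out at 922, which does not fit the 8-bit auto-reload mode but does fit the 13-bit Mode 0, hence the manual reload in the interrupt handler.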
8052 derivatives support a third timer (Timer 2), which is 16-bit and supports auto-reload of all 16 bits through RCAP2H and RCAP2L, where Timer 0 and Timer 1 can only auto-reload 8 bits. The only other complication is that, unlike with Timer 0 and Timer 1, you need to manually “disarm” the interrupt flag (TF2) in the handler, but that’s still a lot less code.
I found the right way to solve this problem on Google Books, in a book that does not appear to have an ebook edition and does not seem to be in print at all. The end result is the following modified demo.
// Source code under CC0 1.0
#include <stdbool.h>
#include <mcs51/8052.h>

// Tick counter, incremented only by the Timer 2 interrupt.
volatile unsigned long int clocktime;
// Set by the interrupt so clock() can detect an interrupted read.
volatile bool clockupdate;

void clockinc(void) __interrupt(5)
{
    TF2 = 0; // disarm interrupt flag.
    clocktime++;
    clockupdate = true;
}

unsigned long int clock(void)
{
    unsigned long int ctmp;

    // Retry if the interrupt fired while the multi-byte counter was copied.
    do
    {
        clockupdate = false;
        ctmp = clocktime;
    } while (clockupdate);

    return(ctmp);
}

void main(void)
{
    // Configure timer for 11.0592 MHz default SYSCLK
    // 1000 ticks per second
    TH2 = (65536 - 922) >> 8;
    TL2 = (65536 - 922) & 0xFF;
    RCAP2H = (65536 - 922) >> 8;
    RCAP2L = (65536 - 922) & 0xFF;

    TF2 = 0;
    ET2 = 1;
    EA = 1;
    TR2 = 1; // Start timer

    for(;;)
        P3 = ~(clock() / 1000) & 0x03;
}
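One thing worth keeping in mind about the 922 in the demo: the exact figure would be 921.6 machine cycles per millisecond (assuming the classic 12 clocks per machine cycle), so each tick runs slightly long. Here is a rough host-side estimate of the resulting drift, with names of my own choosing:

```c
/* Assumptions: 11.0592 MHz crystal, 12 clocks per machine cycle. */
#define TIMER_HZ (11059200UL / 12UL) /* 921600 machine cycles per second */
#define RELOAD_CYCLES 922UL          /* cycles per tick, as loaded into Timer 2 */
#define TICK_HZ 1000UL

/* How much longer than nominal each tick is, in parts per million. */
unsigned long tick_error_ppm(void)
{
    unsigned long excess = RELOAD_CYCLES * TICK_HZ - TIMER_HZ; /* 400 */
    return excess * 1000000UL / TIMER_HZ;
}
```

That works out to a few hundred parts per million, or a bit over half a minute per day: irrelevant for blinking LEDs, but something to account for in a wall clock.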
I can only expect that this demo was just written long enough ago that the author forgot to update it, because… the author is an SDCC developer, and refers to his own papers working on it at the bottom of the demo.
A Very Conservative Compiler
Speaking of the compiler itself, I had no idea what a mess I would get myself into by using it. It turns out that despite being the de facto only open-source embedded compiler people can use for the 8051, it is not a very good compiler.
I don't say that to drag down the development team, who are probably trying to wrestle a very complex problem space (the 8051's age makes its quirks understandable, but irritating, and the fact that there are more derivatives than there are people working on them is not making it any better), but rather because it is missing so much.
As Philipp describes it, SDCC "has a relative conservative architecture". I would say that it's a very conservative architecture, given that even some optimisations that, as far as I can tell, are completely safe are being skipped. For example, doing var % 2 (which I was using to alternate between two test patterns on my LEDs) was generating code calling into a function implementing integer modulo, despite being equivalent (for unsigned values) to var & 1, which is implemented in the basic instructions.
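For reference, this is the equivalence in question as a host-side sketch (function names are mine). It only holds as written for unsigned values; with signed operands, a negative var gives a different result for % than for &.

```c
#include <stdbool.h>

/* The modulo form, which SDCC compiled into a library call. */
bool is_odd_modulo(unsigned int var)
{
    return (var % 2) != 0;
}

/* Same result for unsigned values, using only a bitwise AND. */
bool is_odd_mask(unsigned int var)
{
    return (var & 1u) != 0;
}
```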
Similarly, the compiler does not optimise division by powers of two, which means that for anything that is not a build-time constant you're better off using bitwise operations rather than divisions. It's another thing that I addressed in the demo above, even though there it does not matter, as the value is constant at build time.
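The same trick applies to division and remainder by any power of two, again for unsigned values only. A minimal sketch with hypothetical helper names:

```c
/* Divide by 2^shift with a right shift instead of a division routine.
 * shift must be smaller than the width of unsigned int. */
unsigned int div_pow2(unsigned int value, unsigned int shift)
{
    return value >> shift;
}

/* Remainder modulo 2^shift by masking off the low bits. */
unsigned int mod_pow2(unsigned int value, unsigned int shift)
{
    return value & ((1u << shift) - 1u);
}
```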
Speaking of build-time constants: it turns out that SDCC does not do constant propagation at all. Even when you define something static const, and never take its address, it's emitted in the data section of the output program, rather than being replaced at build time where it's used. Together with the lack of optimisation noted above, it meant I gave up on my idea of structuring the firmware in easily-swappable components. Those would rely on the ability of the compiler to do optimisation passes such as constant propagation and inlining, but we're talking about the lack of much lower-level optimisation now.
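A common workaround in C when a compiler will not fold static const values is to use an enum (or a #define), since enumeration constants are integer constant expressions by definition. This is a general C idiom rather than something verified against SDCC's output, and the names below are invented for illustration.

```c
/* A static const object: a compiler without constant propagation
 * may keep it in memory and load it at every use. */
static const unsigned char pattern_count_object = 16;

/* An enumeration constant is an integer constant expression, so it
 * can be encoded directly as an immediate operand. */
enum { PATTERN_COUNT = 16 };

/* Next pattern index, wrapping at 16, using the const object. */
unsigned char next_pattern_object(unsigned char current)
{
    return (unsigned char)((current + 1) & (pattern_count_object - 1));
}

/* Same computation using the enumeration constant. */
unsigned char next_pattern(unsigned char current)
{
    return (unsigned char)((current + 1) & (PATTERN_COUNT - 1));
}
```

Both functions compute the same thing; the difference is only in what the compiler has to work out by itself.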
Originally, this blog post was also going to touch on the fact that the one library of 8051 interfaces I found hasn't been touched in six years, still has a few unresolved merge markers, and does not even parse with modern SDCC. But then again, now that I know SDCC does not optimise even the most basic of operations, I don't think using a library like that is a good idea: the IO module there is extremely complicated, considering that most ports' I/O lines can be accessed with bit-addressed registers.
Now, as Andrea (Insomniac) pointed out, Philipp also has a document on using LLVM with SDCC, but the source code it references is more than five years old, and relies on the LLVM C backend, which means it's generating C code for SDCC to continue compiling. I do wonder if it would make sense to have a proper LLVM target for 8051 code instead. It's beyond the amount of work I want to put into this project, but last year they merged AVR support into LLVM, which already allows using (or at least trying) Rust on 8-bit controllers. It would be interesting to see if 8051 cores could be used with something other than C (or manually written assembly).
You could wonder why I care this much about a side-project MCU that is quite a bit older than me. The thing is I don't, really. I just seem to keep bumping into the 8051/2 in various places. I nearly wrote a disassembler for it to hack at my laptop's keyboard layout a few years ago, and I still feel bad I didn't complete that project.
The 8051 is still an extremely common micro in low-power applications, and the STC89 in particular is possibly the cheapest micro you can set up prototypes with at home: you can get 20 of them for less than 60p each from AliExpress, if you have the time to wait. I know, because I just ordered a lot, just to have them around if I decide to do more with them now that I sort-of understand them. The manufacturer appears to still make many variants of them, and I would be extremely surprised if you didn't have a bunch of these throughout your home, in computers, dishwashers, washing machines, mice, and other devices that just need some cheap and cheerful logic controller without breaking the bank. Heck, I expect them to be used in glucometers, too!
With all these devices tied to closed-source, proprietary compilers, I would feel more comfortable if there was some active work on supporting a modern compiler platform in the open source world as well. From my point of view, it looks like the needs of industrial users and those of the hobbyist community have diverged considerably on this topic.
Sum It All Up
So for my art project I decided that even SDCC is good enough, but I wanted to make sure I would not end up with broken code (which appears to happen fairly often), so I ended up reading the generated assembly code to make sure it made sense. Despite not being particularly familiar with the 8051 ISA, thanks to the Wikipedia article and the detailed datasheet from STC, it wasn't too hard to read through it.
While I was going through it, I also figured out how to rewrite parts of the C code to force SDCC to emit some decent code. For instance, instead of a branch that either adds 1 or 32 to a counter, I was better off making a temporary variable hold 1, changing it to 32 when needed, and then adding that variable. The fact that SDCC couldn't optimise that made me sad, but again it's understandable given the priorities.
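As an illustration of that rewrite, here are the two shapes side by side. This is a simplified sketch rather than the actual firmware code; the function and parameter names are hypothetical.

```c
#include <stdbool.h>

/* Original shape: the addition is duplicated in both branches. */
unsigned char advance_branching(unsigned char counter, bool fast)
{
    if (fast)
        counter += 32;
    else
        counter += 1;
    return counter;
}

/* Rewritten shape: pick the step first, then add it exactly once. */
unsigned char advance_single_add(unsigned char counter, bool fast)
{
    unsigned char step = 1;

    if (fast)
        step = 32;
    counter += step;
    return counter;
}
```

Both return the same value for every input, including when the 8-bit counter wraps around; only the generated code differs.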
Hopefully I have kept the source code still fairly readable. You can check the history to see all the various things I kept changing to make it more readable in assembly as well. Part of the changes meant changing some of my plans. In my first notes I wanted to run through 20 "hours" configurations in 60 minutes, but to optimise the code I decided that it'll run 16 "hours" in just over 68 minutes. That way I could use a lot of powers of two and do away with annoying calculations.
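One split consistent with those numbers (an assumption for illustration, not necessarily what the firmware does) is 16 "hours" of 256 seconds each: 4096 seconds, or 68 minutes and 16 seconds. With that layout, picking the current "hour" needs only a shift and a mask. The names below are hypothetical.

```c
/* Assumed layout: 16 "hours" of 256 seconds, 4096 seconds per full run. */
enum {
    SECONDS_PER_HOUR_SHIFT = 8, /* 2^8 = 256 seconds per "hour" */
    HOUR_MASK = 0x0F            /* 16 "hours", wrapping around */
};

/* Map elapsed seconds to the current "hour" index, 0 to 15. */
unsigned char hour_index(unsigned long elapsed_seconds)
{
    return (unsigned char)((elapsed_seconds >> SECONDS_PER_HOUR_SHIFT) & HOUR_MASK);
}
```

With 20 "hours" in 3600 seconds, the same lookup would need a division by 180 instead, which is exactly the kind of calculation worth avoiding here.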