I have previously discussed my choice of splitting the actuator board, pointing out I’ll probably try designing an alternative controller board using something like the Adafruit Feather M4 and writing the firmware with CircuitPython. Part of the reason for that is that it’s just easier, but part of it is that the 8051 is an annoying platform to work with.
There are a few different compilers for this platform, but as far as I know, the only open-source and maintained one is SDCC, the Small Device C Compiler. I hadn’t used it in forever, but I was very happy to see a new release this year, including work-in-progress C2X support and (mostly complete) C11 support, so I was in high spirits when I started working on this.
A Worrying Demonstration
I started from a demo that was supposed to be written explicitly for the STC89. The first thing I noted was that the code does not actually match the documentation on the same page: the page references a _sdcc_external_startup() function that is not actually defined. On the other hand, it does not seem to be required. There are other issues with the code, and for something designed to work with the STC89, it seems overly complicated. Let me try to dissect the problems.
First of all, the source code manually declares the “Special Function Registers” (SFRs) for the device. I don’t really understand the point of that here, since all of the declared registers are part of the base 8051 architecture, and would already be declared by any of the model-specific header files that SDCC provides. While the STC89 does have a number of registers that are not found elsewhere, none of those are used here. In my code I ended up including at89x52.h, which is meant for the Atmel (now Microchip) AT89 series and was the closest header I found for the STC89. I have since filed a patch with a header written based on other headers and the datasheet.
Side note: the datasheet is impressive in its level of detail. It includes everything you may want to know, including the full ISA description and a number of example cases.
Once you have the proper header definitions, you can also avoid a lot of binary flag logic: the most important registers on the 8051 chips are bit-addressable, so you don’t need to remember how many bits to shift to set the correct flag to enable interrupts. And if you worry that using the bit-addressed registers would be slower: no. As long as you’re changing fewer than three bits of a register at a time, setting them with the bit-addressed variant is the same speed or faster. In the case of this demo, the original code uses two orl instructions, each taking 2 cycles (4 in total), to set three bits; using the setb instruction, it only takes 3 cycles.
Once you use the correct header (either my contributed stc89c51rc.h, the at89x52.h, or even the very generic 8052.h), you have access to other older-than-thirty-years features that weren’t part of the original 8051 but were part of the subsequent 8052, from which both the STC89 and AT89 series derive. One of these features, as even Wikipedia knows, is a third 16-bit timer. This is important to the demo, since it’s effectively just an example of “[setting] up and using an accurate timer”.
Indeed, the code is fairly complicated, as it configures the timer both in main() and in the interrupt handler clockinc(). The reason is that Timer 0 is configured in “Mode 0”: the timer register is 13 bits wide (formed by TH0 and TL0) and its rollover raises an interrupt, but you need to reload the timer manually afterwards. You need more than 8 bits to make the timer fire at 1 kHz (once every millisecond), and while Timer 0 supports “automatic reload”, it only supports 8-bit reload values, since it uses TH0 to hold the reload value.
8052 derivatives support a third timer (Timer 2), which is 16-bit rather than 8- or 13-bit, and which supports 16-bit auto-reload through RCAP2H and RCAP2L. The only other complication is that, unlike with Timer 0 and Timer 1, you need to manually “disarm” the interrupt flag (TF2), but that’s still a lot less code.
I found the right way to solve this problem on Google Books, in a book that does not appear to have an ebook edition and does not seem to be in print at all. The end result is the following modified demo.
// Source code under CC0 1.0
#include <stdbool.h>
#include <mcs51/8052.h>

volatile unsigned long int clocktime;
volatile bool clockupdate;

// Timer 2 interrupt handler (interrupt number 5 on the 8052).
void clockinc(void) __interrupt(5)
{
    TF2 = 0; // disarm interrupt flag.
    clocktime++;
    clockupdate = true;
}

// Read the multi-byte counter, retrying if the interrupt fired mid-read.
unsigned long int clock(void)
{
    unsigned long int ctmp;

    do
    {
        clockupdate = false;
        ctmp = clocktime;
    } while (clockupdate);

    return ctmp;
}

void main(void)
{
    // Configure timer for 11.0592 MHz default SYSCLK
    // 1000 ticks per second
    TH2 = (65536 - 922) >> 8;
    TL2 = (65536 - 922) & 0xFF;
    RCAP2H = (65536 - 922) >> 8;
    RCAP2L = (65536 - 922) & 0xFF;

    TF2 = 0;
    ET2 = 1; // Enable the Timer 2 interrupt
    EA = 1;  // Enable interrupts globally
    TR2 = 1; // Start timer

    for (;;)
        P3 = ~(clock() / 1000) & 0x03;
}
I can only expect that this demo was just written long enough ago that the author forgot to update it, because… the author is an SDCC developer, and refers to his own papers working on it at the bottom of the demo.
A Very Conservative Compiler
Speaking of the compiler itself, I had no idea what a mess I would get myself into by using it. It turns out that, despite being the de facto only open-source embedded compiler people can use for the 8051, it is not a very good compiler.
I don't say that to drag down the development team, who are probably trying to wrestle with a very complex problem space (the 8051's age makes its quirks understandable, but irritating, and the fact that there are more derivatives than there are people working on them is not making it any better), but rather because it is missing so much.
As Philipp describes it, SDCC "has a relative conservative architecture". I would say that it's a very conservative architecture, given that it skips even some optimisations that, as far as I can tell, are completely safe. For example, doing var % 2 (which I was using to alternate between two test patterns on my LEDs) generated code calling into a function implementing integer modulo, despite being equivalent (for unsigned values) to var & 1, which is implemented with basic instructions.
Similarly, the compiler does not optimise division by powers of two, which means that for anything that is not a build-time constant you're better off using bitwise operations rather than divisions. It's another thing that I addressed in the demo above, even though it does not matter there, as the value is constant at build time.
Speaking of build-time constants: it turns out that SDCC does not do constant propagation at all. Even when you define something as static const and never take its address, it's emitted in the data section of the output program, rather than being replaced at build time where it's used. Together with the lack of optimisation noted above, this meant I gave up on my idea of structuring the firmware in easily swappable components. Those would rely on the ability of the compiler to do optimisation passes such as constant propagation and inlining, but we're talking about the lack of much lower-level optimisations here.
Originally, this blog post was also meant to touch on the fact that the one library of 8051 interfaces I found hasn't been touched in six years, still has a few unresolved merge markers, and does not even parse with modern SDCC. But then again, now that I know SDCC does not optimise even the most basic of operations, I don't think using a library like that is a good idea: the IO module there is extremely complicated, considering that most ports' I/O lines can be accessed with bit-addressed registers.
Now, as Andrea (Insomniac) pointed out, Philipp also has a document on using LLVM with SDCC, but the source code it references is more than five years old and relies on the LLVM C backend, which means it generates C code for SDCC to continue compiling. I do wonder if it would make sense to have a proper LLVM target for 8051 code instead. It's beyond the amount of work I want to put into this project, but last year AVR support was merged into LLVM, which already allows using (or at least trying) Rust on 8-bit controllers. It would be interesting to see if 8051 cores could be used with something other than C (or manually written assembly).
You may wonder why I care this much about a side-project MCU that is quite a bit older than me. The thing is, I don't, really. I just seem to keep bumping into the 8051/2 in various places. I nearly wrote a disassembler for it to hack at my laptop's keyboard layout a few years ago. I still feel bad I didn't complete that project.
The 8051 is still an extremely common micro in low-power applications, and the STC89 in particular is possibly the cheapest micro you can prototype with at home: you can get 20 of them for less than 60p each from AliExpress, if you have the time to wait. I know, because I just ordered a lot, just to have them around if I decide to do more with them now that I sort-of understand them. The manufacturer appears to still make many variants of them, and I would be extremely surprised if you didn't have a bunch of these throughout your home, in computers, dishwashers, washing machines, mice, and other devices that just need some cheap and cheerful logic controller without breaking the bank. Heck, I expect them to be used in glucometers, too!
With all these devices tied to closed-source, proprietary compilers, I would feel more comfortable if there was some active work on supporting a modern compiler platform in the open source world as well. From my point of view, this sounds like the needs of the industrial users, and those of the hobbyist community, diverged very much on this topic.
Sum It All Up
So for my art project I decided that even SDCC is good enough, but I wanted to make sure I would not end up with broken code (which appears to happen fairly often), so I ended up reading the generated assembly code to make sure it made sense. Although I'm not particularly familiar with the 8051 ISA, the Wikipedia article and the detailed datasheet from STC meant it wasn't too hard to read through.
While I was going through it, I also figured out how to rewrite parts of the C code to force SDCC to emit some decent code. For instance, instead of a branch that adds either 1 or 32 to a counter, I was better off having a temporary variable hold 1, changing it to 32 when needed, and then adding that variable. The fact that SDCC couldn't optimise that made me sad, but again it's understandable given the priorities.
Hopefully I have kept the source code fairly readable. You can check the history to see all the various things I kept changing to make it more readable in assembly as well. Some of the changes meant changing my plans. In my first notes I wanted to run through 20 "hours" configurations in 60 minutes, but to optimise the code I decided that it'll run 16 "hours" in just over 68 minutes. That way I could use a lot of powers of two and do away with annoying calculations.
Thanks for your comments on SDCC. I’ve already fixed the reference to __sdcc_external_startup (an artifact from the SiLabs C8051) in the tutorial.
I am not an MCS-51 expert, and mostly work on other parts of SDCC. The mcs51 backend is the oldest backend in SDCC. While it still tends to work well (in terms of correctness), and benefits from improvements in the front-ends (such as improvements in standard-compliance) it clearly has fallen behind compared to the newer backends (compare e.g. the stm8 backend, where SDCC is quite competitive: http://www.colecovision.eu/stm8/compilers.shtml). Changes in the front-end often have not been met with corresponding changes in the mcs51 backend, resulting in some inefficiencies, and code size regressions (see e.g. https://sourceforge.net/p/sdcc/code/HEAD/tree/trunk/sdcc-extra/historygraphs/dhrystone-mcs51-size.svg and compare to https://sourceforge.net/p/sdcc/code/HEAD/tree/trunk/sdcc-extra/historygraphs/dhrystone-stm8-size.svg). Also, there is no ELF/DWARF support for mcs51 yet.
IMO the main problem is a decline in the number of users of the mcs51 backend: In the past, many vendors of 8051-compatible hardware recommended SDCC or supported it in their own IDEs. The alternative was mostly Keil, which was expensive. Many users meant a large pool of potential SDCC developers.
This has changed, as Keil made deals with hardware vendors: Now vendors pay Keil to be able to give Keil licenses to users of their hardware. This has greatly reduced the visibility of SDCC, and thus the user base. The same problem also affected other free tools for MCS-51: For the older SiLabs C8051 series there are the ec2 tools to write programs onto Flash, that still work somewhat despite not having received much maintenance, but no such thing exists for the newer SiLabs EFM8 Bee series.
The few SDCC developers that are there work on it in their free time (as an exception I sometimes was able to use time from my university jobs, when working on some of the fancier stuff in SDCC, such as the new register allocator, that came with potential for publications – the mcs51 backend still uses the old register allocator though).
The MCS-51 is still a widespread architecture, and IMO free tools supporting it are an important part of the free software ecosystem. And Keil is quite a bit ahead in terms of the quality of the generated code (though in terms of standard compliance, they don’t even claim to support anything newer than the 1989 C standard, support for which is far more incomplete in Keil than in SDCC).
Philipp (an SDCC developer)
P.S.: There are cheaper µC than the MCS-51 out there. E.g. the Padauk PMS15A at about 1 cent each.
P.P.S.: %2 is not the same as &1, /2 is not the same as >>1. They are only the same for unsigned variables (and for those SDCC does the optimization).
P.P.P.S.: I merged your patch with the STC89 header today. Thanks. It is rare to see an mcs51 contribution to SDCC these days; most are z80 or stm8 now.
Thank you for the additional context! I was unaware of Keil’s practices and I have to say I’m not impressed by that 😣
As for %2, I’m fairly sure I saw it not being optimised on an unsigned variable. I don’t have cycles to revisit this project right now, but once the boards arrive and I assemble them I’ll try to go back and see if I can reproduce it at head, and if so send you a test case!