Flashing ESPHome on White Label Smart Plugs

You may remember that I bought a smart plug a few years ago, for our Christmas tree, and had been wondering what the heck you use a smart plug for. Well, a few months ago I found another use case for it, one I had not thought of until that point: the subwoofer.

When I moved to Dublin, nearly eight years ago now, I bought a new TV, an AV receiver, and a set of speakers — and with this last one came a subwoofer. I don’t actually like subwoofers that much, or at least I didn’t, because the low-frequency sounds give (or gave) me headaches, and so while it does sound majestic to watch a good movie with the subwoofer on, I rarely do that.

On the other hand, it was still preferable to certain noises coming from the neighbours, particularly if we ended up not being in the same room as the device itself. Which meant we started using the subwoofer more — but unlike the AV receiver, which can be controlled through a network endpoint, the subwoofer has a physical switch to turn it on and off… a perfect use case for a smart plug!

Since I didn’t want to reuse the original plug for the Christmas tree (among other things, it’s stashed away with the Christmas tree), I ended up going online and ordering a new one — and I selected a TP-Link, which I thought would work just the same as my previous one… except it didn’t. Turns out TP-Link has two separate Smart Home brands, and while they share the same account management, they use different apps and have different integration support. The Christmas tree uses a Kasa-branded plug, while the one I ordered was a Tapo-branded plug — the latter does not work with Home Assistant, which was a bit annoying, while the former worked well when I used it.

When TP-Link then also got in the news for having “fixed” a vulnerability that allowed Home Assistant to control the Kasa plugs on the local network, I got annoyed enough that I decided to do something about it. (Although it’s good to note that nowadays Home Assistant works fine with the “fixed” Kasa plugs, too.)

I once again asked Srdjan for help, as he has more patience than me for looking up random IoT projects out there, and he suggested the way forward… which turned out a bit more tortuous than expected, so let me try to recount it here, pointing out shortcuts where I can, as of the time of writing.

The Hardware

Things start a bit iffy to begin with — I say that because the hardware I ended up getting was the Gosund SP111 from Amazon, after checking reviews to confirm they were still shipping an older version of the hardware that can be converted with tuya-convert (I’ll get back to the software in a moment). There are plenty of warnings about the new models, but in a very similar fashion to the problem with CGG1 sensors, there’s no way to know beforehand which firmware version you’re going to find.

But in particular, the SP111 is not Gosund’s own engineering work. Tuya is a company that builds “white label” goods for many different brands, and that’s how these plugs were made. You can find them under a number of different brand names, but they all share the same firmware and, critically, the same pairing and update keys.

One of the best parts about these Tuya/Gosund plugs is that they are tiny, in the sense of taking up very little space. They are less than half the size of my first Kasa-branded TP-Link smart plug, and even compared with the newer Tapo-branded one they are quite a bit slimmer. This makes it easy to add them in places that are a bit on the constrained side.

The Firmware

I did not buy these devices to run the original firmware. Unlike the CGG1 and pretty much everything else I’m running at home, this time I went straight for “I want to run ESPHome on them.” The reason for that was that, on one side, I’m annoyed and burnt by the two TP-Link devices, and on the other, the price of Hue-compatible smart plugs, which would have been my default alternative, was annoyingly high.

Among the various things that matter to me, Home Assistant support has grown from a “meh, sure” to “yeah, I want that” — in part because I did manage to set up scripts for quite a few tools at home through it. One of my original concerns (wanting to still be able to control most of these features by voice) is taken care of by signing up for Nabu Casa, which provides integration for both Google Assistant and Alexa, so even with the plug running a custom, and maybe janky, firmware, getting it to work with the rest of the house would be a breeze.

There are other options for open source firmware for the SP111, as well as for other smart plugs. Tasmota is one of the most commonly used, and there’s another called ESPurna. I have not spent time investigating them, because I already have one ESPHome device running, and it was easier on my cognitive load to use something I already kinda know. After all, the plan is to replace the Kasa plug once we take out the Christmas tree, which would remove not one but two integrations from Google and Amazon (and one from Home Assistant).

All of these options can be flashed the first time around with tuya-convert, even though the procedure ended up not being totally clear to me — although in part that was caused by my… somewhat difficult setup. This was actually part of the requirements. Most smart home devices appear to be built around the ESP8266 or ESP32, which means you can, with enough convincing, flash them with ESPHome (I’m still convincing my acrylic lamp circuit board), but quite a few require you to have physical access to the serial port. I wanted something I could flash without having to take it apart.

The way this works, in very rough terms, is that the Tuya-based devices can be tricked into connecting to a local open network, and from there with the right protocol, they can be convinced to dump their firmware and to update it with an arbitrary firmware binary. The tuya-convert repository includes pretty much all the things you need to set this up, neatly packaged in a nearly user-friendly way. I say nearly, because as it turns out there’s so much that can go wrong with it, that the frustration is real.

The Process

Part 1: WiFi

First of all, you need a machine with a WiFi adapter that supports Access Point mode (AP mode/hostapd mode, those are the keywords to look for). This is very annoying to verify, because manufacturers tend to use the same model number across hardware revisions (which may entirely change the chipset) and countries — after all, Matthew ended up turning to Xbox One controller adapters! (And as it turns out, he says they should actually support that mode, with a limited range.)
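
The quickest way I know to check is to look for “AP” in the supported interface modes that iw list reports. A rough Python sketch of that check follows (it assumes the iw tool is installed, and the parsing is deliberately naive, so treat it as a starting point rather than a reliable detector):

```python
# Rough sketch: check whether any local WiFi adapter advertises AP mode.
# Assumes the `iw` utility is installed; the parsing is naive and only meant
# as a starting point, not a robust detection tool.
import subprocess


def supports_ap_mode() -> bool:
    output = subprocess.run(
        ["iw", "list"], capture_output=True, text=True, check=True
    ).stdout
    in_modes = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Supported interface modes"):
            in_modes = True
        elif in_modes:
            if stripped.startswith("*"):
                if stripped.lstrip("* ") == "AP":
                    return True
            else:
                in_modes = False  # we left the modes block
    return False


if __name__ == "__main__":
    print("AP mode supported:", supports_ap_mode())
```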

The usual “go-to” for this is to use a laptop which also has an Ethernet port. Unfortunately I don’t have one, and in particular, I don’t have a Linux laptop anymore. I tried using a couple of Live Distros to set this up, but for one reason or another they were a complete bust (I couldn’t even type on Ubuntu’s Terminal, don’t even ask.)

In my case, I have a NUC on top of my desk, and that’s my usual Linux machine (the one I’m typing on right now!), so I could have used that… except that I had disabled the WiFi in the firmware when I got it, since I wired it up with a cable and didn’t feel the need to keep it enabled. This appears to have been a blessing in disguise for a while, or I would have been frustrated by one of the openSUSE updates in December, when a kernel bug caused the system to become unusable with the WiFi on. Which is exactly what happened once I turned the WiFi back on in the firmware.

Since I don’t like waiting, and I thought it would generally be a good idea to have at least one spare USB WiFi dongle at home (it would have come in useful at least once before), I went and ordered one on Amazon that people suggested might work. Except it probably went through a hardware revision in the meantime, and the one I received wasn’t suitable — or, as some of the reports say, it may depend on the firmware loaded onto it; I really didn’t care to debug that too much once I got to that stage.

Fast forward a few weeks, the kernel bug is fixed, so I tried again. The tuya-convert script uses Docker to set up everything, so it sounded like just installing Docker and docker-compose on my openSUSE installation would be enough, right? Well, no. Somehow the interaction of Docker and KVM virtual machines had side effects on the networking, and when I tried I both lost connectivity to Home Assistant (at least over IPv6), and the tuya-convert script kept terminating by itself without providing any useful information.

So instead, I decided to make my life easier and more difficult at the same time.

Part 1.1: WiFi In A Virtual Machine

I didn’t want Docker to make a mess of my networking setup. I also wasn’t quite sure of what tuya-convert would be doing on my machine (yes, it’s open source, but hey I don’t have time to audit all of it). So instead of trying to keep running this within my normal openSUSE install, I decided to run this in a virtual machine.

Of course I need WiFi in the VM, and as I said earlier, I couldn’t just pass through the USB dongle, because it wouldn’t work with hostapd. But modern computers support PCI pass-through, when IOMMU is enabled. My NUC’s WiFi supports hostapd, and it’s sitting unused, since I connect to the network over a cable.

The annoying part was that, for performance reasons, the IOMMU is disabled by default, at least on Intel CPUs, so you have to add intel_iommu=on to the kernel command line before passing PCI devices through to KVM virtual machines becomes an option. This thread has more information, but you probably don’t need all of it, as it focuses on passing through graphics cards, which is a much more complicated topic.
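
Once intel_iommu=on is in place and you’ve rebooted, a quick way to confirm the IOMMU actually came up, and to see which group the WiFi card landed in, is to look under /sys/kernel/iommu_groups. A minimal sketch of that check (purely illustrative, the same information is available with plain shell tools):

```python
# Illustrative check: list IOMMU groups and the PCI devices in each, to verify
# that intel_iommu=on took effect and to find the group the WiFi card is in.
from pathlib import Path

groups_root = Path("/sys/kernel/iommu_groups")
groups = sorted(groups_root.glob("[0-9]*"), key=lambda p: int(p.name))

if not groups:
    print("No IOMMU groups found: the IOMMU is probably still disabled.")
else:
    for group in groups:
        devices = sorted(d.name for d in (group / "devices").iterdir())
        print(f"group {group.name}: {', '.join(devices)}")
```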

The next problem was which operating system to run in the VM itself. At first I tried the openSUSE LiveDVD — but that didn’t work out: the Docker setup that tuya-convert uses pretty much installs a good chunk of Ubuntu, it runs out of memory quickly, and when it does it throws a lot of I/O errors from the btrfs image loaded into memory. Oops.

Lacking a VM image of openSUSE Tumbleweed, I decided to try their JeOS image instead, which is a stripped-down version meant to be used in virtualized environments. This seemed to fit like a glove at first, but then my plans for simplicity got foiled by my not realizing that virtualized environments usually don’t care for WiFi. On the bright side, the utter lack of NetworkManager or any other WiFi tooling turned out to be handy to make sure that nothing tried to steal the WiFi away from tuya-convert.

In addition to swapping the kernel package for a version that includes the WiFi drivers, you need to install the right firmware package for the card. After that, at least the first part is nearly taken care of — you will most likely need a few more tools, such as Git and your editor of choice. And of course, Docker and docker-compose.

And then, do yourself a favour, and turn off firewalld entirely in your virtual machine. Maybe I should have said earlier “Don’t expose the VM to the Internet at large”, but I hope that if you’ve gotten as far as passing through a WiFi device, you knew better than to do that anyway. It’s not obvious that the firewall is going to get in your way later when you set out to run tuya-convert, but it will make the whole setup fail silently, in the hardest possible way to debug.

Indeed, when I searched for my issues I found tons of posts, bug reports, and blogs all complaining about the same symptom I had, all caused by having a firewall in place. The tuya-convert script does a lot of things to set up its environment, but it can’t easily take down a firewall, and that is a biggie.

Indeed (and I’ll repeat this later), at some point the instructions tell you to connect some other device to the newly created network, and suggest that otherwise it might not be working. My impression is that this is there because, if it doesn’t work, you shouldn’t try taking the next steps yet. But the problem is that there is no note anywhere to help you if it doesn’t work — and the reason for it failing is likely the firewall stopping the DHCP server from receiving the requests. Oops.

Part 2: The Firmware Blob

ESPHome configurations are… sometimes very personal. I have found one for the SP111 on the Home Assistant forums and adapted it, but… I don’t really feel like recommending that one. So I’m afraid I won’t take responsibility for how you configure your ESPHome firmware for the plug.

Also, once you have ESPHome on the device, changing the config is nearly trivial, from the Home Assistant integration, so I feel it’s important to have something working at first, and then worry about perfecting it.

I think some will be confused here about why I’m jumping to configuring the firmware blob before we’ve even converted the device to use it. The reason is that if you have the binary file ready (either built locally, or generated with the Home Assistant integration and downloaded) and you put it into tuya-convert/files/, you will be able to flash that version directly, without going through the intermediate step of using one of the bundled firmware images just to be able to update to an arbitrary firmware later. But for that to work, it needs to happen before you complete the Docker setup.

So, find yourself a working config for the device on the forums (and if you find one that is maintained and templated, so that one can just drop it in and just configure the parameters, please let me know), and generate your ESPHome firmware from there.

Also note that the firmware itself identifies the specific device. This means you cannot flash more than one device with the same firmware or you’ll have quite the headache to sort them out afterwards. Not saying it isn’t possible, but I just found it easier to make the firmware for the devices I was going to flash, and then load each one. As usual, my favourite tool to remember what is what would be my label maker, so that I don’t mix up which one I flashed with which binary.

Part 3: The Conversion

Okay so here’s the inventory of what you should have by this point before we move on to the actual conversion:

  • a virtual machine with a passed-through WiFi card that is supported by hostapd;
  • an operating system in the VM with the drivers for the WiFi, Docker, docker-compose, and no firewall;
  • a checkout of the tuya-convert repository;
  • one or more ESPHome firmware binary files (one per device you want to flash), in the files/ directory of the checkout.

Only at this point can you go and follow the instructions of tuya-convert: create the Docker image, set up docker-compose, and run the image. The firmware files need to be added before creating the Docker image, because docker-compose does not bind the external files/ directory at all.

Once the software starts, it’ll ask you to connect another device (one that is not the plug) to the WiFi. I’m not entirely sure if it’s just for diagnostics, but in any case, you should be able to connect to the network — the password should be flashmeifyoucan, although I don’t think I’ve seen that documented anywhere except while googling around for other people who had issues with the process.

If you try this from your phone, you should be prompted to log into the WiFi network through a captive portal — the portal is just a single page telling you that the setup is complete. If your phone gets stuck in the “Obtaining IP Address” phase, like mine did, make sure you really took down the firewall. This got me stuck for a while because I thought that Docker itself controlled the firewall settings — but that does not appear to be the case.

Final Thoughts

I guess this guide is not going to be very useful, with the new versions of the SP111 not being supported by tuya-convert (and it being unclear whether they can be supported), but since I still have two unused plugs, it helps to have written down the process, to avoid getting myself stuck again.

The plugs appear to have configurable sensors for voltage, amperage, and total wattage used — and the configuration of those is why I’m not comfortable sharing the config I’m using: I took someone’s from a forum post, but I don’t quite agree with some of the choices made; some of the values appear fairly pointless to me.

Voltage monitoring would have been a useful piece of information when I was back in Italy — those who read the blog a long time ago might remember that the power company over there didn’t really provide any decent power. Over here it feels very stable, so I doubt we’ll notice anything useful with these.

Having Smart Home devices that don’t rely on cloud services is much more comfortable than otherwise. I do like the idea of being able to just ask one of the voice assistants to turn off the subwoofer while I’m playing Fallout 76, for sure — but it’s one thing to have the convenience, and another to depend on it to control it. And as I said some time ago, I disagree with the assertion that there cannot be a secure and safe IoT Smart Home (and yes, “secure” and “safe” are two separate concepts).

As for smart plugs in particular? I’m still not entirely sold, but I can see that there definitely are devices where trying to build the smarts into the device itself is unlikely to help. Not as many as you’d think, though — it’s still hard to find something that cannot be served better by more fine-grained control. In the case of the subwoofer, most of the controls (volume, cross-over, phase) are manual knobs on the back of the device. Would it have made sense to have a “smart subwoofer” that can tweak all of those values from the Home Assistant interface? I would argue yes — but at the same time, I can see that in this case a £10 smart plug beats the idea of replacing the subwoofer entirely.

I honestly have doubts about the Christmas tree lights as well. Not that I expect to be able to control them with an app, but the “controller” for them seems to be fairly standard, so I do expect if I search AliExpress for some “smart” controller for those I will probably find something — the question is whether I would find something I can use locally without depending on an external cloud service from an unknown Chinese brand. So maybe I’ll go back to one of my oldest attempts at electronics (13 years ago!) and see what I can find.

By the way, if you’re curious what else I am currently planning to use these smart plugs on… I’m playing with the idea of changing my Birch Books to use 12V LEDs – originally meant for Gunpla and similar models – and I was thinking that instead of leaving it always-on, I can just connect it with the rest of the routines that we use to turn the “living” items on and off.

Kind Software

This post sprouts in part from a comment on my previous post disclaiming my support of FSFE, but it’s a standalone post, not related to my feelings towards FSFE (which I already covered elsewhere). It should also not be a surprise to long-time followers, since I’m going to cover arguments that I have already covered, for better or worse, in the past.

I have not been very active as a Free Software developer in the past few years, for reasons I already spoke about, but that does not mean I stopped believing in the cause or turned away from it. At the same time, I have never been a fundamentalist, and so when people ask me about “Freedom 0”, I’m torn, as I don’t think I quite agree on what Freedom 0 consists of.

On the Free Software Foundation website, Freedom 0 is defined as

The freedom to run the program as you wish, for any purpose (freedom 0).

At the same time, a whole lot of fundamentalists seem to me to try their best not to allow users to run programs as they wish. We wouldn’t, otherwise, be having purity tests and crusades against closed-source components that users may actually want to use, and we wouldn’t have absurdist solutions for firmware that involve shoving binary blobs under the carpet and just never letting the user update them.

The way in which I disagree with both the formulation and the interpretation of this statement is that I think software should, first of all, be usable for its intended purpose, and that software that isn’t… isn’t really worth discussing.

In the case of Free Software, I think that, before any licensing and usage concerns, we should be concerned about providing value to the users. As I said, not a novel idea for me. This means that software that is built with the sole idea of showing Free Software supremacy is not useful software for me to focus on. Operating systems, smart home solutions, hardware, … all of these fields need users to have long-term support, and those users will not be developers, or even contributors!

So with this in mind, I want to take a page out of the literal Susan Calman book, and talk about Kind Software, as an extension of Free Software. Kind Software is software that is meant for the user to use, and that keeps the user as its first priority. I know that a number of people would make this out to be a perfect overlap and contrast, considering all Free Software as Kind Software, and all proprietary software as not Kind Software… but the truth is that it is significantly more nuanced than that.

Even setting aside the amount of Free Software that is “dual-use” and that can be used by attackers just as much as defenders – and that might sometimes have a bit too much of a bias towards the attackers – you don’t need to look much further than the old joke about how “Unix is user friendly, it’s just very selective of who its friends are”. Kind Software wouldn’t be selective — the user’s use cases are paramount, and any software that says “You don’t do that with {software}, because it is against my philosophy” would, by my definition, not be Kind Software.

Although, obviously, this brings us back to the paradox of tolerance, which is why I don’t think I’d be able to lead a Kind Software movement, and why I don’t think that the solution to any of this has to do with licenses, or codes of ethics. After all, different people have different ideas of what is ethical and what isn’t, and sometimes you need to make a choice by yourself, without fighting an uphill battle so that everyone who doesn’t agree with you is labelled an enemy. (Though, if you think that nazis are okay people, you’re definitely not a friend of mine.)

What this tells me is that I can define my own rules for what I consider “Kind Software”, but I doubt I can define them for the general case. And in my case, I have a mixture of Free Software and proprietary software on the list, because I would always select the tools that, first, get their job done, and second, are flexible enough for people to adapt. Free Software makes the latter much easier, but too often the former is not the case, and the value of software that can be easily modified but doesn’t do what I need is… none.

There is more than that of course. I have ranted before about the ethical concerns with selling routers, and I’ve actually been vocal as a supporter of laws requiring businesses to have their network equipment set up by a professional — although with a matching relaxation of the requirements to be considered a professional. So while I am a strong believer in the importance of OpenWRT, I do think that trying to suggest it as a solution for general end users is unkind, at least for the moment.

On the other side of the room, Home Assistant looks to me like a great project, and a kind one at that. The way they handled the recent security issues (in January — pretty much just as I’m writing this) is definitely part of it: they warned users wherever they could, and made sure to introduce safeties so that further bugs in components they don’t even support wouldn’t introduce this very same problem again. And most importantly, they are not there to tell you how to use your gadgets; they are there to integrate with whatever it is possible to integrate with.

This is, by the way, the main part of the reason why I don’t like self-hosting solutions, and why I would categorically consider software needing to be self-hosted as unkind: it puts the burden of it not being abused on the users themselves, and unless their job is literally to look after hosted services, it’s unlikely that they will be doing a good job — and that’s without discussing the fact that they’d likely be using time that they meant to be spending on something else just to keep the system running.

And speaking of proprietary, yet kind, software — I have already spoken about Abbott’s LibreLink and the fact that my diabetes team at the hospital is able to observe my glucose levels remotely, in pretty much real time. This is obviously a proprietary solution, and not a bug-free one at that, and I’m also upset they locked it in, but it is also a kind one: the various tools that don’t seem to care about the expiration dates, that think they can provide a good answer without knowing the full extent of the algorithm involved, and that insist it’s okay not to wait for the science… well, they don’t sound kind to me: they don’t just allow access to personal data, which would be okay, but they present data that might not be right for people to base clinical decisions on, and… yeah, that’s just scary to me.

Again, that’s a personal view on this. I know that some people are happy to try open-source medical device designs on themselves, or be part of multi-year studies for those. But I don’t think it’s kind to expect others to do the same.

Unfortunately, I don’t really have a good call to action here, except to tell Free Software developers to remember to be kind as well, and to think of the implications of the software they write. Sometimes, just because we’re able to throw something out there doesn’t mean it’s the kind thing to do.

Video: unpaper with Meson — pytesting our way out

Another couple of hours spent looking at porting Unpaper to Meson, this time working on porting the tests from the horrible mix of Automake, shell, and C to a more manageable Python testing framework.

I’ll write up a more complex debrief of this series of porting videos, as there’s plenty to unpack out of all of them, and some might be good feedback for the Meson folks to see if there’s any chance to make a few things easier — or at least make it easy to find the right solution.

Also, you can see a new avatar in the corner to make the videos easier to recognize 😀 — the art is from the awesome Tamtamdi, commissioned by my wife as a birthday present last year. It’s the best present ever, and it seemed a perfect fit for the streams.

And as I said at the end, the opening getting ready photo stars Pesto from Our Super Adventure, and you probably saw it already when I posted about Sarah and Stef’s awesome comics.

As a reminder, I have been trying to stream a couple of hours of Free Software every so often on my Twitch channel — and then archiving these on YouTube. If you’re interested in being notified about these happening, I’m usually announcing them with a few hours to spare (rarely more than that due to logistics) on Twitter or Facebook.

Progress Logging and Results Logging

There is one thing that my role as a software mechanic seems to keep drawing me to, and that’s the importance of logging information. Logging is one of those areas that tends to bring up opinions, and with the idea of expanding it into the wider area of observability, it has spawned entire businesses (shout out to friends at Honeycomb.io). But even in smaller realities, I found myself caring about logging, setting up complex routing with metalog, or hoping for a way to access Apache logs in a structured format.

Obviously, when talking about logging in bubbles, there’s a lot more to consider than just which software you send the logs to — even smaller companies nowadays need to be careful with PII, since GDPR makes most data toxic to handle. I can definitely tell you that some of the analysis I used to do for User-Agent filtering would not pass muster for a company at the time of GDPR — in a very similar fashion as the pizzeria CRM.

But leaving aside the whole complicated legal landscape, there’s a distinction in logs that I have not seen well understood by engineers – no matter where they are coming from – and that is the difference between what I call progress logging and results logging. I say that I call them this way because I found a number of other categorizations of logs, but none that matches my thoughts on the matter, and I needed to give them names.

Distinctions that I did hear people talk about are more like “debug logs” versus “request logs”, or “text logs” versus “binary logs”. But this all feels like it’s mixing media and message, in too many cases — as I said in my post about Apache request logs, I would love for structured (even binary would do) request logs, which are currently “simple” text logs.

Indeed, Apache’s (and any other server’s) request logs fit neatly, to me, in the category of results logging. They describe what happened when an action completed: the result of the HTTP request includes some information about the request, and some information about the response. It provides a record of what happened.

If you were to oversimplify this, you could log each full request and each full response, and call that results logging: a certain request resulted in a certain response. But I would expect that there is a lot more information available on the server which does not otherwise make it into the response, for many different reasons (e.g. it might be information that the requestor is not meant to find out, or simply doesn’t need to know, and the response is meant to be as small as possible). In the case of an HTTP request to a server that acts as a reverse proxy, the requestor should not be told which backend handled the request — but it would be a useful thing to log as part of the result.

When looking at the practicality of implementing results logging, servers tend to accrue the information needed for generating the result logs in data structures that are kept around throughout the request (or whatever other process) lifetime, and then extract all of the required information from them at the time of generating the log.
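
To make the pattern a bit more concrete, here is a purely illustrative sketch of that accrual approach: the handler and field names are made up, but the shape (accumulate fields during the request, emit one structured record once the response is ready) is the point.

```python
# Purely illustrative sketch of results logging: accumulate fields over the
# lifetime of a request, then emit a single structured record once the
# response is ready. The handler and field names are made up.
import json
import sys
import time


def handle_request(request):
    result = {"method": request["method"], "path": request["path"]}
    start = time.monotonic()
    try:
        # ... the actual request handling; details accrue as they become known ...
        result["backend"] = "app-server-3"  # never sent back to the requestor
        result["status"] = 200
        return "OK"
    finally:
        result["duration_ms"] = round((time.monotonic() - start) * 1000, 2)
        # One record per completed request: this is the results log.
        print(json.dumps(result), file=sys.stderr)


handle_request({"method": "GET", "path": "/index.html"})
```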

This does mean that if the server terminates (either because it’s killed, the power goes off, or the request caused it to crash), and the result is not provided, then you don’t get a log about the request happening at all — this is the dirty secret of Apache request logs (and many other servers): they’re called request logs but they actually log responses. There are ways around this, by writing parts of the results logs as they are identified – this helps both in terms of persistence and in terms of memory usage (if you’re storing something in memory just because you should be logging it later) but that ends up getting much closer to the concept of tracing.

Progress logs, instead, are closer to what is often called shotgun debugging or printf() debugging. They are log statements emitted as the code goes through them, and they are usually free-form for the developer writing the code. This is what you get with libraries such as Python’s logging, and they can assume a more or less structured form depending on a number of factors. For instance, you can have a single formatted string with maybe the source file and line, or you may have a full backtrace of where the log event happened and what the local variables in each of the function calls were. What usually makes you choose between the two is cost, and signal-to-noise ratio, of course.
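
For contrast, a progress log in its simplest form is just free-form statements sprinkled through the code as it runs. A trivial sketch with Python’s logging module (logger name and messages are, of course, made up):

```python
# Trivial sketch of progress logging with Python's standard logging module:
# free-form statements emitted as the code walks through its steps.
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s:%(lineno)d %(message)s",
)
log = logging.getLogger("fetcher")


def fetch(url):
    log.debug("starting fetch of %s", url)
    # ... the actual work would happen here ...
    log.debug("headers received, reading body")
    log.info("fetched %s", url)


fetch("https://example.com/feed.xml")
```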

For example, Apache’s mod_rewrite has a comprehensive progress log that provides a lot of details of how each rewrite is executed, but if you turn that on, it’ll fill up your server’s filesystem fairly quickly, and it will also make the webserver performance go down the drain. You do want this log, if you’re debugging an issue, but you most definitely don’t want it for every request. The same works for results logs — take for instance ModSecurity: when I used to maintain my ruleset, I wouldn’t audit-log every request, but I had a special rule that, if a certain header was provided in the request, would turn on audit-logging. This allowed me to identify problems when I was debugging a new possible rule.

Unfortunately, my experience straddling open-source development and industry bubbles means I don’t have overall good hopes for an easy way to implement logging “correctly”: both because “correctly” is subjective, and because I really haven’t found a good way to do this that scales all the way from a simple tool like my pdfrename to a complex Cloud-based solution. Indeed, while the former generally cares less about structured logs and request tracing, Cloud software like my planned-and-never-implemented Tanuga would get a significant benefit from using OpenTelemetry to connect feed fetching and rendering.

Flexible and configurable logging libraries, such as those available for Python, Ruby, Erlang, and many more, provide a good “starting point”, but in my experience they don’t scale well across the boundary between the inside and the outside of an organization or unit. It’s a combination of problems similar to the schema issue and the RPC issue: within an organization you can build a convention of what you expect logs to be, and you can pay the cost of updating the configuration of all sorts of tools to do the right thing, but if you’re an end user, that’s unlikely — besides, sometimes that’s untested.

So it makes sense that, up to this day, we still have a significant reliance on “simple”, unstructured text logs. They are the one universally accessible way to provide information to users. But I would argue that we would be better off building an ecosystem of pluggable, configurable backends, where the same tool, without being recompiled or edited, can be made to output either simple text on the standard error stream, or a more structured event log. Unfortunately, judging by how the FLOSS world took the idea of standardizing services’ behaviours with systemd, I doubt that’s going to happen any time soon in the wide world… but you can probably get away with it in big enough organizations that control what they run.

Also, for a fun related tidbit: verbose (maybe excessively so) progress logging is what made reverse engineering the OneTouch Verio so easy for me: on Windows the standard error is not usually readable… unless you run the application through a debugger. So once I did that, I could see every single part of the code as it processed the requests and responses for the device. Now, you could think that just hiding the logs by default, without documenting the flag to turn them on, would be enough — but as it turns out, as long as the logging calls are built into a binary, it’s not too hard to understand them while reverse engineering.

What this is meant to say is that, just because easy access to logs is a great feature for open source tools, and for most internal tools in companies and other institutions, the same cannot be said for proprietary software: indeed, the ability to obfuscate logs, or even generate “encrypted” logs, is something that proprietary software (and hardware) thrive on: it makes it harder to reverse engineer. So it’s no surprise if logs are a complicated landscape with requirements that are not only divergent, but at times opposite, between different stakeholders.

Video: unpaper with Meson — From DocBook to ReStructured Text

I’m breaking the post-on-Tuesday routine to share the YouTube-uploaded copy of the stream I had yesterday on Twitch. It’s the second part of the Unpaper conversion to Meson, which is basically me spending two hours faffing around Meson and Sphinx to update how the manual page for Unpaper is generated.

I’m considering trying to keep up with having a bit of streaming every weekend, just to make sure I get myself some time to work on Free Software. If you think this is interesting, do let me know, as it definitely helps with motivation to know I’m not just spending time that would otherwise be spent playing Fallout 76.

unpaper: re-designing an interface

This is going to be part of a series of posts that will appear over the next few months with plans, but likely no progress, to move unpaper forward. I picked up unpaper many years ago, and I’ve run with it for a while, but besides a general “code health” pass over it, and some back and forth on how to parse images, I’ve not managed to move the project forward significantly at all. And in the spirit of what I wrote earlier, I would like to see someone else pick up the project. It’s the reason why I created an organization on GitHub to keep the repository in.

For those who are not aware, unpaper is a project I did not start — it was originally written by Jens Gulden, who I understand worked on its algorithms as part of university. It’s a tool that processes scanned images of text pages to make them easier to OCR, and it’s often used as a “processing step” by document processing tools, including my own.

While the tool works… acceptably well… it does have a number of issues that always made me feel fairly uneasy. For instance, the command line flags are far from standard, and can’t be implemented with a parser library, relying instead on a lot of custom parsing, and including a very long and complicated man page.

There have also been a few requests to move the implementation to a shared library that could be used directly, but I don’t feel it’s worth the hassle, because the current implementation is not really thread-safe, and making it so would be a significant rework.

So I have been giving it a bit of thought. The first problem is that re-designing the command line interface would mean breaking all of the programmatic users, so it’s not an easy decision to take. Then there’s something else that I learnt about that made me realize I think I know how to solve this, although it’s not going to be easy.

If you’ve been working exclusively on Linux and Unix-like systems, and still shy away from learning about what Microsoft is doing (which, to me, is a mistake), you might have missed PowerShell and its structured objects. To over-simplify, PowerShell piping doesn’t just pipe text from one command to another, but structured objects that are kept structured in and out.

While PowerShell is available for Linux nowadays, I do not think that tying unpaper to it is a sensible option, so I’m not even suggesting that. But I also found out that the ip command (from iproute2) has recently received a -J option, which, instead of printing the usual complex mix of parsable and barely human-readable output, generates a JSON document with the same information. This makes it much easier to extract the information you need, particularly with a tool like jq available, which allows “massaging” the data on the command line easily. I have actually used this “trick” at work recently. It’s a very similar idea to RPC, but with a discrete binary.
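
To give an idea of what this buys you, here is a tiny illustrative example of consuming ip’s JSON output from a script instead of scraping the human-readable text (the field names are as I recall them from iproute2; on the command line, jq fills the same role):

```python
# Tiny example of the structured-output workflow: ask `ip` for JSON output
# (the -j/-J option from iproute2) and pick the data apart in a script,
# instead of scraping its human-oriented text. Field names as I recall them;
# `ip -j addr show | jq '.[].ifname'` does much the same on the command line.
import json
import subprocess

raw = subprocess.run(
    ["ip", "-j", "addr", "show"], capture_output=True, text=True, check=True
).stdout

for iface in json.loads(raw):
    v4 = [a["local"] for a in iface.get("addr_info", []) if a.get("family") == "inet"]
    print(f'{iface["ifname"]}: {", ".join(v4) or "no IPv4 address"}')
```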

So with this knowledge in my head, I have a fairly clear idea of what I would like to have as an interface for a future unpaper.

First of all, it should be two separate command line tools — they may both be written in C, or the first one might be written in Python or any other language. The job of this language-flexible tool is to be the new unpaper command line executable. It should accept exactly the same command line interface of the current binary, but implement none of the algorithm or transformation logic.

The other tool should be written in C, because it should just contain all the current processing code. But instead of having to do complex parsing of the command line interface, it should instead read on the standard input a JSON document providing all of the parameters for the “job”.

Similarly, there are some changes needed to the output of the programs. Some of the information, particularly debugging, that is currently printed on the stderr stream should stay exactly where it is; but for all of the standard output, I think it makes significantly more sense to have the processing tool emit another JSON document, and have the interface convert that to a human-readable form.

Now, with proper documentation of the JSON schema, software using unpaper as a processing step can just build its own job document, and skip the “human” interface. It would even make it much easier to write extensions in Python, Ruby, and any other language, as it would allow exposing a job configuration generator following the language’s own style.
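
To be clear, nothing like this exists today; but purely as an illustration of the idea, a job document and a thin front-end could look something like the sketch below, where every field name, and the unpaper-process tool itself, is made up.

```python
# Hypothetical sketch of the proposed split: a thin front-end builds a JSON
# "job" document and feeds it to a processing tool on its standard input.
# None of these field names exist in unpaper today, and neither does the
# "unpaper-process" tool; they only illustrate the idea.
import json
import subprocess

job = {
    "version": 1,
    "input": {"files": ["scan-001.pbm"], "layout": "single"},
    "output": {"files": ["clean-001.pbm"]},
    "filters": {
        "deskew": {"enabled": True},
        "border-removal": {"enabled": True},
        "noise-filter": {"intensity": 4},
    },
}

# The hypothetical back-end would read the job from stdin and emit another
# JSON document with the results on stdout.
proc = subprocess.run(
    ["unpaper-process"],
    input=json.dumps(job),
    capture_output=True,
    text=True,
    check=True,
)
print(json.loads(proc.stdout))
```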

Someone might wonder why I’m talking about JSON in particular — there’s dozens of different structured data formats that could be used, including protocol buffers. As I said a number of months ago, the important part in a data format is its schema, so the actual format wouldn’t be much of a choice. But on the other hand, JSON is a very flexible format that has good implementations in most languages, including C (which is important, since the unpaper algorithms are implemented in C, and – short of moving to Rust – I don’t want to change language).

But there’s something even more important than the language support, which I already noted above: jq. This is an astounding work of engineering, making it so much easier to handle JSON documents, particularly inline between programs. And that is the killer reason to use JSON for me. Because that gives even more flexibility to an interface that, for the longest time, felt too rigid to me.

So if you’re interested in contributing to an open source project, with no particular timeline pressure, and you’re comfortable writing C — please reach out, whether it is to ask questions for clarification, or with a pull request implementing this idea altogether.

And don’t forget, there’s still the Meson conversion project which I also discussed previously. For that one, some of the tasks are not even C projects! It needs someone to take the time to rewrite the man page in Sphinx, and also someone to rewrite the testing setup to be driven by Python, rather than the current mess of Automake and custom shell scripts.

NewsBlur Review

One of the very, very common refrains I hear in my circles, probably because my circles are full of ex-users of it, and at the same time of Googlers and Xooglers, is that the Internet changed when Google Reader was shut down, and that we will never be able to go back. This is something that I don’t quite buy outright — Google Reader, like most similar tools, was used only by a subset of the general population, while other tools, such as social networks, started being widely used right around the same time.

But in the amount of moaning about Google Reader not existing anymore, I rarely hear enough willingness to look for alternatives. Sure there was a huge noise about options back then, which I previously called the “Google Reader Exodus“, but I rarely hear of much else. I see tweets going by of people wishing that Reader still existed, but I don’t think I have seen many willing to go out of their way to do something about it.

Important aside here: while I did, in effect, work at Google when Reader was shut down, the plan was announced in between me signing my contract and my start date. And obviously it was not something that was decided there and then, but rather a long-term decision taken who knows how long before. So while I was at Google for the “funeral”, I had no say in, or knowledge of, any of it.

Well, the good news is that NewsBlur, which I started using right before the Reader shutdown, is still my favourite tool for this; it’s open source, and it has a hosted service that costs a reasonable $36/year. And it doesn’t even have a referral program, so if you had any suspicion of me shilling, you can set it aside now.

So first of all, NewsBlur has enough layout options to look very much like the Google Reader “of back then” — before Google+ and before the loss of the “Shared Stories” feature. Indeed, it supports both its own list of followers/following, and global sharing of stories on the platform. And you don’t even need to be a user to follow what I share on it, since it also automatically creates a blurblog, which you can subscribe to with whatever you want.

I have in the past used IFTTT to integrate further features, including saving stories to Pocket, and sharing stories on Twitter, Facebook, and LinkedIn. Unfortunately while NewsBlur has great integration, IFTTT is now a $4/month service, which does not have nearly enough features for me to consider subscribing to, sorry. So for now I’m talking about direct features only.

In addition to the sharing features, NewsBlur has what is for me one of the killer features: the “Intelligence Trainer”. Which is not any type of machine learning system, but rather a way for you to tell NewsBlur to hide, or highlight, certain content. This is very similar to a feature I would have wanted twelve years ago: filtering. Indeed, this allowed me to hide my own posts from Gentoo Universe – back when I was involved in the project – and to only read Matthew’s blog posts in one of the many Planets he’s syndicated, like I wanted. But there’s much more to it.

I have used this up to this day to hide repetitive posts (e.g. status updates for certain projects being aggregated together with blogs), to stop reading authors that didn’t interest me, or wrote in languages I couldn’t read. But I also used the “highlighting” feature to know when a friend posted on another Planet, or to get information about new releases or tours from metal bands I followed, through some of the dedicated websites’ feeds.

But where this becomes extremely interesting is when you combine it with another feature that nowadays I couldn’t go without, particularly as so much content that used to be available as blogs, sites, and feeds is becoming newsletters: it’s the ability to receive email newsletters and turn them into a feed. I do this for quite a few of them: the Adafruit Python for Microcontrollers newsletter (which admittedly is also available through their blog), the new tickets alerts from a bunch of different venues (admittedly not very useful this year), Tor.com, and Patreon.

And since the intelligence trainer does not need to have tags or authors to go along, but can match a substring in the title (subject), this makes it an awesome tool to filter out certain particular messages from a newsletter. For instance, while I do support a number of creators on Patreon, a few of them share all their public videos as updates — I don’t need to see those in the Patreon feed, as I get them directly at source, so I can hide those particular series from the Patreon feed for myself. And instead, while I can wait for most of the Tor.com releases, I do want to know quickly if they are giving away a free book, or if there’s a new release from John Scalzi that I missed. And again, the highlighting helps me there: it makes a green counter appear next to the “feed”, that tells me there’s something I want to look at sooner, rather than later.

As I said the intelligence trainer doesn’t have to use tags — but it can use them if they are there at all. So for instance for this very blog, if I were to post something in Italian and you wouldn’t be able to read it, you could train NewsBlur to hide posts in Italian. Or if you think my opinions are useless, you can just hide those, too.

But this is not where it ends. Besides having an awesome implementation of HTTP, which supports all the bandwidth-saving optimizations I know of, NewsBlur thinks about the user a lot more than Google Reader ever did. Whenever you decide to do some spring cleaning of your subscriptions, NewsBlur will send you by email an OPML file with all of your subscribed feeds as they were before you made the first change (of the day, I think). That way you never risk deleting a subscription without having a way to find it again. And it supports muting sites, so you don’t need to unsubscribe to avoid a high count of unread posts from, say, a frequent flyers’ blog during a pandemic.

Plus it’s extremely tweakable and customizable — you can choose to see the stories as they appear in the feed, or load into a frame the original website linked by the story, or try to extract the story content from the linked site (the “reader mode”).

Overall, I can only suggest to those who keep complaining about Google Reader’s demise that it’s always a good time to join NewsBlur instead.

Software Defined Remote Control

A number of months ago I spoke about trying to control a number of TV features in Python. While I did manage to get some of the adapter boards I thought I would use printed, I hadn’t had the time to work on the software to control this before we started looking for a new place, which meant I shelved the project until we could get to the new place; once we got there it was a matter of getting settled in, and then… you get the idea.

As it turns out, I had one week free at the end of November — my employer decided to give us three extra days on the (US) Thanksgiving week, and since my birthday was at the end of the week, I decided to take the remaining two days off myself to make it a nice nine contiguous days off. The perfect timeframe to go and hack on some projects such as this one.

Also, one thing changed significantly since the time I started thinking about this: I started using Home Assistant. And while it started mostly as a way for me to keep an eye on the temperature of the winter garden, I found that with a bit of configuration, and a pull request, changing the input on my receiver with it was actually easier than using the remote control and trying to remember which input was mapped to what.

That finally gave me the idea of how to implement my TV input switch tree: expose it as one or more media players in Home Assistant!

Bad (Hardware) Choices

Unfortunately, as soon as I went to start implementing the switching code, I found out that I had made a big mistake in my assumptions: the Adafruit FT232H breakout board does not support PWM outputs, including the general time-based pulsing (without a carrier frequency). Indeed, while the Blinka library can technically support some of the features, it seems like none of the Linux-running platforms would be able to manage that. So there goes my option of just using a computer to drive the “fake remote” outputs directly. Well, at least not without rewriting it in some other language and finding a different way to send that kind of signal.

I looked around for a few more options, but all of them ended up being some kind of compromise: MicroPython doesn’t have a very usable PulseOut library as far as I could tell; Arduino-based libraries don’t seem to allow two outputs to happen at roughly the same time; and as I’m sure I already noted in passing, CircuitPython lacks a good “secondary channel” to be instructed from a computer (the serial interface is shared with the REPL control, and the HID is gadget-to-host only).

After poking around a few options and very briefly considering writing my own C version on an ATmega, I decided to just go for the path of least resistance, and go back to CircuitPython, and try to work with the serial interface and its “standard input” to the software.

The problem with doing that is that Ctrl-C is meant to interrupt the running program, which means you cannot send the byte 0x03 unescaped. In the end I thought about it, and decided that CircuitPython is powerful enough that just sending the commands in ASCII wouldn’t be an issue. So I decided to write a simplistic Flask app that would take a request over HTTP and send the command via the serial port. It worked, sort of. Sometimes while debugging I would end up locking the device (a Trinket M0) in the REPL, and that meant the commands wouldn’t be sent.

The solution I came up with was to reset the board every time I started the app, by sending Ctrl-C and Ctrl-D (0x03, 0x04) to force the board to reset. It worked much better.
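
For illustration only, the general shape of this is something like the following sketch. It is not the actual app: the device path, the endpoint, and the command format are all made up, and it assumes pyserial and Flask.

```python
# Rough sketch, not the actual app: a tiny Flask front-end that forwards
# ASCII commands to a CircuitPython board over its serial console, resetting
# the board (Ctrl-C, Ctrl-D) at startup. Device path, endpoint, and command
# format are all made up for illustration; requires pyserial and Flask.
import serial  # pyserial
from flask import Flask

app = Flask(__name__)
port = serial.Serial("/dev/ttyACM0", 115200, timeout=1)

# Interrupt whatever is running and soft-reset the board, so it is not left
# sitting at the REPL after a previous debugging session.
port.write(b"\x03\x04")


@app.route("/command/<name>")
def command(name):
    # Commands are plain ASCII lines, so a raw 0x03 never needs to be sent.
    port.write(f"{name}\r\n".encode("ascii"))
    return {"sent": name}


if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```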

Not-Quite-Remote Controlled HDMI Switch

After that worked, the problem was ensuring that the commands I sent actually did something. The first component I needed to send commands to was the HDMI switch. It’s a no-brand AliExpress-special HDMI switch, but it has one very nice feature for what I need to do right now: it obviously has an infrared remote control – one of those thin, plasticky dome ones – but in particular it has the receiver for it on a cord, connected with a pretty much standard 3.5mm “audio jack”.

This is not uncommon. Randomly searching Amazon or AliExpress for “HDMI switch remote” can find you a number of different, newer switches that use the same remote receiver, or something very similar to it. I’m not sure if the receivers are compatible between each other, but the whole idea is the same: by using a separate receiver, you can stick the HDMI switch behind a TV, for instance, and just make the receiver poke from below. And most receivers appear to be just a dome-encased TSOP17xx receiver, which is a 3-pin IC, which works great for a TRS.

When trying this out, I found that I could use a Y-cable to allow both the original receiver and my board to send signals to the switch — at which point, I can send in my own pulses, without even bothering with the carrier frequency (refer to the previous post for details on this, it’s long). The way the signal is sent, the pulses need to ground the “signal” line (which normally sits at 5V); to avoid messing up the different supplies, I paired it with an opto-coupler, since they are shockingly cheap when bought in bulk.

But when I tried setting this up to select an input, I found myself unable to get the switch to see my signal. This turned out to require an annoying physical debugging session with the Saleae and my TRRS-to-Saleae adapter (which I have still not released, sorry folks!), which showed I was a bit off on the timing of the NEC protocol the switch used for its remote control. This is now fixed in the pysirc library that generates the pulses.

Once I got the input selector working for the switch with the Flask app, I turned to Home Assistant and added a custom component that exposes the switch as a “media_player” platform. In a constant state of “Idle” (since it doesn’t have a concept of on or off), it allowed me and my wife to change the input while seeing the names of the devices, without hunting for the tiny remote, and without having to dance around to be seen by the receiver. It was already a huge improvement.
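
The component itself is really not much more than source selection. A heavily simplified sketch of that kind of entity (not the actual code; it assumes Home Assistant’s media_player entity API as I remember it at the time, plus the hypothetical HTTP bridge from the earlier sketch) looks something like this:

```python
# Heavily simplified, illustrative sketch of a source-selection-only
# media_player entity (not the actual component). It assumes Home Assistant's
# MediaPlayerEntity API and the hypothetical HTTP bridge from the earlier
# Flask sketch; the source names are placeholders.
import requests
from homeassistant.components.media_player import MediaPlayerEntity
from homeassistant.components.media_player.const import SUPPORT_SELECT_SOURCE
from homeassistant.const import STATE_IDLE

SOURCES = {"Input 1": 1, "Input 2": 2, "Input 3": 3}


class HdmiSwitch(MediaPlayerEntity):
    """Exposes the dumb HDMI switch as a source selector."""

    def __init__(self):
        self._source = None

    @property
    def name(self):
        return "HDMI Switch"

    @property
    def state(self):
        return STATE_IDLE  # the switch has no concept of on or off

    @property
    def supported_features(self):
        return SUPPORT_SELECT_SOURCE

    @property
    def source_list(self):
        return list(SOURCES)

    @property
    def source(self):
        return self._source

    def select_source(self, source):
        # Hypothetical endpoint on the Flask bridge sketched earlier.
        requests.get(f"http://127.0.0.1:5000/command/input{SOURCES[source]}", timeout=5)
        self._source = source
```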

But it wasn’t quite where I wanted it to be yet. In particular, when our family calls on Messenger, we would like to be able to just turn on the TV already set to the right input. While this was partially possible (Google Assistant can turn on a TV with a Chromecast), and we could have tried wiring up the Nabu Casa integration to select the input on the HDMI switch, it would not have worked right if the last thing we used the TV for was the Nintendo Switch (not to be confused with the HDMI switch) or Kodi — those are connected via a Yamaha receiver, on a different input of the TV set!

Enter Sony

But again, this was supposed to be working — the adapter board included a connection for an infrared LED, and that should have worked to send out the Sony SIRC commands. Well, except it didn’t, and that turned out to be another wild goose chase.

First, I was afraid that when I fixed the NEC timing I broke the SIRC ones — but no. To confirm this, and to make the rest of my integration easier, I took the Feather M4 to which I hard-soldered a Sony-compatible IR LED, and wrote what is the eponymous software defined remote control: a CircuitPython program that includes a few useful commands, and abstractions, to control a Sony device. For… reasons, I have added VCR as the only option beside TV; if you happen to have a Bluray player by Sony, and you want to figure out which device ID it uses, please feel free.

It might sound silly, but I remember seeing a UX research paper from the ’90s about using gesture recognition on a remote control’s touchpad to allow for more compact remote controls. Well, if you wanted, you could easily turn this CircuitPython example into a touchscreen remote control for any Sony device, as long as you can find all the right device IDs, and hard-code a bunch of additional commands.
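
To show how little there is to the “software defined” part, here is a stand-alone sketch of the 12-bit SIRC encoding, reduced to computing mark and space durations. This is neither the pysirc API nor the actual CircuitPython program, just the commonly documented protocol timings:

```python
# Stand-alone illustration of 12-bit SIRC encoding, reduced to computing the
# mark/space durations in microseconds. This is neither the pysirc API nor
# the actual CircuitPython program, just the commonly documented timings.

def sirc12_pulses(command, device):
    """Return alternating mark/space durations (µs), starting with a mark."""
    pulses = [2400, 600]                            # start burst and its space
    bits = [(command >> i) & 1 for i in range(7)]   # 7-bit command, LSB first
    bits += [(device >> i) & 1 for i in range(5)]   # 5-bit device, LSB first
    for bit in bits:
        pulses += [1200 if bit else 600, 600]       # 1 = long mark, 0 = short mark
    return pulses


# Example: the commonly documented power toggle (command 21) on device 1 (TV).
# These durations then get modulated onto the 40kHz carrier by whatever drives
# the LED.
print(sirc12_pulses(21, 1))
```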

So, once I knew that at least on the software side I was perfectly capable of controlling the Sony TV, I had to go and do more hardware debugging with the Saleae, but this time with the probes directly on the breadboard, as I had no TRS cable to connect to. And that was… a lot of work, rewiring stuff and trying again.

The first problem was that the carrier frequency was totally off. The SIRC protocol specifies a 40kHz carrier frequency, which is supposedly easier to generate than the 38kHz used by NEC and others, but somehow the Saleae was recording it as a very variable frequency that oscillated between 37kHz and 41kHz. So I was afraid that trying to run two PWM outputs on the Trinket M0 was a bad idea, even if one of them was set to nought hertz — as I said, the HDMI switch didn’t need a carrier frequency.

I did toy briefly with the idea of generating the 40kHz carrier wave separately, and just gating it with the same type of signal I used for the HDMI switch. Supposedly, 40kHz generators are easy, but at least the circuits I found at first glance require a part (a 640kHz resonator) that is nearly impossible to find in 2020. It probably fell out of use. But as it turns out, it wouldn’t have helped.

Instead, I took another Feather. Since I had run out of M4s, except for the one I had already hardwired an IR LED to, I pulled out the nRF52840 that I had bought and barely played with. This should have been plenty capable of giving me a clean 40kHz signal, and it indeed was.

At that point I noticed another problem, though: I had totally screwed up the adapter board. On my Feather M4, the IR LED was connected directly between 3V and the transistor switching it. A bit out of spec, but not uncommon given that it’s only flashed for very brief impulses. On the other hand, when I designed the adapter, I connected it to the 5V rail. Oops, that’s not what I was meant to be doing! And I did indeed burn out the IR LED with it, so I had to solder a new one onto the cable.

Once I fixed that, I found myself hitting another issue: I could now turn the TV on and off with my app, but the switch stopped responding to commands, either from the app or from the original remote! Another round of Saleae (that’s probably one of my favourite tools — yes, I splurged when I bought it, but it’s turning out to be an awesome tool to have around, after all), and I found that the signal line was being held low — because the output pin was stuck high…

I have not tried debugging this further yet — I can probably reproduce this without my whole TV setup, so I should do that soonish. It seems like opening both lines for PWM output causes some conflict, and one or the other ends up not actually working. My workaround was to only allow one command before restarting the Feather. It meant taking longer to complete the commands, but it allowed me to continue with my life without further pain.

One small note here: since I wasn’t sure how Flask concurrency would interact with accessing a serial port, I decided to try something a bit out of the ordinary, and set up access to the Feather via an Actor using pykka. It basically means leaving one thread with direct access to the serial port, and queuing commands as messages to it. It seems to be working fine.
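
In case the actor approach sounds more exotic than it is, this is a minimal sketch of the idea, with a made-up serial device path and command format:

```python
# One pykka ThreadingActor owns the serial port; Flask handlers send it
# messages instead of touching the port directly. Device path and command
# strings are illustrative.
import pykka
import serial


class FeatherActor(pykka.ThreadingActor):
    def __init__(self, device: str = "/dev/ttyACM0") -> None:
        super().__init__()
        self._device = device
        self._port = None

    def on_start(self) -> None:
        # Runs inside the actor's own thread, which then owns the port.
        self._port = serial.Serial(self._device, 115200, timeout=1)

    def send_command(self, command: str) -> str:
        # Messages are processed one at a time, so concurrent Flask requests
        # are serialised automatically.
        self._port.write(command.encode("ascii") + b"\n")
        return self._port.readline().decode("ascii").strip()


# In the Flask app: one actor, shared by all request handlers.
feather = FeatherActor.start().proxy()
reply = feather.send_command("tv power_on").get()  # .get() waits for the reply
```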

Wrapping It All Up

Once the app was able to send arbitrary commands to the TV via infrared, as well as changing the input of the HDMI switch, I extended the Home Assistant integration to include the TV as a “media_player” entity as well. The commands I implemented were Power On and Off (discrete, rather than a toggle, which means I can send a “Power On” to the TV when it’s already on and not bother it), and discrete source selection for the three sources we actually use (HDMI switch, Receiver, Commodore 64). There are a lot more commands I could theoretically send, including volume control, but I can already access those via the receiver, so there’s no good reason to.

After that it was a matter of scripting some more complicated acts: direct selection of Portal, Chromecast, Kodi, and Nintendo Switch (which are the four things we use the most). This was easy at that point: turn on the TV (whether it was on or not), select the right input on either the receiver or the switch, then select the right input on the TV. The reason why the order seems a bit strange is that it takes a few seconds for the TV to receive commands after turning on, but by doing it this way we can switch between Chromecast and Portal, or Nintendo Switch and Kodi, in pretty much no time.

And after that worked, we decided the $5/month to Nabu Casa was worth it, because that allows us to ask Alexa or Google Assistant to select the input for us, too.

Eventually, this led me to replace Google’s “Turn off the TV” command in our nightly routine with one that triggers a Home Assistant script, too. Previously, it would issue the command to the Chromecast, routing through the whole of Google’s cloud services between the device that took the request and the Chromecast. And then the Chromecast would send the CEC command to power off… except that it wouldn’t reach the receiver, which would stay on for another two hours until it finally decided it was time to turn off.

With the new setup, Google triggers the Home Assistant script, and appears to do that asynchronously. Then Home Assistant sends the request to my app, which sends it to the Feather, which sends the power off to the TV… which is also read by the receiver. I didn’t even need to send the power off command to the receiver itself!

All in all, the setup is satisfying.

What remains to be done is to try exposing a “Media Player” to Google Home, that is not actually any of the three “media_player” entities I have, but is a composite of them. That way, I could actually just expose the different input trees as discrete inputs to Google, and include the whole play, pause, and volume control that is currently missing from the voice controls. But that can wait.

Instead, I should probably get going on designing a new board to replace the breadboard mess I’m using right now. It’s hidden away enough that it’s not in our face (unlike the Birch Books experiments), but I would still like having a more… clean setup. And speaking of that, I would really love it if someone had already contributed an Adafruit Feather component for EAGLE, providing the space for soldering in the headers, but keeping the design referencing the actual lines as defined in it.

Senior Engineering: Open The Door, Move Away

As part of my change of bubble this year, I officially gained the title of “Senior” Engineer. Which made me take the whole “seniority” aspect of the job more seriously than I did before. Not because I’m aiming at running up the ladder of seniority, but because I feel it’s part of the due diligence of my job.

I have had very good examples in front of me for most of my career — and a few not-so-great ones, if I am to be honest. And so I’ve been trying to formulate my own take on what a senior engineer is, based on these. You may have noticed me talking about adjacent topics in my “work philosophy” tag. I have also been comparing this in my head with my contributions to Free Software, and in particular to Gentoo Linux.

I retired from Gentoo Linux a few years ago, but realistically, I had stopped being actively involved in 2013, after joining the previous bubble. Part of it was a problem with contributing, part of it was a lack of time, and part of it was having the feeling that something was off. I’m starting to feel I have a better impression now of what it was, and it relates to that seniority that I’m reflecting on.

You see, I worked on Gentoo Linux a little longer than I worked at the previous bubble, and as such I could say that I became a “senior developer” by tenure, but I didn’t really gain the insight to become a “senior developer” in deeds. This haunts me because I feel it was a wasted opportunity, despite the fact that it taught me many of the things I needed to be even barely successful in my current job.

It Begins Early

My best guess is that it comes down to my having started working on Gentoo Linux when I was fairly young and with pretty much no social experience, which, combined with the less-than-perfect work environment of the project, had me develop a number of bad habits that took a very long time to grow out of. That is not to say that age by itself is a significant factor in this — I still resent the remark from one of the other developers that not having kids would make me a worse lead. But I do think that if I hadn’t grown up staying by myself in my own world, maybe I would have been able to do a better job.

I know people my age and younger that became very effective leaders years ago — they’ve got the charisma and the energy to get people on board, and to have them all work for a common goal in their own way. I don’t feel like I ever managed that, and I think it’s because for the longest time, the only person who I had to convince to do something was… myself.

I grew up quite lonely — in elementary school, while I can say I did have one friend, I didn’t really join the other kids. It’s a bit of a stereotype for the lonely geek, but I have been made fun of since early on for my passion for computers, and for my dislike of soccer – I feel a psychiatrist would have a field day figuring out that and the relationship with my father – and I failed at going to church and Sunday school, which was the only out-of-school mingling for most of the folks around.

Nearly thirty years later I can tell you that the individualism I got out of this, while it gave me a few headstarts in life when it comes to technical knowledge, held me back long-term on the people skills needed to herd the cats and multiply my impact. It’s not by chance that I wrote about teamwork and, without using the word, individualism.

Aside: I’m Jealous of Kids These Days

As an unrelated aside, this may be the reason why I don’t have such a negative view of social networks in general. It was something I was actually asked when I switched jobs — what my impression of the current situation is… and my point rolls back to that: when I was growing up we didn’t have social networks, the Internet was a luxury, and while, I guess, BBSes were already a thing, they would still have been too expensive for me to access. So for me, it took until I managed to get an Internet connection, and discovered Usenet, to reach out beyond my geographical proximity.

I know there’s a long list of issues with all kinds of social networks: privacy, polarisation, fake news, … But at the same time I’m glad that they make it much more approachable for kids nowadays, who don’t fit in with the crowd in their geographical proximity, to reach out to friendlier bunches. Of course it’s a double-edged sword, as it also allows bullies to bully more effectively… but I think that’s much more of a society-at-large problem.

The Environment Matters

Whether we’re talking about FLOSS projects or different teams at work, the environment around an individual matters. That’s because the people around them will exert influence, both positive and negative. In my case, with hindsight, I feel I hung around the wrong folks for too long, in Gentoo Linux and later on.

While a number of people I met on the project have exerted, again with hindsight, a good, positive influence on my way of approaching the world, I can also tell you now that there are some “go-to behaviours” that go the wrong way. In particular, while I’ve always tended to be sarcastic and an iconoclast, I can tell you that in my tenure as a Gentoo Linux developer I crossed the line from “snarky” to “nasty” a lot of times.

And having learnt to avoid that, and keeping in check how close to that line I get, I also know that it is something connected to the environment around me. In my previous bubble, I once begged my director to let me change team despite having spent less than the two years I was expected to be on it. The reason? I caught myself becoming more and more snarky, getting close to that line. It wouldn’t have served either me or the company for me to stay in that environment.

Was it a problem with the team as a whole? Maybe, or maybe I just couldn’t fit into it. Or maybe it was a single individual who fouled the mood for many others. Donnie’s talk does not apply only to FLOSS projects, and The No Asshole Rule is still as relevant a book as ever in 2020. Just like in certain projects, I have seen teams in which the majority of the engineers explicitly walked away from certain areas, just to avoid having to deal with one or two people.

Another emergent behaviour with this is the “chosen intermediate person” — a dysfunction I have seen in multiple projects and teams, where a limited subset of team members is used to “relate” to another individual either within, or outside, the team. I have been that individual in my first year of high school, with the chemistry teacher — we complained loudly about her being a bad teacher, but now I can say that she was probably a bigger expert in her field than most of the other chemistry teachers in the school; she was just terrible with people. Since I was just as bad, it seemed like I was the best interface with her, and when the class needed her approval to go on a field trip, I was “volunteered” to be the person going.

I’ll get back later to a few more reasons why tolerating “brilliant but difficult to work with” people in a project or team is unhealthy in further ways, but I want to make a few more points here, because this can be a contentious topic due to cultural differences. I have worked with a number of engineers in the past who would be described as assholes by some, and as grumpy by others.

In general, I think it’s worth giving people the benefit of the doubt, at first — but make sure that they are aware of it! Holding people to standards they are not aware of, and have no way to course-correct around, is not fair and will stir further trouble. And while some level of civility can be assumed, in my experience projects and teams that are heavily anglophone tend to assume a lot more commonality in expectations than is fair.

Stop Having Heroes

One of the widely known shorthands at the old bubble was “no heroes” — a reference to a slide deck from one of the senior engineers in my org on the importance of not relying on “heroes” looking after a service, a job, or a process. Individuals that will step in at any time of day and night to solve an issue, and demonstrate how they are indispensable for the service to run. The talk is significantly more nuanced than my summary right now, so take my words with a grain of salt of course.

While the talk is good, I have noticed the shorthand being used a little too often just to tell people to stop doing what they think is the right thing, and to leave rakes all around the place. So I have some additional nuances of my own for it, starting with the fact that I find it a very bad sign when a manager uses the shorthand with their own reports — that’s because one of my managers did exactly that, and I know that it doesn’t help. Calling “no heroes” between engineers is generally fair game, and if you call it on your own contributions, that’s awesome, too! «This is the last time I’m fixing this, if nobody else prioritizes this, no heroes!»

On the other hand, when it’s my manager telling me to stop doing something and “let it break”, well… how does that help anyone? Yes, it’s in the best interest of the engineer (and possibly the company) for them not to be the hero that steps in, but why is this happening? Is the team relying on this heroism? Is the company relying on it? What’s the long-term plan to deal with that? Those are all questions that the manager should at least ask, rather than just tell the engineer to stop doing what they are doing!

I’ve been “the hero” a few times, both at work and in Gentoo Linux. It’s something I have always been ambivalent about. On the one hand, it feels good to be able to go and fix stuff yourself. On the other, it’s exhausting to feel like the one person holding up the whole fort. So yes, I totally agree that we shouldn’t have heroes holding up the fort. But since it still happens, it can’t be left just up to an individual to remember to step back at the right moment to avoid becoming a hero.

In Gentoo Linux, I feel the reason why we ended up with so many heroes was the lack of coordination between teams, and the lack of general integration — the individualism all over again. And it reminds me of a post from a former colleague about Debian, because some of the issues (very little mandated common process, too many different ways to do the same things) are the kind of “me before team” approaches that drive me up the wall, honestly.

As for my previous bubble, I think the answer I’m going to give is that the performance review process as I remember it (hopefully it has changed in the meantime) should be held responsible for most of it, because of just a few words: go-to person. When looking at performance review as a checklist (which you’re told not to, but clearly a lot of people do), at least for my role, many of the levels included “being the go-to person”. Not a go-to person. Not a “subject matter expert” (which seems to be the preferred wording in my current bubble). But the go-to person.

From being the go-to person, to being the hero, to building up a cult of personality, the steps are not that far apart. And this is true in the workplace as well as in FLOSS projects — just think about it, and you can probably figure out a few projects that became synonymous with their maintainers or authors.

Get Out of The Way

What I feel Gentoo Linux taught me, and in particular leaving Gentoo Linux taught me, is that the correct thing for a senior engineer to do is to know when to bow out. Or move onto a different project. Or maybe it’s not Gentoo Linux that taught me that.

But in general, I still think the most important lesson is to know how to open the door and get out of the way. And I mean it: both parts are needed. It’s not just a matter of moving on when you feel like you’ve done your part — you need to be able to also open the door (and make sure it stays open) for the others to pass through it, as well. That means planning to get out of the way, not just disappearing.

This is something that I didn’t really do well when I left Gentoo Linux. While I eventually did get out of the way, I didn’t really fully open the door. I started, and I’m proud of that, but I think I should have done this better. The blogs documenting how the Tinderbox worked, as well as the notes I left about things like the USE-based Ruby interpreter selection, seem to have been useful for others to pick up where I left off… but not in a very seamless way.

I think I did this better when I left the previous bubble, by making sure all of the stuff I was working on had breadcrumbs for the next person to pick up. I have to say it did make me feel warm inside to receive a tweet, months after leaving, from a colleague announcing that the long-running deprecation project I had worked on was finally completed.

It’s not an easy task. I know a number of senior engineers who can’t give up their one project — I’ve been that person before, although, as I said, I hadn’t really considered myself a “senior” engineer before. Part of it is wanting to keep the project working exactly like I want it to, and part of it is feeling attached to the project and wanting to be the person grabbing the praise for it. But I have been letting go of these as much as I could in the past few years.

Indeed, while some projects thrive under benevolent dictators for life, teams at work don’t tend to work quite as well. Those dictators become gatekeepers, and the projects can end up stagnating. Why does this happen more at work than in FLOSS? I can only venture a guess: FLOSS is a matter of personal pride — and you can “show off” having worked on someone else’s project at any time, even though it might be more interesting to “fully make the project one’s own”. On the other hand, if you’re working at a big company, you may optimise for working on projects where you can “own the impact” by the time you bring it up at performance review.

The Loadbearing Engineer

When senior engineers don’t move away after opening the door, they may become “loadbearing” — they may be the only person knowing how something works. Maybe not willingly, but someone will go “I don’t know, ask $them” whenever a question about a specific system comes by.

There’s also the risk that they may want to become loadbearing, to become irreplaceable, to build up job security. They may decide not to document the way a certain process runs, the reason why certain decisions were made, or the requirements of certain interfaces. If you happen to want to do something without involving them, they’ll be waiting for you to fail, or maybe they’ll manage to stop you from breaking an important assumption in the system at the last moment. This is clearly unhealthy for the company, or project, and risky for the person involved, if they turn out not to be quite as indispensable.

There’s plenty already written on the topic of bus factor, which is what this fits into. My personal take on this is to make sure that those who become “loadbearing engineers” take at least one long vacation a year. Make sure that they are unreachable unless something goes very wrong — as in, business-destroying wrong. And make sure that they don’t just mark themselves as out of office while staying glued to their work phone and computer. And yes, I’m talking about what I did to myself a couple of times over my tenure at the previous bubble.

That is, more or less, what I did by leaving Gentoo as well — I had been holding the QA fort for so long that it was a given that, no matter what was wrong, Flameeyes was there to save the day. But no, eventually I wasn’t, and someone else had to go and build a better, scalable alternative.

Some of This Applies to Projects, Too

I don’t mean it as “some of the issues with engineers apply to developers”. That’s a given. I mean that some of the problems happen to apply to the projects themselves.

Projects can become the de facto sole choice for something, leaving every improvement behind because nobody can approach them. But if something happens and they are not updated further, it might just give enough of a push that they get replaced. This has happened to many FLOSS projects in the past, and it’s usually a symptom of a mostly healthy ecosystem.

We have seen how XFree86 becoming stale led to Xorg being fired up, which in turn brought us a significant number of improvements, from the splitting apart of the big monolith, to XCB, to compositors, to Wayland. Apache OpenOffice has been pretty much untouched for a long time, but that gave us LibreOffice. GCC having refused plugins for long enough put more wood behind Clang.

I know that not everybody would agree that the hardest problems in software engineering are people problems, but I honestly have that feeling at this point.

Computer-Aided Software Engineering

Fourteen years ago, fresh from translating Ian Sommerville’s Software Engineering (no, don’t buy it, I don’t find it worth it), and approaching the FLOSS community for the first time, I wrote a long article for the Italian edition of Linux Journal on Computer-Aided Software Engineering (CASE) tools. Recently, I decided to post that article on the blog, since the original publisher is gone, and I thought it would be useful to just have it around. And because the OCR is not really reliable, I ended up having to retype a good chunk of it.

And that reminded me of how, despite my having been wrong a lot of times before, some of those ideas stuck with me and I still find them valid. CASE is one of those, even though a lot of the time we’re not really talking about the tools involved as CASE.

UML is the usual example of a CASE tool — it confuses a lot of people because the “language” part suggests it’s actually used to write programs, but that’s not what it is for: it is a way to represent similar concepts in similar ways, without having to re-explain the same iconography. Sequence diagrams, component diagrams, and entity-relationship diagrams standardise the way you express certain relationships and information. That’s what it is all about — and while you could draw all of those diagrams without any specific tool, with LibreOffice Draw, or Inkscape, or Visio, specific tools for UML are meant to help (aid) you with the task.

My personal preferred tool for UML is Visual Paradigm, which is a closed-source, proprietary solution — I have not found a good open source toolkit that could replace it. PlantUML is an interesting option, but it doesn’t have nearly all the aid that I would expect from an actual UML CASE tool — you can’t properly express relationships between different components across diagrams, as you don’t have a library of components and models.

But setting UML aside, there’s a lot more that should fit into the CASE definition. Tools for code validation and review, which are some of my favourite things ever, are also aids to software engineering. And so are linters, formatters, and sanitizers. It’s easy to just call them “dev tools”, but I would argue that, particularly when it comes to automating the code workflows, it makes sense to consider them CASE tools, and reduce the stigma attached to the concept of CASE, particularly in the more “trendy” startups and open source — where I still feel pushback against using UML, auto-formatters, and integrated development environments.

Indeed, most of these tools are already considered their own category: “developer productivity”. Which is not wrong, but it does significantly reduce the impact they have — it’s not just about developers, or coders. I like to say that Software Engineering is a teamwork practice, and not everybody on a Software Engineering team would be a coder — or a software engineer, even.

A proper repository of documents, kept up to date with the implementation, is not just useful for the developers that come later, and need to implement features that integrate with the already existing system. It’s useful for the SRE/Ops folks who are debugging something on fire, and are looking at the interaction between different components. It’s useful to the customer support folks who are being asked why only a particular type of request is failing in one of the backends. It’s useful to the product managers, to have a clear picture of which use cases are implemented for the service, and which components are involved in specific user journeys.

And it similarly extends to other types of tools — a code review tool that can enforce updates to the documentation; a dependency tracking system that can match known vulnerabilities; a documentation repository that allows full reviews; an issue tracker that can identify who most recently changed the code affecting the component an issue was filed on.

And from here you can see why I’m sceptical about single-issue tools being “good enough”. Without integration, these tools are only as useful as the time they save, and often that means they are “negatively useful” — it takes time to set up the tools, to remember to run them, and to address their concerns. Integrated tools instead can provide additional benefits that go beyond their immediate features.

Take a linter as an example: a good linter with low false positive rate is a great tool to make sure your code is well written. But if you have to manually run it, it’s likely that, in a collaborative project, only a few people will be running it after each change, slowing them down, while not making much of a difference for everyone else. It gets easier if the linter is integrated in the editor (or IDE), and even easier if it’s also integrated as part of code review – so those who are not using the same editor can still be advised by it – and it’s much better if it’s integrated with something like pre-commit to make it so the issues are fixed before the review is sent out.

And looking at all these pieces together, the integrations, and the user journeys — that is itself Software Engineering. FLOSS developers in general appear to have built a lot of components and tools that would allow building those integrations, but until recently I would have said that there’s been no real progress in making it proper software engineering. Nowadays, I’m happy to see that there is some progress, even in something as simple as EditorConfig, to avoid having to fight over which editors to support in a repository, and which ones not to.

Hopefully this type of tooling is not going to be relegated to textbooks in the future, and we’ll get used to having a bunch of CASE tools in our toolbox, to make software… better.