unpaper: re-designing an interface

This is going to be part of a series of posts that will appear over the next few months with plans, but likely no progress, to move unpaper forward. I picked up unpaper many years ago, and I’ve run with it for a while, but besides a general “code health” pass over it, and some back and forth on how to parse images, I have not managed to move the project forward significantly at all. And in the spirit of what I wrote earlier, I would like to see someone else pick up the project. It’s the reason why I created an organization on GitHub to keep the repository in.

For those who are not aware, unpaper is a project I did not start — it was originally written by Jens Gulden, who I understand worked on its algorithms as part of his university work. It’s a tool that processes scanned images of text pages to make them easier to OCR, and it’s often used as a “processing step” for document processing tools, including my own.

While the tool works… acceptably well… it does have a number of issues that have always made me fairly uneasy. For instance, the command line flags are far from standard and can’t be implemented with a parser library, relying instead on a lot of custom parsing, and they require a very long and complicated man page.

There have also been a few requests to move the implementation to a shared library that could be used directly, but I don’t feel it’s worth the hassle: the current implementation is not really thread-safe, and making it so would be a significant rework.

So I have been giving this a bit of thought. The first problem is that re-designing the command line interface would mean breaking all of the programmatic users, so it’s not an easy decision to take. Then there’s something else I learnt about that made me realize I think I know how to solve this, although it’s not going to be easy.

If you’ve been working exclusively on Linux and Unix-like systems, and still shy away from learning about what Microsoft is doing (which, to me, is a mistake), you might have missed PowerShell and its structured objects. To over-simplify, PowerShell piping doesn’t just pipe text from one command to another, but structured objects that stay structured on both ends of the pipe.

While PowerShell is available for Linux nowadays, I do not think that tying unpaper to it is a sensible option, so I’m not even suggesting that. But I also found out that the ip command (from iproute2) has recently received a -j option which, instead of printing the usual complex mix of parsable and barely human-readable output, generates a JSON document with the same information. This makes it much easier to extract the information you need, particularly with a tool like jq available, which allows “massaging” the data on the command line easily. I have actually used this “trick” at work recently. It’s a very similar idea to RPC, but with a discrete binary.
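As a rough illustration of why structured output is so much nicer to consume, here is a quick sketch that reads the JSON output of ip using nothing more than the Python standard library; the exact keys depend on your iproute2 version:

```python
import json
import subprocess

# Ask iproute2 for machine-readable output instead of scraping text columns.
result = subprocess.run(
    ["ip", "-json", "addr", "show"],
    capture_output=True, check=True, text=True,
)

for interface in json.loads(result.stdout):
    addresses = [info["local"] for info in interface.get("addr_info", [])]
    print(interface["ifname"], addresses)
```

The equivalent jq one-liner is shorter still, which is exactly the point: once the output is structured, the consumer gets to pick its own tool.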

So with this knowledge in my head, I have a fairly clear idea of what I would like to have as an interface for a future unpaper.

First of all, it should be two separate command line tools — they may both be written in C, or the first one might be written in Python or any other language. The job of this language-flexible tool is to be the new unpaper command line executable. It should accept exactly the same command line interface as the current binary, but implement none of the algorithm or transformation logic.

The other tool should be written in C, because it should just contain all the current processing code. But instead of having to do complex parsing of the command line interface, it should instead read from standard input a JSON document providing all of the parameters for the “job”.

Similarly, some change is needed to the output of the programs. Some of the information, particularly debugging, that is currently printed on the stderr stream should stay exactly where it is, but for the standard output it makes significantly more sense to have the processing tool emit another JSON document, and have the interface convert it to human-readable form.

Now, with proper documentation of the JSON schema, software using unpaper as a processing step can just build its own job document, and skip the “human” interface. It would even make it much easier to write extensions in Python, Ruby, and any other language, as it would allow exposing a job configuration generator following the language’s own style.
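To make the split more concrete, here is a minimal sketch of what the “human interface” side could look like in Python. The unpaper-process binary name and the job document fields are entirely made up for illustration; the actual schema would need to be designed and documented first:

```python
import argparse
import json
import subprocess

def build_job(args):
    """Translate already-parsed command line options into a job document."""
    return {
        "version": 1,
        "input": {"path": args.input_file},
        "output": {"path": args.output_file},
        "filters": {"blackfilter": args.blackfilter},
    }

def run_job(job):
    # The processing tool reads the job document on stdin and replies with
    # another JSON document on stdout; stderr stays free for debug output.
    result = subprocess.run(
        ["unpaper-process"],
        input=json.dumps(job), capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

if __name__ == "__main__":
    # A real implementation would mimic the current CLI flag for flag.
    parser = argparse.ArgumentParser(prog="unpaper")
    parser.add_argument("input_file")
    parser.add_argument("output_file")
    parser.add_argument("--blackfilter", action="store_true")
    print(run_job(build_job(parser.parse_args())))
```

Any other front-end, whether a Python library or a document-processing pipeline, would skip the argument parsing entirely and hand over a job document of its own.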

Someone might wonder why I’m talking about JSON in particular — there are dozens of different structured data formats that could be used, including protocol buffers. As I said a number of months ago, the important part of a data format is its schema, so the choice of the actual format wouldn’t matter that much. But on the other hand, JSON is a very flexible format that has good implementations in most languages, including C (which is important, since the unpaper algorithms are implemented in C, and – short of moving to Rust – I don’t want to change language).

But there’s something even more important than the language support, which I already noted above: jq. This is an astounding work of engineering, making it so much easier to handle JSON documents, particularly inline between programs. And that is the killer reason to use JSON for me. Because that gives even more flexibility to an interface that, for the longest time, felt too rigid to me.

So if you’re interested in contributing to an open source project, with no particular timeline pressure, and you’re comfortable with writing C — please reach out, whether it is to ask questions for clarification, or with a pull request to implement this idea altogether.

And don’t forget, there’s still the Meson conversion project which I also discussed previously. For that one, some of the tasks are not even C projects! It needs someone to take the time to rewrite the man page in Sphinx, and also someone to rewrite the testing setup to be driven by Python, rather than the current mess of Automake and custom shell scripts.

NewsBlur Review

One of the very, very common refrains I hear in my circles, probably because my circles are full of its ex-users, and at the same time of Googlers and Xooglers, is that the Internet changed when Google Reader was shut down, and that we will never be able to go back. This is something that I don’t quite buy outright — Google Reader, like most similar tools, was used only by a subset of the general population, while other tools, such as social networks, started being widely used right around the same time.

But in all the moaning about Google Reader not existing anymore, I rarely hear enough willingness to look for alternatives. Sure, there was a huge amount of noise about options back then, which I previously called the “Google Reader Exodus”, but I rarely hear of much else. I see tweets going by of people wishing that Reader still existed, but I don’t think I have seen many willing to go out of their way to do something about it.

Important aside here: while I did work at Google when Reader was in effect shut down, the plan was announced in between me signing my contract and my start date. And obviously it was not something that was decided there and then, but rather a long-term decision taken who knows how long before. So while I was at Google for the “funeral”, I had no say in, or knowledge of, any of it.

Well, the good news is that NewsBlur, which I started using right before the Reader shutdown, is still my favourite tool for this, it’s open source, and it has a hosted service that costs a reasonable $36/year. And it doesn’t even have a referral program, so if you had any doubt of me shilling, you can vacate it now.

So first of all, NewsBlur has enough layout options to look very much like the Google Reader “of back then” — before Google+ and before the loss of the “Shared Stories” feature. Indeed, it supports both its own list of followers/following, and global sharing of stories on the platform. And you don’t even need to be a user to follow what I share on it, since it also automatically creates a blurblog, which you can subscribe to with whatever you want.

I have in the past used IFTTT to integrate further features, including saving stories to Pocket, and sharing stories on Twitter, Facebook, and LinkedIn. Unfortunately while NewsBlur has great integration, IFTTT is now a $4/month service, which does not have nearly enough features for me to consider subscribing to, sorry. So for now I’m talking about direct features only.

In addition to the sharing features, NewsBlur has what is for me one of its killer features: the “Intelligence Trainer”. This is not any kind of machine learning system, but rather a way for you to tell NewsBlur to hide, or highlight, certain content. It is very similar to a feature I would have wanted twelve years ago: filtering. Indeed, this allowed me to hide my own posts from Gentoo Universe – back when I was involved in the project – and to only read Matthew’s blog posts in one of the many Planets he’s syndicated on, like I wanted. But there’s much more to it.

I still use this to this day to hide repetitive posts (e.g. status updates for certain projects being aggregated together with blogs), to stop reading authors who didn’t interest me, or who wrote in languages I couldn’t read. But I also used the “highlighting” feature to know when a friend posted on another Planet, or to get information about new releases or tours from metal bands I followed, through some of the dedicated websites’ feeds.

But where this becomes extremely interesting is when you combine it with another feature that nowadays I couldn’t go without, particularly as so much content that used to be available as blogs, sites, and feeds is becoming newsletters: the ability to receive email newsletters and turn them into a feed. I do this for quite a few of them: the Adafruit Python for Microcontrollers newsletter (which admittedly is also available through their blog), the new ticket alerts from a bunch of different venues (admittedly not very useful this year), Tor.com, and Patreon.

And since the intelligence trainer does not need tags or authors to go on, but can match a substring in the title (subject), this makes it an awesome tool to filter out particular messages from a newsletter. For instance, while I do support a number of creators on Patreon, a few of them share all their public videos as updates — I don’t need to see those in the Patreon feed, as I get them directly at the source, so I can hide those particular series from the Patreon feed for myself. And instead, while I can wait for most of the Tor.com releases, I do want to know quickly if they are giving away a free book, or if there’s a new release from John Scalzi that I missed. And again, the highlighting helps me there: it makes a green counter appear next to the “feed”, which tells me there’s something I want to look at sooner, rather than later.

As I said, the intelligence trainer doesn’t have to use tags — but it can use them if they are there. So for instance, for this very blog, if I were to post something in Italian and you couldn’t read it, you could train NewsBlur to hide posts in Italian. Or if you think my opinions are useless, you can just hide those, too.

But this is not where it ends. Besides having an awesome implementation of HTTP, which supports all the bandwidth-saving optimizations I know of, NewsBlur thinks about the user a lot more than Google Reader ever did. Whenever you decide to do some spring cleaning of your subscriptions, NewsBlur will send you by email an OPML file with all of your subscribed feeds as they were before you made the first change (of the day, I think). That way you never risk deleting a subscription without having a way to find it again. And it supports muting sites, so you don’t need to unsubscribe to avoid a high count of unread posts from, say, a frequent flyers’ blog during a pandemic.

Plus it’s extremely tweakable and customizable — you can choose to see the stories as they appear in the feed, or load into a frame the original website linked by the story, or try to extract the story content from the linked site (the “reader mode”).

Overall, I can only suggest to those who keep complaining about Google Reader’s demise that it’s always a good time to join NewsBlur instead.

Software Defined Remote Control

A number of months ago I spoke about trying to control a number of TV features in Python. While I did manage to get some of the adapter boards that I thought I would use printed, I hadn’t had the time to work on the software to control them before we started looking for a new place, which meant I shelved the project until we could get to the new place, and once we got there it was a matter of getting settled down, and then, … you get the idea.

As it turns out, I had one week free at the end of November — my employer decided to give us three extra days off on the (US) Thanksgiving week, and since my birthday was at the end of the week, I decided to take the remaining two days off myself to make it a nice nine contiguous days off. A perfect timeframe to go and hack on some projects such as this.

Also, one thing changed significantly since the time I started thinking about this: I started using Home Assistant. And while it started mostly as a way for me to keep an eye on the temperature of the winter garden, I found that with a bit of configuration, and a pull request, changing the input on my receiver with it was actually easier than using the remote control and trying to remember which input was mapped to what.

That finally gave me the idea of how to implement my TV input switch tree: expose it as one or more media players in Home Assistant!

Bad (Hardware) Choices

Unfortunately, as soon as I went to start implementing the switching code, I found out that I had made a big mistake in my assumptions: the Adafruit FT232H breakout board does not support PWM outputs, including the general time-based pulsing (without a carrier frequency). Indeed, while the Blinka library can technically support some of the features, it seems like none of the Linux-running platforms can actually manage that. So there goes my option of just using a computer to drive the “fake remote” outputs directly — well, at least without rewriting it in some other language and finding a different way to send that kind of signal.

I looked around for a few more options, but every one of them ended up being some compromise: MicroPython doesn’t have a very usable PulseOut library as far as I could tell; Arduino-based libraries don’t seem to allow two outputs to happen at roughly the same time; and, as I’m sure I already noted in passing, CircuitPython lacks a good “secondary channel” to be instructed from a computer (the serial interface is shared with the REPL control, and the HID is gadget-to-host only).

After poking around a few options and very briefly considering writing my own C version on an ATmega, I decided to just go for the path of least resistance: go back to CircuitPython, and try to work with the serial interface and its “standard input” to the software.

The problem with doing that is that Ctrl-C is intended to interrupt the running program, which means you cannot send the byte 0x03 un-escaped. In the end I thought about it, and decided that CircuitPython is powerful enough that just sending the commands in ASCII wouldn’t be an issue. So I decided to write a simplistic Flask app that would take a request over HTTP and send the command via the serial port. It worked, sort of. Sometimes while debugging I would end up locking the device (a Trinket M0) in the REPL, and that meant the commands wouldn’t be sent.

The solution I came up with was to reset the board every time I started the app, by sending Ctrl-C and Ctrl-D (0x03, 0x04) to force the board to reset. It worked much better.
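I have not polished that app for publication, but a minimal sketch of the idea looks something like this; the serial device path, the command names, and the /command endpoint are placeholders for illustration, not the actual code:

```python
import serial  # pyserial
from flask import Flask, abort

app = Flask(__name__)

# The Trinket M0 shows up as a USB CDC serial device; the path is an assumption.
port = serial.Serial("/dev/ttyACM0", 115200, timeout=1)

# Kick the board out of a possibly-stuck REPL: Ctrl-C (0x03) interrupts whatever
# is running, Ctrl-D (0x04) reloads the CircuitPython program.
port.write(b"\x03\x04")

KNOWN_COMMANDS = {"hdmi1", "hdmi2", "hdmi3", "tv_power"}  # made-up names

@app.route("/command/<name>", methods=["POST"])
def send_command(name):
    if name not in KNOWN_COMMANDS:
        abort(404)
    # Plain ASCII plus a newline, so the CircuitPython side can just use input().
    port.write(name.encode("ascii") + b"\n")
    return {"sent": name}
```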

Not-Quite-Remote Controlled HDMI Switch

After that worked, the next problem was ensuring that the commands sent actually did what they were supposed to. The first component I needed to send commands to was the HDMI switch. It’s a no-brand AliExpress-special HDMI switch, and it has one very nice feature for what I need to do right now. It obviously has an infrared remote control – one of those thin, plasticky ones – but in particular it has the receiver for it on a cord, connected with a pretty much standard 3.5mm “audio jack”.

This is not uncommon. Randomly searching Amazon or AliExpress for “HDMI switch remote” finds a number of different, newer switches that use the same remote receiver, or something very similar to it. I’m not sure if the receivers are compatible with each other, but the whole idea is the same: by using a separate receiver, you can stick the HDMI switch behind a TV, for instance, and just have the receiver poke out from below. And most receivers appear to be just a dome-encased TSOP17xx receiver, a 3-pin IC, which maps neatly onto a TRS connector.

When trying this out, I found that I could use a Y-cable to allow both the original receiver and my board to send signals to the switch — at which point, I can send in my own pulses, without even bothering with the carrier frequency (refer to the previous post for details on this, it’s long). The way the signal is sent, the pulses need to ground the “signal” line (which usually sits at 5V); to avoid messing up the different supplies, I put an opto-coupler in between, since they are shockingly cheap when bought in bulk.

But when I tried setting this up with an input selection, I found myself unable to get the switch to see my signal. This turned out to require an annoying physical debugging session with the Saleae and my TRRS-to-Saleae adapter (which I have still not released, sorry folks!), which showed I was a bit off on the timing of the NEC protocol the switch uses for its remote control. This is now fixed in the pysirc library that generates the pulses.

Once I got the input selection working for the switch through the Flask app, I turned to Home Assistant and added a custom component that exposes the switch as a “media_player” platform. In a constant state of “Idle” (since it doesn’t have a concept of on or off), it allowed me and my wife to change the input while seeing the names of the devices, without hunting for the tiny remote, and without having to dance around to be seen by the receiver. It was already a huge improvement.
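For reference, the skeleton of such a custom component is fairly small. This is only a sketch of the idea, assuming a recent Home Assistant and a local HTTP endpoint like the Flask app above; the entity name, source list, and URL are placeholders rather than the actual integration:

```python
import requests

from homeassistant.components.media_player import (
    MediaPlayerEntity,
    MediaPlayerEntityFeature,
)
from homeassistant.const import STATE_IDLE

class HdmiSwitchMediaPlayer(MediaPlayerEntity):
    """Expose the dumb HDMI switch as a source-selectable media player."""

    _attr_name = "HDMI Switch"
    _attr_supported_features = MediaPlayerEntityFeature.SELECT_SOURCE
    _attr_source_list = ["Chromecast", "Portal", "Commodore 64"]  # placeholders

    @property
    def state(self):
        # The switch has no notion of on or off, so it is always idle.
        return STATE_IDLE

    def select_source(self, source):
        # Hand the actual work off to the local Flask app.
        requests.post(f"http://localhost:5000/command/{source}", timeout=5)
        self._attr_source = source
```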

But it wasn’t quite where I wanted it to be yet. In particular, when our family calls on Messenger, we would like to be able to just turn on the TV set to the right input. While this was partially possible (Google Assistant can turn on a TV with a Chromecast), and we could have tried wiring up the Nabu Casa integration to select the input of the HDMI switch, it would not have worked right if the last thing we used the TV for was the Nintendo Switch (not to be confused with the HDMI switch) or Kodi — those are connected via a Yamaha receiver, on a different input of the TV set!

Enter Sony

But again, this was supposed to be working — the adapter board included a connection for an infrared LED, and that should have worked to send out the Sony SIRC commands. Well, except it didn’t, and that turned out to be another wild goose chase.

First, I was afraid that when I fixed the NEC timing I broke the SIRC one — but no. To confirm this, and to make the rest of my integration easier, I took the Feather M4 to which I had hard-soldered a Sony-compatible IR LED, and wrote what is the eponymous software defined remote control: a CircuitPython program that includes a few useful commands, and abstractions, to control a Sony device. For… reasons, I have added VCR as the only option besides TV; if you happen to have a Blu-ray player by Sony, and you want to figure out which device ID it uses, please feel free.

It might sound silly, but I remember seeing a UX research paper from the ’90s about using gesture recognition on a touchpad to allow for more compact remote controls. Well, if you wanted, you could easily make this CircuitPython example into a touchscreen remote control for any Sony device, as long as you can find all the right device IDs, and hard-code a bunch of additional commands.
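The core of sending SIRC frames from CircuitPython is small enough to sketch here. This is not the actual program, just an illustration assuming a recent CircuitPython where pulseio.PulseOut takes the pin directly; the output pin is a placeholder, and the only command shown is the widely documented TV power toggle (device 1, command 21):

```python
import array
import time

import board
import pulseio

# 40 kHz carrier, as SIRC requires; the output pin is an assumption.
ir_out = pulseio.PulseOut(board.D5, frequency=40000, duty_cycle=2**15)

def sirc_12bit(device, command):
    """Build one 12-bit SIRC frame as mark/space durations in microseconds."""
    durations = [2400, 600]  # header mark and space
    bits = command | (device << 7)  # 7 command bits, then 5 device bits, LSB first
    for i in range(12):
        durations += [1200, 600] if (bits >> i) & 1 else [600, 600]
    return array.array("H", durations)

def send(device, command, repeats=3):
    frame = sirc_12bit(device, command)
    for _ in range(repeats):
        ir_out.send(frame)
        time.sleep(0.025)  # pad towards the nominal 45 ms frame period

send(device=1, command=21)  # TV power toggle
```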

So, once I knew that at least on the software side I was perfectly capable of controlling the Sony TV, I had to go and do more hardware debugging with the Saleae, but this time with the probes directly on the breadboard, as I had no TRS cable to connect to. And that was… a lot of work, to rewire stuff and try again.

The first problem was that the carrier frequency was totally off. The SIRC protocol specifies a 40kHz carrier frequency, which is supposedly easier to generate than the 38kHz used by NEC and others, but somehow the Saleae was recording a very variable frequency that oscillated between 37kHz and 41kHz. So I was afraid that trying to run two PWM outputs on the Trinket M0 was a bad idea, even if one of them was set to nought hertz — as I said, the HDMI switch didn’t need a carrier frequency.

I did toy briefly with the idea of generating the 40kHz carrier wave separately, and just gating it with the same type of signal I used for the HDMI switch. Supposedly, 40kHz generators are easy, but at least the circuits I found at first glance require a part (a 640kHz resonator) that is nearly impossible to find in 2020 — it probably fell out of use. But as it turns out, it wouldn’t have helped.

Instead, I took another Feather. Since I had run out of M4s, except for the one to which I had already hardwired an IR LED, I pulled up the nRF52840 that I had bought and barely played with. It should have been plenty capable of giving me a clean 40kHz signal, and indeed it was.

At that point I noticed another problem, though: I had totally screwed up the adapter board. On my Feather M4, the IR LED was connected directly between 3V and the transistor switching it. A bit out of spec, but not uncommon given that it’s flashed for very brief impulses. On the other hand, when I designed the adapter, I connected it to the 5V rail. Oops, that’s not what I meant to be doing! And I did indeed burn out the IR LED with it. So I had to solder a new one onto the cable.

Once I fixed that, I found myself hitting another issue: I could now turn the TV on and off with my app, but the switch stopped responding to commands, whether from the app or from the original remote! Another round with the Saleae (that’s probably one of my favourite tools — yes, I splurged when I bought it, but it’s turning out to be an awesome tool to have around, after all), and I found that the signal line was being held low — because the output pin is stuck high…

I have not tried debugging this further yet — I can probably reproduce this without my whole TV setup, so I should do that soonish. It seems like opening both lines for PWM output causes some conflict, and one or the other ends up not actually working. What I solved this with was only allowing one command before restarting the Feather. It means taking longer to complete the commands, but it allowed me to continue with my life without further pain.

One small note here: since I wasn’t sure how Flask concurrency would interact with accessing a serial port, I decided to try something a bit out of the ordinary, and set up the access to the Feather via an actor using pykka. It basically means leaving one thread with direct access to the serial port, and queueing commands as messages to it. It seems to be working fine.
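A sketch of that pattern, for those unfamiliar with pykka; the device path and the command format are, once again, placeholders:

```python
import pykka
import serial

class SerialActor(pykka.ThreadingActor):
    """A single actor, and thus a single thread, owning the serial port."""

    def __init__(self, device="/dev/ttyACM1"):
        super().__init__()
        self._port = serial.Serial(device, 115200, timeout=1)

    def on_receive(self, message):
        # Messages are queued by pykka, so writes never interleave.
        self._port.write(message["command"].encode("ascii") + b"\n")

# In the Flask app: start the actor once, then tell() it from any request handler.
actor_ref = SerialActor.start()
actor_ref.tell({"command": "tv_power_on"})
```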

Wrapping It All Up

Once the app was able to send arbitrary commands to the TV via infrared, as well as change the input of the HDMI switch, I extended the Home Assistant integration to include the TV as a “media_player” entity as well. The commands I implemented were Power On and Off (discrete, rather than a toggle, which means I can send a “Power On” to the TV when it’s already on without bothering it), and discrete source selection for the three sources we actually use (HDMI switch, Receiver, Commodore 64). There are a lot more commands I could theoretically send, including volume control, but I can already access those via the receiver, so there’s no good reason to.

After that, it was a matter of scripting some more complicated acts: direct selection of Portal, Chromecast, Kodi, and Nintendo Switch (which are the four things we use the most). This was easy at that point: turn on the TV (whether it was on or not), select the right input on either the receiver or the switch, then select the right input on the TV. The reason why the order seems a bit strange is that it takes a few seconds for the TV to accept commands after turning on, but by doing it this way we can switch between Chromecast and Portal, or Nintendo Switch and Kodi, in pretty much no time.

And after that worked, we decided the $5/month to Nabu Casa were worth it, because that allows us to ask Alexa or Google Assistant to select the input for us, too.

Eventually, this led me to replace Google’s “Turn off the TV” command in our nightly routine with a Home Assistant script, too. Previously, it would issue the command to the Chromecast, routing through the whole of Google’s cloud services between the device that took the request and the Chromecast. And then the Chromecast would send the CEC command to power off… except that it wouldn’t reach the receiver, which would stay on for another two hours until it finally decided it was time to turn off.

With the new setup, Google triggers the Home Assistant script, and appears to do that asynchronously. Then Home Assistant sends the request to my app, which sends it to the Feather, which sends the power off command to the TV… which is also read by the receiver. I didn’t even need to send the power off command to the receiver itself!

All in all, the setup is satisfying.

What remains to be done is to try exposing a “Media Player” to Google Home that is not actually any of the three “media_player” entities I have, but a composite of them. That way, I could actually just expose the different input trees as discrete inputs to Google, and include the whole play, pause, and volume control that is currently missing from the voice controls. But that can wait.

Instead, I should probably get going on designing a new board to replace the breadboard mess I’m using right now. It’s hidden away enough that it’s not in our face (unlike the Birch Books experiments), but I would still like to have a cleaner setup. And speaking of that, I would really love it if someone had already contributed an Adafruit Feather component for EAGLE, providing the space for soldering in the headers, but keeping the design referencing the actual lines as defined in it.

Senior Engineering: Open The Door, Move Away

As part of my change of bubble this year, I officially gained the title of “Senior” Engineer. Which made me take the whole “seniority” aspect of the job more seriously than I did before. Not because I’m aiming at running up the ladder of seniority, but because I feel it’s part of the due diligence of my job.

I have had very good examples in front of me for most of my career — and a few not great ones, if I am to be honest. And so I’ve been trying to formulate my own take on what a senior engineer is, based on these. You may have noticed me talking about adjacent topics in my “work philosophy” tag. I have also been comparing this in my head with my contributions to Free Software, and in particular to Gentoo Linux.

I retired from Gentoo Linux a few years ago, but realistically, I stopped being actively involved in 2013, after joining the previous bubble. Part of it was a problem with contributing, part of it was a lack of time, and part of it was the feeling that something was off. I’m starting to feel I have a better impression now of what it was, and it relates to the seniority that I’m reflecting on.

You see, I worked on Gentoo Linux a little longer than I worked at the previous bubble, and as such I could say that I became a “senior developer” by tenure, but I didn’t really gain the insight to become a “senior developer” in deeds. This haunts me, because I feel it was a wasted opportunity, even though it taught me many of the things I needed to be even barely successful in my current job.

It Begins Early

My best guess is that it comes down to me starting to work on Gentoo Linux when I was fairly young and with pretty much no social experience. That, combined with the less-than-perfect work environment of the project, had me develop a number of bad habits that took a very long time to grow out of. That is not to say that age by itself is a significant factor in this — I still resent the remark from one of the other developers that not having kids would make me a worse lead. But I do think that if I hadn’t grown up to stay by myself in my own world, maybe I would have been able to do a better job.

I know people my age and younger who became very effective leaders years ago — they’ve got the charisma and the energy to get people on board, and to have them all work for a common goal in their own way. I don’t feel like I ever managed that, and I think it’s because for the longest time, the only person I had to convince to do something was… myself.

I grew up quite lonely — in elementary school, while I can say I did have one friend, I didn’t really join the other kids. It’s a bit of a stereotype for the lonely geek, but I have been made fun of since early on for my passion for computers, and for my dislike of soccer – I feel a psychiatrist would have a field day figuring out that and the relationship with my father – and I failed at going to church and Sunday school, which was the only out-of-school mingling for most of the folks around.

Nearly thirty years later I can tell you that the individualism I got out of this, while having given me a few headstarts in life when it comes to technical knowledge, held me back long term on the people skills needed to herd the cats and multiply my impact. It’s not by chance that I wrote about teamwork and, without using the word, individualism.

Aside: I’m Jealous of Kids These Days

As an unrelated aside, this may be the reason why I don’t have such a negative view of social networks in general. It was something I was actually asked when I switched jobs: what my impression of the current situation is… and my point rolls back to that. When I was growing up we didn’t have social networks, the Internet was a luxury, and while, I guess, BBSes were already a thing, they would still have been too expensive for me to access. So it took me until I managed to get an Internet connection to discover Usenet.

I know there’s a long list of issues with all kinds of social networks: privacy, polarisation, fake news, … But at the same time I’m glad that they make it much more approachable for kids nowadays, who don’t fit in with the crowd in their geographical proximity, to reach out to friendlier bunches. Of course it’s a double-edged sword, as it also allows bullies to bully more effectively… but I think that’s much more of a society-at-large problem.

The Environment Matters

Whether we’re talking about FLOSS projects or different teams at work, the environment around an individual matters. That’s because the people around them will exert influence, both positive and negative. In my case, with hindsight, I feel I hung around the wrong folks too long, in Gentoo Linux, and later on.

While a number of people I met on the project have exerted, again with hindsight, a good, positive influence in my way of approaching the world, I also can tell you now that there’s some “go-to behaviours” that go the wrong way. In particular, while I’ve always tended to be sarcastic and an iconoclast, I can tell you that in my tenure as a Gentoo Linux developer I crossed the line from “snarky” to “nasty” a lot of times.

And having learnt to avoid that, and keeping in check how close to that line I get, I also know that it is something connected to the environment around me. In my previous bubble, I once begged my director to let me change team despite having spent less than the two years I was expected to be on it. The reason? I caught myself becoming more and more snarky, getting close to that line. It wouldn’t have served either me or the company for me to stay in that environment.

Was it a problem with the team as a whole? Maybe, or maybe I just couldn’t fit into it. Or maybe it was a single individual who fouled the mood for many others. Donnie’s talk does not apply only to FLOSS projects, and The No Asshole Rule is still as relevant a book as ever in 2020. Just like in certain projects, I have seen teams in which the majority of the engineers explicitly walked away from certain areas, just to avoid having to deal with one or two people.

Another emergent behaviour of this is the “chosen intermediate person” — a dysfunction I have seen in multiple projects and teams, where a limited subset of team members are used to “relate” to another individual either within, or outside, the team. I have been that individual in the first year of high school, with the chemistry teacher — we complained loudly about her being a bad teacher, but now I can say that she was probably a bigger expert in her field than most of the other chemistry teachers in the school; she was just terrible with people. Since I was just as bad, it seemed like I was the best interface with her, and when the class needed her approval to go on a field trip, I was “volunteered” to be the person going.

I’ll get back later to a few more reasons why tolerating “brilliant but difficult to work with” people in a project or team is further unhealthy, but I want to make a few more points here, because this can be a contentious topic due to cultural differences. I have worked with a number of engineers in the past who would be described as assholes by some, and grumpy by others.

In general, I think it’s worth giving people the benefit of the doubt, at first — but make sure that they are aware of it! Holding people to standards they are not aware of, and have no way to course-correct around, is not fair and will stir up further trouble. And while some level of civility can be assumed, in my experience projects and teams that are heavily anglophone tend to assume a lot more commonality in expectations than is fair.

Stop Having Heroes

One of the widely known shorthands at the old bubble was “no heroes” — a reference to a slide deck from one of the senior engineers in my org on the importance of not relying on “heroes” looking after a service, a job, or a process: individuals who will step in at any time of day or night to solve an issue, and demonstrate how indispensable they are for the service to run. The talk is significantly more nuanced than my summary right now, so take my words with a grain of salt, of course.

While the talk is good, I have noticed a little too often the shorthand being used to just tell people to stop doing what they think is the right thing, leaving rakes all around the place. So I have some additional nuances of my own, starting with the fact that I find it a very bad sign when a manager uses the shorthand with their own reports — that’s because one of my managers did exactly that, and I know that it doesn’t help. Calling out the “no heroes” practice between engineers is generally fair game, and if you call it out on your own contributions, that’s awesome, too! «This is the last time I’m fixing this; if nobody else prioritizes this, no heroes!»

On the other hand, when it’s my manager telling me to stop doing something and “let it break”, well… how does that help anyone? Yes, it’s in the best interest of the engineer (and possibly the company) for them not to be the hero who steps in, but why is this happening? Is the team relying on this heroism? Is the company relying on it? What’s the long-term plan to deal with that? Those are all questions the manager should at least ask, rather than just telling the engineer to stop doing what they are doing!

I’ve been “the hero” a few times, both at work and in Gentoo Linux. It’s something I have always been ambivalent about. On one side, it feels good to be able to go and fix stuff yourself. On the other hand, it’s exhausting to feel like the one person holding up the whole fort. So yes, I totally agree that we shouldn’t have heroes holding up the fort. But since it still happens, it can’t be left just up to an individual to remember to step back at the right moment to avoid becoming a hero.

In Gentoo Linux, I feel the reason why we ended up with so many heroes was the lack of coordination between teams, and the lack of general integration — the individualism all over again. And it reminds me of a post from a former colleague about Debian, because some of the issues (very little mandated common process, too many different ways to do the same things) are the kind of “me before team” approaches that drive me up the wall, honestly.

As for my previous bubble, I think the answer I’m going to give is that the performance review process as I remember it (hopefully it changed in the meantime) should be held responsible for most of it, because of just a few words: go-to person. When looking at performance review as a checklist (which you’re told not to, but clearly a lot of people do), at least for my role, many of the levels included “being the go-to person”. Not a go-to person. Not a “subject matter expert” (which seems to be the preferred wording in my current bubble). But the go-to person.

From being the go-to person, to being the hero, to building up a cult of personality, the steps are not that far apart. And this is true in the workplace as well as in FLOSS projects — just think, and you can probably figure out a few projects that became synonymous with their maintainers, or authors.

Get Out of The Way

What I feel Gentoo Linux taught me, and in particular leaving Gentoo Linux taught me, is that the correct thing for a senior engineer to do is to know when to bow out. Or move on to a different project. Or maybe it’s not Gentoo Linux that taught me that.

But in general, I still think the most important lesson is knowing how to open the door and get out of the way. And I mean it: both parts are needed. It’s not just a matter of moving on when you feel like you’ve done your part — you need to be able to also open the door (and make sure it stays open) for the others to pass through it. That means planning to get out of the way, not just disappearing.

This is something that I didn’t really do well when I left Gentoo Linux. While I eventually did get out of the way, I didn’t really fully open the door. I started, and I’m proud of that, but I think I should have done it better. The blog posts documenting how the Tinderbox worked, as well as the notes I left about things like the USE-based Ruby interpreter selection, seem to have been useful for others picking up where I left off… but not in a very seamless way.

I think I did this better when I left the previous bubble, by making sure all of the stuff I was working on had breadcrumbs for the next person to pick up. I have to say it did make me feel warm inside to receive a tweet, months after leaving, from a colleague announcing that the long-running deprecation project I had worked on was finally completed.

It’s not an easy task. I know a number of senior engineers who can’t give up their one project — I’ve been that person before, although, as I said, I hadn’t really considered myself a “senior” engineer before. Part of it is wanting to keep the project working exactly like I want it to, and part of it is feeling attached to the project and wanting to be the person grabbing the praise for it. But I have been letting go of these as much as I could in the past few years.

Indeed, while some projects thrive under benevolent dictators for life, teams at work don’t tend to work quite as well that way. Those dictators become gatekeepers, and the projects can end up stagnating. Why does this happen more at work than in FLOSS? I can only venture a guess: FLOSS is a matter of personal pride — and you can “show off” having worked on someone else’s project at any time, even though it might be more interesting to “fully make the project one’s own”. On the other hand, if you’re working at a big company, you may optimise for working on projects where you can “own the impact” by the time you bring it up at performance review.

The Loadbearing Engineer

When senior engineers don’t move away after opening the door, they may become “loadbearing” — they may be the only person knowing how something works. Maybe not willingly, but someone will go “I don’t know, ask $them” whenever a question about a specific system comes by.

There’s also the risk that they may want to become loadbearing, to become irreplaceable, to build up job security. They may decide not to document the way a certain process runs, the reasons why certain decisions were made, or the requirements of certain interfaces. If you happen to want to do something without involving them, they’ll be waiting for you to fail, or maybe they’ll manage to stop you from breaking an important assumption in the system at the last moment. This is clearly unhealthy for the company, or project, and risky for the person involved, if they turn out not to be quite as indispensable.

There’s plenty already written on the topic of the bus factor, which is what this fits into. My personal take on this is to make sure that those who become “loadbearing engineers” take at least one long vacation a year. Make sure that they are unreachable unless something goes very wrong — as in, business-destroying wrong. And make sure that they don’t just mark themselves out of office while staying glued to their work phone and computer. And yes, I’m talking about what I did to myself a couple of times over my tenure at the previous bubble.

That is, more or less, what I did by leaving Gentoo as well — I had been holding the QA fort so long that it was a given that, no matter what was wrong, Flameeyes was there to save the day. But no, eventually I wasn’t, and someone else had to go and build a better, scalable alternative.

Some of This Applies to Projects, Too

I don’t mean it as “some of the issues with engineers apply to developers”. That’s a given. I mean that some of the problems happen to apply to the projects themselves.

Projects can become the de-facto sole choice for something, leaving every improvement behind because nobody can approach them. But if something happens, and they are not updated further, it might just give enough of a push that they get replaced. This has happened to many FLOSS projects in the past, and it’s usually a symptom of a mostly healthy ecosystem.

We have seen how XFree86 becoming stale led to Xorg being fired up, which in turn brought us a significant number of improvements, from the splitting apart of the big monolith, to XCB, to compositors, to Wayland. Apache OpenOffice has been pretty much untouched for a long time, but that gave us LibreOffice. GCC having refused plugins for long enough put more wood behind Clang.

I know that not everybody would agree that the hardest problems in software engineering are people problems, but I honestly have that feeling at this point.

Computer-Aided Software Engineering

Fourteen years ago, fresh off translating Ian Sommerville’s Software Engineering (no, don’t buy it, I don’t find it worth it), and approaching the FLOSS community for the first time, I wrote a long article for the Italian edition of Linux Journal on Computer-Aided Software Engineering (CASE) tools. Recently, I decided to post that article on the blog, since the original publisher is gone, and I thought it would be useful to just have it around. And because the OCR is not really reliable, I ended up having to retype a good chunk of it.

And that reminded me of how, despite me having been wrong plenty of times before, some ideas stuck with me and I still find them valid. CASE is one of those, even though a lot of the time we don’t really talk about the tools involved as CASE.

UML is the usual example of a CASE tool — it confuses a lot of people because the “language” part suggests it’s actually used to write programs, but that’s not what it is for: it is a way to represent similar concepts in similar ways, without having to re-explain the same iconography. Sequence diagrams, component diagrams, and entity-relationship diagrams standardise the way you express certain relationships and information. That’s what it is all about — and while you could draw all of those diagrams without any specific tool, with LibreOffice Draw, Inkscape, or Visio, specific tools for UML are meant to help (aid) you with the task.

My personal preferred tool for UML is Visual Paradigm, which is a closed-source, proprietary solution — I have not found a good open source toolkit that could replace it. PlantUML is an interesting option, but it doesn’t have nearly all the aid that I would expect from an actual UML CASE tool — you can’t properly express relationships between different components across diagrams, as you don’t have a library of components and models.

But setting UML aside, there’s a lot more that should fit the CASE definition. Tools for code validation and review, which are some of my favourite things ever, are also aids to software engineering. And so are linters, formatters, and sanitizers. It’s easy to just call them “dev tools”, but I would argue that, particularly when it comes to automating code workflows, it makes sense to consider them CASE tools, and reduce the stigma attached to the concept of CASE, particularly in the more “trendy” startups and open source — where I still feel pushback against using UML, auto-formatters, and integrated development environments.

Indeed, most of these tools are already considered their own category: “developer productivity”. Which is not wrong, but it does significantly reduce the impact they have — it’s not just about developers, or coders. I like to say that Software Engineering is a teamwork practice, and not everybody on a Software Engineering team will be a coder — or a software engineer, even.

A proper repository of documents, kept up to date with the implementation, is not just useful for the developers who come later and need to implement features that integrate with the existing system. It’s useful for the SRE/Ops folks who are debugging something on fire, and are looking at the interaction between different components. It’s useful to the customer support folks who are being asked why only a particular type of request is failing in one of the backends. It’s useful to the product managers, to make clear which use cases are implemented in the service, and which components are involved in specific user journeys.

And it similarly extends to other types of tools — a code review tool that can enforce updates to the documentation; a dependency tracking system that can match known vulnerabilities; a documentation repository that allows full reviews; an issue tracker that can identify who most recently changed the code affecting the component an issue was filed on.

And from here you can see why I’m sceptical about single-issue tools being “good enough”. Without integration, these tools are only as useful as the time they save, and often that means they are “negatively useful” — it takes time to set up the tools, to remember to run them, and to address their concerns. Integrated tools instead can provide additional benefits that go beyond their immediate features.

Take a linter as an example: a good linter with low false positive rate is a great tool to make sure your code is well written. But if you have to manually run it, it’s likely that, in a collaborative project, only a few people will be running it after each change, slowing them down, while not making much of a difference for everyone else. It gets easier if the linter is integrated in the editor (or IDE), and even easier if it’s also integrated as part of code review – so those who are not using the same editor can still be advised by it – and it’s much better if it’s integrated with something like pre-commit to make it so the issues are fixed before the review is sent out.

And looking at all these pieces together, the integrations, and the user journeys — that is itself Software Engineering. FLOSS developers in general appear to have built a lot of components and tools that would allow building those integrations, but until recently I would have said that there had been no real progress in making it proper software engineering. Nowadays, I’m happy to see that there is some progress, even something as simple as EditorConfig, to avoid having to fight over which editors to support in a repository, and which ones not to.

Hopefully this type of tooling is not going to be relegated to textbooks in the future, and we’ll get used to having a bunch of CASE tools in our toolbox, to make software… better.

FastMail 9 Months Review

You may remember that earlier this year, for reasons weakly linked to my change of employer, I went looking to change my email provider. After many years using Gmail via what is now called Google Workspace, the complete disconnection from consumer features, including Google One, got to me, and I looked for an alternative.

At the time I considered both ProtonMail and FastMail as valid options — the former having been touted all over the place for its security and privacy features. After trying them both, and originally choosing the former, I found out that it would just not scale for the amount of email I had in Gmail, and instead switched over to FastMail, which seemed to fit my needs much better.

Now it’s over nine months later, and I thought it’s a good time to check in on my choice, and provide an update for the next person looking to choose an email provider. But before I get into the details, I think it’s also important to qualify that this year has been anything but “business as usual” — not just because of the lockdown, but also because my wife and I ended up moving to a new apartment, and, oh yeah, I changed jobs. This means that my email usage pattern changed considerably, and that changed which features and values of FastMail I stressed.

So let’s start with the elephant in the room, that is, offline availability — one of the things I noted early on is that the default Android app of FastMail is not much more than a WebView on a portable web app. That might not be technically precise, but I still feel it’s a good description of it. It does have some offline features, but you can’t “sync offline” a whole folder of email to have access to it without a network connection. This would have probably been a much bigger problem, if not a deal-breaker… if we had been traveling, or if I had been commuting two hours a day every day. But given the lockdown, this part is not really affecting me, nearly at all.

I guess that if this is a significant enough feature for you, using a “proper” client on the phone would be a good option — and that might have been something I’d have tried in different circumstances. As far as I can tell, FastMail does not implement OAuth2-based IMAP login flows, so you still need application-specific passwords for this to work, which I’m not really fond of — if they did, and they also supported U2F from phones (why it is not supported is not clear to me), that would be a total non-issue. As it is, it’s very low in my priorities, though, so I’m not complaining.

Let’s segue from that into security — as I said, I’m still not sure why U2F is not supported on phones, and why I had to enable SMS 2FA to log in on my Android phone. On the other hand, it works fine on computers, so there’s that going for us. FastMail is also the first provider that I have seen take application-specific passwords to the letter: when you create the password you can decide which scopes to attach to it, so if you want a send-only password, you can have one. And that makes life easier, and not just for developers needing to send kernel patches out.

So there’s space for improvement (OAuth2 and U2F, folks?), but otherwise I’m feeling confident that FastMail knows what they are doing and are taking compromises intentionally, rather than by coincidence.

On the Web UI side, things may be a bit bumpy — the UI feels “less fresh” than Gmail or Inbox, but that’s not entirely bad: it’s definitely lighter to use, and it makes searches work nicely. I think what I am missing from it is a more “native feeling” — I miss the contextual menu to archive or delete threads. The drag-and-drop works great, though, so there’s that about it. Some of the choices of where to find actions (such as report spam/report non-spam) are a bit confusing, but otherwise it’s all very serviceable, and very customizable, too! The (optional) Gravatar integration is also fairly handy when talking with people in the FLOSS community, although I wish more services decided to provide a “service identity”.

If anything, there are two features I really miss from Gmail. The first is something that FastMail appears to have tried, and then given up on (for now): the portable web app mode on desktop. For a while I had a FastMail window rather than a tab, and that was easier for me to handle. Having notifications running in the background without needing to have the tab open in my browser just makes my flow easier (which is why I do that with Twitter, Home Assistant, and Google Maps among others), and FastMail would be an obvious option there.

The second feature is the composer “pop up”. When having to deal with the horrible old property management company for the flat we lived in, I often had to go and refer back to previous threads, because they rarely followed up in writing, and the reference person changed all the time. I ended up having to keep a second (or sometimes third) window open with the various searches, while composing a long message in the first one.

But otherwise? I haven’t been missing Gmail at all, despite the significantly higher amount of email I had to read and write due to the work and flat changes. Filters work fine, and the self-deleting folders (or labels, you can choose which version you want!) are an awesome tool to keep the sheer amount of email under control.

Okay, sure, the “Smart Mail” features that I never had would have been nice at times — filling in the calendar with events just by receiving them! But as it turns out, GSuite/Google Workspace never got that feature for its users; it has long been a consumers-only feature, and so I never got into the habit of relying on it. And even when I had access to it, the fact that it required explicit enrolment of the providers by Google meant that only those services that had enough critical mass, or were used by a Googler, would be covered. It would be so much better if instead there was better support for providers to attach, say, ICS files when they send you time-based information, such as expected deliveries (for groceries, online shopping), or appointments (for hospitals and so on — although this is unlikely to happen: while the NHS is very happy to remind me of all appointments in full by SMS, its email messages only contain a “Please click here and put in some obvious information about you to see the medical letter with the appointment”, alas).

So, I clearly am a happy customer; even more so now that I have experienced Outlook 365 (I can’t make heads or tails of that interface!) And I would say that, if you’re looking for an email-only provider, FastMail is the best option I have seen. Depending on what you’re looking for, Google Workspace might still be better value for money (with the whole collaboration suite), but I have a couple of former customers that I would probably have signed up on FastMail rather than Google, if they asked right now.

To close up, yes, there’s a referral programme nowadays — if you’re interested, this is my affiliate link. You’re welcome to use it, but it’s not the reason why I’m suggesting to use FastMail. I’m suggesting it because it is a terrific service.

You can’t program the world to suit you

Last year, I was drafting more notes regarding the Free Software for SMB idea that I talked about before. While doing so, I recognized that one of the biggest takeaways for me is that successfully making a software project thrive takes a lot more than just good programmers, developers, and designers. If you have customers you need people who know how to make a business work, you need people who can market your product, and you need people to remind you what the customers actually want, as well as what they need.

It’s not an entirely new lesson — I wrote (in Italian) about helping Free Software without being a programmer fifteen years ago. I also wrote about the importance of teamwork two years ago. And I have spent a good chunk of my opensource and professional careers knee-deep in documentation matters.

I actually want to go back to the tweet that spawned the teamwork post:

Most things don’t work the way I think they work. That’s why I’m a programmer, so I can make them work the way I think they should work.

This is not meant to single out the author of the quoted phrase, but just to take it as an example of a feeling I get from many talks and discussions, and in general from people out there. The idea that you can tech your way out of a problem. That by being a programmer you can change the way most things work.

And that’s not true, because the world is not running on servers, unless you found the Repository and I don’t know that. Indeed, wielding “the power of programming” and thinking of changing the world just with that sounds to me like a recipe for either failure or disaster.

I heard all kinds of possible “solutions” to this — from insisting on teaching ethics in Software Engineering courses (with reasonable doubts about it), to regulating the heck out of any action businesses can take. I think the closest I have seen to something I would like (with all my biases, of course) would be making sure there is a mix of non-programming subjects in every university or high school that teaches programming. But even that has its own limitations, and I can totally see that, at that age, I would probably have been frustrated by it and just ignored everything not programming-related.

To make the example of Italy, which is under political turmoil most of the time, I could see a number of critiques of (in my opinion horrible) politicians based on where they went to school. In particular I saw some left-wing intellectuals criticising ministers (who have enough to be criticised about in their deeds) based on the fact that they didn’t study in a lyceum but rather at a technical (or professional) school. Well, it turns out I studied at a tech school, where I studied basic economics and (very basic) civic education for two years, and I found out the hard way that I know how VAT works much better than most of my local acquaintances who got a university degree after a lyceum: they were never introduced to the concept of VAT, the difference between types of taxes, and so on.

You could argue that there is no reason to know this particular tidbit, which is where I’m actually going to end up: there is no perfect education, the same way as there is no perfect solution. People need to learn to work with each other, and they should know how to play to each other’s strengths instead.

What I really would like to see proposed more often is a much stronger focus on teamwork. And not in the sense of “Here’s a topic for research, now work on it with your team”, which I had to do in high school — many of us have had the experience of being the only person actually working on a group assignment. What I would have loved would be cross-school, year-long projects. Not competitions, but rather something that requires more than one type of expertise: putting three programming students in a room to work together, in my experience, turns into either two of them slacking off because one of them actually enjoys doing the work, or, if you’re lucky, someone with actual leadership skills telling the others how to do their job… but it still gives the impression that you just need programmers to do something like that.

In hindsight, I would have loved instead to have a project shared with some of my colleagues from the electronics, mechanical and business tech schools. Come up with a solution to a problem that requires hardware and software, and a product plan that includes optimising the bill of materials for small-batch production while still making a profit.

Sounds complicated? It is. Having had my own company, alone, for four years made it very clear that there is a lot more to succeeding than just being a programmer. If you want to change the world, and in particular if you want to make the world a better place, then it takes even more energy, and a bigger group of people who can work together.

It also takes leadership. And that’s not something I feel can be taught, and it’s the thing that makes the most difference in whether the change is for good or not. I’m not good at leading people. I most likely don’t have the right mindset. I have trouble rallying people towards a common goal. I know that. I just hope that at some point, when I’m looking for more meaning in my work, I’ll find the right leader who can take what I can add to a good team, and let me shine through that.

I know I’m going to be repeating myself, but that is also what I mean by “there is no perfect solution”. If we decided that leadership is something important to score people on, whether with school results or with performance reviews at work, then we would pretty much be excluding a significant part of the population: not everyone wants to be a leader, and are people who don’t want to be leaders worth less to society? Hint: this is not far from the question of how many multiples of a line worker’s pay a CEO should be worth.

And if you need a proper example of how “tech will not solve it”, just look at 2020 in general: tech is not really solving the Covid-19 world crisis. It does help, of course: video presence, social networks and chat services (including my employer’s), online “tabletop” games, shared document infrastructure, online shopping, and so on… they all allowed people, isolating or not, to feel closer together. But it did not solve the problem. Even if we include medical sciences as “tech”, they still have not managed to find a way to deal with the crisis, because the crisis is not just medical.

People don’t ignore the lockdown requirements because they don’t have enough tech: it’s because there are other things in this world! It’s one thing to talk to my mother on the big screen of a Portal, and another thing to spend a week at her house — including the fact that I can’t fix her house’s wiring while physically in another country. And then there is the big elephant in the room: the economy — tech can’t solve that problem. People working in industries that had to shut down because of the lockdown can’t just be “teched” into new roles; they can’t magically be vaccinated overnight; they need political leaders to make tough decisions around supporting them.

So no, you can’t program the world to suit your needs. Great for you if you have more tools in your toolbox – and there’s a lot more use for even basic programming literacy that has nothing to do with working as a programmer – but that doesn’t make you super-human, nor does it allow you to ignore what’s going on in the world. If “being a programmer” provides a superiority complex, I feel it’s more related to the fact that we’ve been well paid for a number of years now, and money makes the difference.

But that’s a topic for an entirely new rant, later on.

Home Assistant and CGG1 Sensors

You may or may not know that I’m one of the few tech people who don’t seem to scream against IoT, and I actually have quite a few “smart home” devices in my apartment — funnily enough, one less now that we don’t have normal heating at home and had to give up the Nest Thermostat. One of the things I had not set up until now was Home Assistant, simply because I didn’t really need anything from it. But when we moved to this flat, I felt the need for some monitoring of humidity and temperature in the “winter garden” (a part of the flat that is next to the outside windows and is not heated), since we moved up in floors and no longer have the shelter of being next to another building.

After talking with my friend Srdjan, who had experience with this, I settled on ordering Xiaomi-compatible “ClearGrass” CGG1 sensors from AliExpress. These are BLE sensors, which means they can broadcast their readings without requiring active connections, and they have a nice eInk display, which wastes very little power to keep running. There are other similar sensors, some cheaper in the short run, but these sounded like the right compromise: more expensive to buy, but cheaper to run (the batteries are supposed to last six months, and cost less than 50p).

Unfortunately, getting them to work turned out to be a bit more complicated than either of us planned at the beginning. The sensors arrived on a day off, less than two weeks after ordering (AliExpress is actually fairly reliable when ordering to London), and I spent pretty much the whole day, together with Srdjan over a video call, getting them to work. To save this kind of pain for the next person who comes across these issues, I decided to write this up.

Hardware Setup Advice

Before we dig into the actual configuration and setup, let me point out a few things about setting up these devices. The most annoying part is that the batteries are, obviously, isolated to avoid them running out during shipping, but the “pull out tab” doesn’t actually pull out. It took quite a bit of force to turn the battery compartment door, open it up, and then re-seat it. Be bold in taking them out.

The other piece of advice is not from me but from Srdjan: write the MAC address (or whatever it’s called in Bluetooth land) on each of the sensors. If you only have one sensor, it’s very easy to tell which one it is, but if you bought more (say, four, like I did), then you may have trouble identifying which one is which later. So it’s easier to do it while you’re setting them up, by turning them on one at a time.

To tell what the address is, you can either use an app like Beacon Simulator and listen for the broadcast, or you can wait until you get to a point later in the process where you’re ready to listen to the broadcasts anyway. I would recommend the former, particularly if you live in a block of flats. Not only is there a lot of stuff broadcasting BLE beacons, but nowadays pretty much every phone is broadcasting them as well due to the Covid-19 Exposure Notifications.
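
If you would rather do this from a laptop than from a phone app, a few lines of Python work too. This is just a sketch, assuming a working Bluetooth adapter and the third-party bleak library: turn the sensors on one at a time and note which new address shows up.

import asyncio

from bleak import BleakScanner

async def main():
    # Scan for about ten seconds and print every advertising device in range.
    for device in await BleakScanner.discover(timeout=10.0):
        print(device.address, device.name)

asyncio.run(main())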

And to make sure that you don’t mix them up, I definitely suggest using a label maker — and if you are interested in the topic, make sure to check out the Museum of Curiosity Series 13, Episode 6.

Finally, here’s a bit of a spoiler of where this whole process is going to end up: I’m going to explicitly suggest you avoid using USB Bluetooth dongles, and instead get yourself an ESP32 kit of some kind. ESP32 devkits are less than £10 on Amazon at the time of writing, and you can find them even cheaper on AliExpress — and they will be much more useful, as I’ll explain.

Home Assistant Direct Integration

So first of all, we spent a lot of time mucking around with the actual Home Assistant integrations. There’s a mitemp_bt sensor integration in Home Assistant, but it’s an “active” Bluetooth implementation, which is (supposedly) more power hungry and requires associating the sensors with the host running Home Assistant. There’s also an alternative implementation that is supposed to use passive BLE scans.

Unfortunately, even trying to install the alternative implementation turned out to be annoying and difficult — the official instructions appear to expect you to install another “store-like” interface on top of Home Assistant, which appears not to be that easy when you use their “virtual appliance” image in the first place. I ended up hacking it up a bit, but got absolutely nothing out of it: there isn’t enough logging to know what’s going on at any time, and I couldn’t tell if any packet was even received and parsed.

There is also a clear divergence between the Home Assistant guidelines on how to build new integrations and the way the alternative implementation is written — one of the guides (which I can’t now find easily, and that might speak to the reason for this divergence) explicitly suggests not writing complex parsing logic in the integration, and instead building an external Python library to implement protocols and parsers. This is particularly useful when you want to test something outside of Home Assistant, to confirm it works first.

In this case, having a library (and maybe a command line tool for testing) would have made it easier to figure out if the problem with the sensors was that nothing was received, or that something was wrong with the received data.

This was made more annoying by the fact that, for this to work, you need a working Bluetooth adapter connected to your Home Assistant host — which in my case is a virtual machine. And the alternative implementation tells you that it might interfere with other Bluetooth integrations, so you’re advised to keep multiple Bluetooth interfaces, one for each of the integrations.

Now this shouldn’t be too hard, but it is: the cheapest Bluetooth dongles I found on Amazon are based on Realtek chipsets, which, while supported by (recent enough) Linux kernels, need firmware files. Indeed the one dongle I got requires Linux 5.8 or later, or it requests the wrong firmware file altogether. And there’s no way to install firmware files in the Home Assistant virtual appliance. I have tried, quite a few times by now.

ESPHome Saves The Day

ESPHome is a project implementing firmware (or firmware building blocks, rather) for ESP8266 and ESP32 boards and devices, which integrates fairly easily with Home Assistant. And since the ESP32 supports BLE, ESPHome supports Xiaomi-compatible BLE sensors such as the CGG1. So the suggestion from Srdjan, which is what he’s been doing himself, is to basically use an ESP32 board as a BLE-to-WiFi bridge.

This was easy because I had a bunch of ESP32 boards in a box from my previous experiments with acrylic lamps, but as I said they are also the same price as, if not cheaper than, Bluetooth dongles. The one I’m using is smaller than most breadboard-compatible ESP32 boards and nearly square — it was a cheap option at the time, but I can’t seem to find one available to buy now. The size works out well, because it also doesn’t have any pin headers soldered on, so I’m just going to double-side-tape it to something and give it a USB power cable.

But it couldn’t be as easy as just following the documentation, unfortunately. While configuring ESPHome is easy, and I did manage to get some readings almost right away, I found that after a couple of minutes it would stop seeing any signal whatsoever from any of the sensors.

Digging around, I found that this is not uncommon. There are two ESPHome issues from 2019, #317 and #735, that report this kind of problem, with no good final answer on how to solve it, and they are unfortunately locked to collaborators, so I can’t leave breadcrumbs for the next person in there — which is why I am now writing this; hopefully it’ll save headaches for others.

The problem, as detailed in #735, is that the BLE scan parameters need to be adjusted to avoid missing the sensors’ broadcasts. I tried a few combinations, and in the end found that disabling the “active” scan worked — that is, letting the ESP32 passively listen to the broadcasts, without trying to actively scan the Bluetooth channels, seemed to keep it stable, now for over 24 hours. It should also be, as far as I can tell, less battery-draining.

The final configuration looks something like this:

esp32_ble_tracker:
  scan_parameters:
    duration: 300s
    window: 48ms
    interval: 64ms
    active: False

sensor:
  - platform: xiaomi_cgg1
    mac_address: "XX:XX:XX:XX:XX:XX"
    temperature:
      name: "Far Corner Temperature"
    humidity:
      name: "Far Corner Humidity"
    battery_level:
      name: "Far Corner Battery Level"

The actual scan parameters should be, as far as I can tell, ignored when the active scan is disabled. But since it works, I don’t dare touch it yet. The code in ESPHome doesn’t make it very clear whether those parameters are entirely ignored when active scanning is disabled, and I have not spent enough time going through the deep stack to figure this out for certain.

The only unfortunate part of having it set up this way is that, by default, Home Assistant will report all of the sensors in the same “room”, despite them being spread all over the apartment (okay, not all over the apartment in this case, but around the winter garden).

I solved that by just disabling the default Lovelace dashboard and customizing it myself. It turns out it’s much nicer to customize those dashboards than to use the default, and it actually makes a lot more sense to look at things that way.

Looking To The Future

So now I have, for the first time, a good reason to use Home Assistant and to play around with it a bit. I actually have interesting “connected home” ideas to go with it — but they mostly rely on getting the pantograph windows working, as they don’t currently seem to open at all.

If I’m correct in my understanding, we’ll need to get the building managers to come and fix the electronic controls, in which case I’ll ask for something I can control somehow. And then it should be possible to have Home Assistant open the windows in the morning, assuming it’s not raining and the temperature is such that a bit of fresh air would be welcome (which is most of the summer here).
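
If and when the windows do show up as controllable entities, the logic itself would be tiny. Here is a hypothetical sketch written as a Home Assistant python_script (which is why there are no imports: the hass object is injected by the sandbox); every entity ID is a placeholder for whatever the real setup would end up exposing.

# python_script body: hass is provided by Home Assistant itself.
temperature = float(hass.states.get("sensor.far_corner_temperature").state)
raining = hass.states.get("binary_sensor.rain_expected").state == "on"

# Open the (hypothetical) winter garden window only on dry, warm mornings.
if not raining and temperature > 18:
    hass.services.call("cover", "open_cover",
                       {"entity_id": "cover.winter_garden_window"})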

It also gives me a bit more of an incentive to finish my acrylic lamps work — it would be easy, I think, to use one of those as a BLE bridge that just happens to have a NeoPixel output channel. And if I’m going “all in” on wiring stuff into Home Assistant, it would also allow me to use the acrylic lamps as a way to signal stuff.

So possibly expect a bit more noise on this front from me, either here or on Twitter.

Update 2020-12-06: The sensors I wrote about above are older-firmware CGG1 sensors that don’t use encryption. There appears to be a newer version out that does use encryption, and thus requires extracting a bind key. Please follow the reported issue for updates.

RPC Frameworks and Programming Languages

RPC frameworks are something I never thought I would be particularly interested in… until I joined the bubble, where nearly everything used the same framework, which made RPC frameworks very appealing. But despite my previous and current employers having released two similar RPC frameworks (gRPC and Apache Thrift respectively), they are not really that commonly used in Open Source, from what I can tell. D-Bus technically counts, but it’s also a bus messaging system, rather than a simpler point-to-point RPC system.

On the proprietary software side, RPC/IPC frameworks have existed for decades: CORBA was originally specified in 1991, and Microsoft’s COM was released in 1993. Although these are technically object models rather than just RPC frameworks, they fit into the “general aesthetics” of the discussion.

So, what’s the deal with RPC frameworks? Well, in the general sense, I like to represent them as a set of choices already made for you: they select an IDL (Interface Description Language), they provide some code generation tool leveraging the libraries they select, and they decide how structures are encoded on the wire. They are, by their own nature, restrictive rather than flexible. And that’s the good thing.

Because if we considered the most flexible option, we’d be considering IP as an RPC framework, and that’s not good — if all we have is IP, it’s hard for two components developed in isolation to talk to each other. That’s why we have higher-level protocols, and that’s why even just using HTTP as an RPC protocol is not good enough: it doesn’t define anywhere close to the semantics you need to be able to use it as a protocol without knowing both client and server code.

And one of the restrictions I think RPC frameworks are good for is making you drop the conventions of specific programming languages — or at least of whichever programming languages they didn’t take after. Clearly, the various RPC frameworks take inspiration from different starting languages, and so their conventions feel more or less at ease in each language depending on how far it is from the starting language.

So for instance, if you look at gRPC, errors are returned with a status code and a detailed status structure, while in Thrift you declare specific exception structures that your interfaces can throw. The two options make different compromises, and they require different amounts of boilerplate code to feel at ease in different languages.
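
To make the difference concrete, here is a minimal sketch in Python. The gRPC error type and status codes are real, but the service, method, exception, and field names are hypothetical stand-ins for whatever your own IDL would generate.

import grpc  # from the grpcio package

from jobservice.ttypes import InvalidJob  # hypothetical Thrift-generated module

def process_with_grpc(stub, request):
    # gRPC style: any failure surfaces as a grpc.RpcError carrying a status
    # code plus a free-form details string.
    try:
        return stub.ProcessJob(request)
    except grpc.RpcError as err:
        if err.code() == grpc.StatusCode.INVALID_ARGUMENT:
            raise ValueError(err.details()) from err
        raise

def process_with_thrift(client, job):
    # Thrift style: the IDL declares which exception structures a method may
    # throw, and the generated code raises them as ordinary exceptions.
    try:
        return client.processJob(job)
    except InvalidJob as err:
        raise ValueError(err.why) from err  # field names come from the IDL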

There are programming languages, particularly in the functional family (I’m looking at you, Erlang!), that don’t really “do” error checking — if you made a mistake somewhere, you expect that some type of error will be raised/thrown/returned, and everything behind it will simply fall over. So an RPC convention with a failure state and a (Adam Savage voice) “here’s your problem” long stack trace would fit them perfectly fine.

This would be the equivalent of having HTTP only ever return error codes 400 and maybe 500 — client error or server error, and that’s about it. You deal with it; after all, it’s nearly always a human in front of a computer looking at the error message, no? Well…

It turns out that being specific about what your error messages are can be very useful, particularly when interacting at a distance (either physical distance, or the distance of not knowing the code of whatever you’re talking to) — which is how HTTP 401 is used to trigger an authentication request on most browsers. If you wanted to go a step further, you could consider a 451 response as an automated trigger to re-request the same page from a VPN in a different country (particularly useful with GDPR-restricted news sources in the USA, nowadays).

Personally, I think this is the reason why the dream of thin client libraries, in my experience, stays a dream. While, yes, with a perfectly specified RPC interface definition you could just use the RPC functions as if they were a library themselves… that usually means that the calls don’t “feel” correct for the language, for any language.

Instead, I personally think you need a wrapper library that can expose the RPC interfaces in a language-native way — think builder patterns in Java, and context managers in Python. Not doing so leads, in my experience, either to people implementing their own wrapper libraries that you have no control over, or to pretty bad code overall, because the people who know the language refuse to touch the unwrapped client.
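
As a sketch of what I mean, using Python context managers and gRPC: the channel handling is the real gRPC API, while the stub and message names are hypothetical placeholders for generated code.

import grpc

# Hypothetical generated code; with protobuf these would come from modules
# like example_pb2 / example_pb2_grpc.
from example_pb2 import ResizeRequest
from example_pb2_grpc import ExampleServiceStub

class ExampleClient:
    """Thin wrapper so callers see a context manager, not channels and stubs."""

    def __init__(self, target="localhost:50051"):
        self._target = target
        self._channel = None
        self._stub = None

    def __enter__(self):
        self._channel = grpc.insecure_channel(self._target)
        self._stub = ExampleServiceStub(self._channel)
        return self

    def __exit__(self, *exc_info):
        self._channel.close()
        return False

    def resize(self, image_bytes, width, height):
        # One native-feeling method per RPC, hiding the request message.
        request = ResizeRequest(image=image_bytes, width=width, height=height)
        return self._stub.Resize(request).image

# Callers never touch channels, stubs, or request messages:
#
#     with ExampleClient() as client:
#         thumbnail = client.resize(photo, 320, 240)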

This is also, increasingly, relevant for local tooling — because honestly I’d rather have an RPC-based interface over Unix domain sockets (which allow you to pass authentication information) than run command line tools as subprocesses and try to parse their output. And while for simpler services signal-based communication or very simple “text” protocols work just as well, there’s value in having a “lingua franca” spoken between different services.
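
For the curious, this is what the authentication part buys you. A minimal, Linux-only sketch: the socket path and the JSON payload are arbitrary examples, but SO_PEERCRED is the real mechanism that tells the server which process, user, and group are on the other end of the socket.

import json
import socket
import struct

SOCKET_PATH = "/run/example-rpc.sock"  # arbitrary example path

def serve_one_request():
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCKET_PATH)
    server.listen(1)

    conn, _ = server.accept()
    # SO_PEERCRED (Linux-only) returns the pid/uid/gid of the peer process:
    # authentication information no TCP port or parsed CLI output gives you.
    creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize("3i"))
    pid, uid, gid = struct.unpack("3i", creds)

    request = json.loads(conn.recv(4096))
    response = {"caller_pid": pid, "caller_uid": uid, "caller_gid": gid,
                "echo": request}
    conn.sendall(json.dumps(response).encode("utf-8"))
    conn.close()
    server.close()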

I guess what I’m saying is that, unlike with programming languages, I do think we should make, and stick to, choices of RPC systems. The fact that for the longest time most Windows apps could share the same basic IPC/RPC system was a significant advantage (nowadays there’s… some confusion, at least to my eyes — and that probably has something to do with the number of localhost-only HTTP servers running on my machines).

In the Open Source world, it feels like we don’t really like the idea of taking options away – which was clearly visible when the whole systemd integration started – and that makes choices, and integrations, much harder. Unfortunately, that also means a significantly higher cost to integrate components together — and a big blame game when some of the bigger, not-open players decide to make some of those choices (cough application-specific passwords cough).

Revisiting Open Source Washing Machines (in Theory)

A number of years ago I wrote a post about the idea of Free Software washing machines, using it as an allegory to point out how silly it might be to attack people for using products with closed-source firmware when no alternative is readily available. At the time, the idea of wanting an open source washing machine was fairly ludicrous, despite my pointing out that it would have some interesting side effects for programmability.

Well, in 2020 I would actually say that I really wish we had more Open Source washing machines, and in general more hackable home appliances (or “white goods” as they are called over here), particularly now that connected appliances seem to be both trendy and the butt of many jokes.

It’s not just because having access to the firmware of everything is an interesting exercise, but because most appliances last for years — which means they are “left behind” as technology moves on, including their user interfaces.

Let’s take washing machines, for instance — despite being older than the one in my apartment, my mother’s LG washing machine has a significantly more modern user interface. The Zanussi one we have here in London has one big knob to select the program – mostly vestigial from the time when it would be spring-loaded and move along to select the various phases – and then a ton of buttons to select things like drying mode (it’s a washer/dryer combo), drying time, and the delayed start (awesome feature). You can tell that the buttons were an addition to the interface, and that the knob is designed to be as similar to the previous interface as possible. And it turns out the buttons are not handy: both drying time and delay have only one button each — which means you can only increase those values: if you miss your target, you need to go back to zero and up again.

On the other hand, my mother’s LG also has a knob — but the knob is just a free-spinning rotary encoder connected to a digital controller. While her model is not a dryer, I’m reasonably sure the machine has a delayed start feature, which is configured by pressing one button and then rotating the wheel. A more flexible interface, with a display a bit more capable than the two multi-segment ones our current machine has, would do wonders for usability, and that’s without going into any of the possible features of a “connected appliance”. Observe-only, that is — I would still love to see a notification on our phones when the washing machine completes, so that we don’t forget we have clean clothes that need to be hung up to dry. Yes, we actually forget sometimes, particularly before the pandemic, when we would leave it to delay-run from the morning.

Replacing a washing machine just because the user interface is bad is a horrible thing to do to the planet. And in particular when living in rented accommodation, you don’t own the white goods, and even when they are defective, you don’t get to choose them — you end up most of the time with whichever is the cheapest one in the shop, power efficiency be damned, since the landlords rarely pay for the electricity. So having hackable, modular washing machines would be just awesome: I could ask our landlord “Hey, can I get a smart module installed for the washing machine? I’ll pay the £50 it costs!” (seriously, if it costs more than that, it would be a rip-off — most of the controls you need for this can hardly be more complicated than a Feather M4!)

Oh, and even if I just had access to the firmware of this washer/dryer, I might be able to fix the bug where the “program finished” beeper does not wait for the door’s lock magnet to disengage before starting. The number of times I need to set a timer to remind myself to go and take the towels out in five minutes is annoying as heck.

But it’s not just washing machines that would be awesome to have hackable and programmable. We have a smallish microwave and convection oven combo. I got it in Dublin, and I chose this model because it was recommended by an acquaintance for its insistent beeping when the timer completes. If you have ever experienced hyperfocus to any degree, you probably understand why such a feature is useful.

But in addition to the useful feature, the oven comes with a bunch of pretty much useless ones. There are a number of “pre-programmed” options for defrosting, making popcorn, and other things like that, which we would never use. Not just because we don’t eat them, but also because they are rarely recommended — if you ever watch cooking channels such as How To Cook That, you find that professionals usually suggest specific ways to use the microwave — including Ann Reardon’s “signature” «put it in the microwave for 30 seconds, stir, put it back for 30 seconds, stir, …».

And again in terms of user interfaces, the way you configure the convection temperature is by clicking the “Convection” button to go from full power (200W) down — and if you got it wrong, oops! And then if you turn the knob (this time a free-spinning one, at least), you’re setting the timer, without pre-heating. If you want to pre-heat you need to cancel it all and redo the whole thing, and… you see my point.

This is a very simple appliance, and it works perfectly fine. But if I could just open it up and replace the control panel with something nicer, I would love to. I think I would like something I can connect to a computer (or maybe connect a USB thumb drive to), to configure my own saved parameters, selecting for instance “fish fingers” and “melted butter”, which are the most likely uses of the oven for us at home.

But again, this would require a significant change in the design of appliances, which I don’t think is going to happen any year now. It would be lovely, and I think there might be a chance for Open Source and Open Hardware amateurs to at least show that it is possible — but it’s the kind of project that I can only wish for, with no hope of working on it myself, not just for lack of time but for lack of space: if you wanted to try hacking on a washing machine, you would definitely need a more spacious and better-equipped workshop. My limit is still acrylic lamps.