What To Look For In A Glucometer

I received a question some time ago about which glucometer I would recommend, if any, and I thought I would put down some notes about this, since I do have opinions, but I also know full well that I’m not the right person to recommend medical devices in the first place.

So just so we’re clear: I’m not a medical professional, and I can only suggest what to look for when choosing a blood sugar meter. If your GP or diabetologist recommends one in particular over others, I would recommend you follow their suggestion over mine.

So first of all, what are we talking about? I’m going to be focusing on blood glucometers exclusively, because the choice among CGMs (Continuous Glucose Monitors) and “flash” meter solutions (such as the FreeStyle Libre) is more limited, and I have even less experience with those. I also found from personal experience, and from talking with a number of other CGM users, that those tend to be a lot more temperamental and a much more personal choice.

Blood glucometers, as the name implies, work by measuring the sugar in the blood by putting a drop of blood (usually taken by pricking a finger) onto a chemically reactive “strip” (with the exception of one device to my knowledge). I have over the years “reviewed” a number of meters on this blog, simply because I end up getting my hands on them out of hobby nowadays, to reverse engineer and implement support (when feasible) in my open source tooling.

Let me repeat: I do not have the technical expertise to judge the clinical effectiveness of glucometers. I do know that most of them have a wide margin of error on their readings, to the point that you may have noticed me annoyed at the different readings on a recent stream. Most meters provide details about their accuracy in the paperwork, and they assert you can use a certain calibration solution to verify that your particular device satisfies that calibration. Unfortunately, when I looked at this in the past, I couldn’t figure out whether there is a universal solution that could be used to compare the readings of the same exact concentration across different meters.

So, if I wasn’t using the Libre, what glucometer would I be using, and why? Most likely I’d still be using the Accu-Chek Mobile. As far as I am aware it’s still the only model of meter that uses a cartridge system rather than strips, and when going out for dinner or coffee, or being in the office, it’s nice to have the option to just check your blood sugar without having to worry about getting blood all over the place, or having to find where to throw the now-used strip. While the device is not the smallest glucometer out there, the carrying case does still make for a much more compact solution than most of the alternatives, and the USB “thumbdrive” with data and graphs makes it very easy to access with most devices. I have not tried the Bluetooth integration kit, though I did order one, but I guess that was needed for the increasing number of people who do not use a “computer” daily, but do have access to a smartphone.

But this is just my choice, obviously. If you live in a country that does not provide to you the strips for free (or you don’t have an official diagnosis for which they would provide them for free), then the cost of the supplies is likely the main significant factor. Many of the manufacturers appear to have taken to the “razor and blades” approach of giving out the meters for free, or nearly free, but charging you (or your healthcare system) heavily for the strips. So it might be worth looking at the price of strips in your country to figure out on the long term what’s the cost of using a certain meter or another.
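To make the “razor and blades” point concrete, here is a minimal sketch of the long-term comparison. The prices and the four-tests-a-day figure are made up for illustration, so plug in whatever applies in your country.

```python
def total_cost(meter_price, strip_price, tests_per_day, days):
    """One-off meter price plus the cost of every strip used over `days`."""
    return meter_price + strip_price * tests_per_day * days


# Hypothetical prices, not real quotes for any actual meter.
free_meter = total_cost(meter_price=0.0, strip_price=0.50, tests_per_day=4, days=365)
paid_meter = total_cost(meter_price=30.0, strip_price=0.20, tests_per_day=4, days=365)

print(f"'Free' meter, expensive strips: {free_meter:.2f}")  # 730.00
print(f"Paid meter, cheaper strips:     {paid_meter:.2f}")  # 322.00
```

With these assumed numbers the “free” meter costs more than twice as much over a year, which is why the strip price matters more than the sticker price of the device.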

This is my best guess on why people appear to be finding my reviews of Chinese glucometers: to the best of my understanding there’s a number of countries, including Russia, where meters and strips are paid out of pocket, and so people turn to AliExpress, because there’s enough supply — most Chinese meters appear to use the same strips, and sellers undercut each other all the time, particularly when the strips are about to expire.

If price and availability are not an issue, and neither is the cartridge vs strips question, then it comes down to features. Is the meter going to be used by an older person with eyesight issues? Look for a very big display. Is it going to be used by someone who has trouble estimating at a glance what is okay and what isn’t (noting that this category pretty much cuts across age groups and education)? Look for a colour display that includes Green/Yellow/Red indicators, possibly one where the thresholds are configurable.

Some of the features also depend on what your doctors’ take on technology is. My diabetologist in Dublin didn’t have any diabetes management system for me to upload readings to, so I settled on running the exports with whichever software was available and sending them over as a PDF, while my new support team here in London uses Abbott’s LibreView. Am I completely comfortable turning to a cloud solution? No. But in the grand scheme of things it’s a tradeoff that works well for me, particularly after a year of the Covid-19 pandemic, during which showing up at the hospital just to hand off my meter to be downloaded into the system would not have been a fun experience.

So if your medical team has set up specific software for you to upload your data to, you probably want to choose a compatible meter. And that might mean either one that has PC connectivity so you can download it with a specific client, or one that has Bluetooth connectivity so that you can download it with your phone. There are additional complications for macOS users, and pretty much zero support for Linux users outside of devices supported by my glucometerutils, Tidepool or other similar solutions.

Different software also has different “analytics” of readings, with averages before and after meals and bucketed by time of day. Honestly, I don’t think I ever had enough useful information from a blood meter to build significant statistics out of it, but if that’s your cup of tea, that might be a good feature to choose your meter by (as long as you’re not using glucometerutils, in which case just get any meter and build the analytics on top of it).
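As a rough idea of what “bucketed by time of day” means in practice, here is a minimal sketch. The bucket boundaries, the `(timestamp, mg/dL)` tuple format and the sample values are all assumptions for illustration; this is not the export format of glucometerutils or of any vendor software.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def bucket_for(hour):
    """Map an hour of the day (0-23) to a coarse, arbitrarily chosen bucket."""
    if 5 <= hour < 11:
        return "morning"
    if 11 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 23:
        return "evening"
    return "night"


def average_by_bucket(readings):
    """Average the reading values per time-of-day bucket."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        buckets[bucket_for(timestamp.hour)].append(value)
    return {name: mean(values) for name, values in buckets.items()}


# Made-up sample readings as (timestamp, mg/dL) pairs.
sample = [
    (datetime(2021, 3, 1, 8, 0), 100),
    (datetime(2021, 3, 2, 7, 30), 120),
    (datetime(2021, 3, 1, 20, 15), 150),
]
print(average_by_bucket(sample))
```

With only a handful of readings per bucket the averages say very little, which is the same problem of not having enough data to build significant statistics.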

Again depending heavily on who’s going to be using it, it’s important to take into consideration the physical size and a few of the practicalities of using a blood meter. Smaller meters work great if you have small hands, but they would be too fiddly to operate for someone with large, not so nimble hands. That’s why a lot of the models aimed at older people (with sound, large display, etc) are often designed to be big and with large and mushy buttons, rather than small and clicky. The same goes for strips: as I noted on the GlucoMen Areo review, Menarini did a very nice job with fairly large strips that are easier to handle, compared to, say, the tiny strips used by FreeStyle or OneTouch. But even with tiny strips, the meter can help with making it easier to handle; both of the Chinese meters I have reviewed have a lever to eject the strip directly into the trash, rather than having to take it out with your fingers. While I suspect this may just be cultural, it’s definitely a useful feature to have for those who are squeamish about handling bloodied strips.

I would say that it’s pretty much impossible to have a meter fit all of the best characteristics, because a lot of those are subjective: I have nimble fingers, good numeracy, and a few reservations about sharing my data with unknown cloud providers, but also a medical team that does indeed use diabetes management systems. So if I had to be looking for a new meter (rather than the Libre) right now, I would probably be looking for a compact meter that can be downloaded either with an application that exports directly to my doctors’ system, or with one that can generate a file I can email them, and the Accu-Chek still fits the bill: it does not have a colour display to tell me whether something is in range or not, and its buttons are clicky and not too wide, but it’s a tradeoff that works for me.

This should also probably explain why I talk about the stuff I talk about when I write my glucometer reviews: it’s all about how the device feels, what features it has, and how well it works to do what you want. Some of the models are more intuitive than others, and some have tradeoffs that don’t work for me, but I can see where they came from. I cannot compare the accuracy, since I don’t have the training to do so, but I can compare the rest of the features, and that’s what I focus on: it’s what most people will do anyway.

Rose Tinted Glasses: On Old Computers and Programming

The original version of this blog post was going to be significantly harder to digest and it actually was much more of a rant than a blog post. I decided to discard that, and try to focus on the positives, although please believe me when I say that I’m not particularly happy with what I see around me, and sometimes it takes strength not to add to the annoying amount of negativity out there.

In the second year of the Coronavirus pandemic, I (and probably a lot more people) have turned to YouTube content more than ever, just to keep myself entertained in lieu of having actual office mates to talk with day in and day out. This meant, among other things, noticing the retrocomputing trend a lot more: a number of channels are either dedicated to talking about games from the 80s and 90s and computers from the same era, or they seem to at least spend a significant amount of time on those. I’m clearly part of the target audience, having grown up with some of those games and systems, and now being in my 30s with disposable income, but it does make me wonder sometimes about how we are treating the nostalgia.

One of the things that I noted, and that actually does make me sad, is when I see some video insisting that old computers were better, or that people who used them were smarter because many (Commodore 64, Apple II, BBC Micro) only came with a BASIC interpreter, and you were incentivised to learn programming to do pretty much anything with them. I think that this thesis is myopic and lacks not just in empathy, but also in understanding of the world at large. Which is not to say that there couldn’t be good ways to learn from what worked in the past, and make sure the future is better.

A Bit Of Personal History

One of the things that is clearly apparent watching different YouTube channels is that there are chasms between different countries, when it comes to having computers available at an early age, particularly in schools. For instance, it seems like a lot of people in the USA had access to a PET in elementary or junior high schools. In the UK instead the BBC Micro was explicitly designed as a learning computer for kids, and clearly the ZX Spectrum became the symbol of an entire generation. I’m not sure how much bias there is in this storytelling: it’s well possible that for most people, all of these computers were not really within reach, and only a few expensive schools would have had access to them.

In Italy, I have no idea what the situation was when I was growing up, outside of my own experience. What I can say is that until high school, I hadn’t seen a computer in school. I know for sure that my elementary school didn’t have any computer, not just for the students, but also for the teachers and admins, and it was in that school that one of the teachers took my mother aside one day and told her to make me stop playing with computers because «they won’t have a future». In junior high, there definitely were computers for the admins, but no student was given access to anything. Indeed, I knew that one of the laboratories (that we barely ever saw, and really never used) had a Commodore (either 64 or 128) in it. This was in the same years that I finally got my own PC at home: a Pentium 133MHz. You can see there is a bit of a difference in generations there.

Indeed, it might even sound strange that I had a Commodore 64 at all. As far as I know, I was the only one having it in my school: a couple of other kids had a family PC at home (which later I kind of did too), and a number of them had NES or Sega Master Systems, but the Commodore’s best years were long gone by the time I could read, so how did I end up with one? Well, as it turns out, not as a legacy from anyone older than me, which would be the obvious option.

My parents bought the Commodore 64 around the time I was seven, or at least that’s the best I can date it. It was, to the best of my knowledge, after my grandfather died, as I think he would have talked a bit more sense into my mother. Here’s the thing: my mother has always had a thing for encyclopaedias and other book collections, so when my sisters and I were growing up, the one thing we never missed was access to general knowledge, whether it was a generalist encyclopedia with volumes dedicated to the world, history, and science, or a “kids’ encyclopedia” that pretty much only covered stuff aimed at preteens, or a science one that went into details of the state of the art of scientific thinking in the 80s.

So when a company selling a new encyclopedia, supposedly compiled and edited locally, called my parents up and offered a deal of 30 volumes, bound in a nice and green cover, and printed in full colour, together with a personal computer, they lapped it up fairly quickly. Well, my mother did mostly, my father was never someone for books, and couldn’t give a toss generally about computers.

Now, to be honest, I have fond memories of that encyclopedia, so it’s very possible that this was indeed one of the best purchases my parents made for me. Not only was most of it aimed at elementary-to-junior high ages, including a whole volume on learning grammar rules and two on math, but it also came with some volumes full to the brim of questionable computer knowledge.

In particular, the first one (Volume 16, I still remember the numbers) came with a lot of text describing computers, sometimes in details so silly that I still don’t understand how they put it together: it is here that I first read about core memory, for instance. It also went into long details about videogames of the time, including text and graphical adventures. I really think it would be an interesting read for me now that I understand and know a lot more about the computers and games of the time.

The second volume focused instead on programming in BASIC. Which would have been a nice connection to the Commodore 64 if it wasn’t that the described language was not the one used by the Commodore 64 in the first place, and it didn’t really go into details of how to use the hardware, with POKE and PEEK and the like. Instead it tried to describe some support for printers and graphics, that never worked on the computer I actually had. Even when my sister got a (second) computer, it came with GW-BASIC and it was also not compatible.

What the second volume did teach me, though, was something more subtle, which would take me many years to understand fully. And that is that programming is a means to an end, for most people. The very first example of a program in the book is a father-daughter exercise in writing a BASIC program to calculate the area of the floor of a room based on triangles and Heron’s Formula. This was a practical application, rather than teaching concepts first, and that may be the reason why I liked learning from that to begin with.
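For reference, the floor-area exercise translates to very little code in a modern language. This is a minimal Python sketch of the idea rather than the book’s BASIC listing (which isn’t reproduced here); the function names and the example room are made up. Heron’s formula gives the area of a triangle with sides a, b and c as the square root of s(s-a)(s-b)(s-c), where s is half the perimeter.

```python
import math


def triangle_area(a, b, c):
    """Area of a triangle from its three side lengths, via Heron's formula."""
    s = (a + b + c) / 2  # semi-perimeter
    product = s * (s - a) * (s - b) * (s - c)
    if product < 0:
        raise ValueError("these side lengths cannot form a triangle")
    return math.sqrt(product)


def floor_area(triangles):
    """Total area of a floor that has been split up into triangles."""
    return sum(triangle_area(a, b, c) for a, b, c in triangles)


# A 3 m by 4 m rectangular room, split along its 5 m diagonal.
print(floor_area([(3, 4, 5), (3, 4, 5)]))  # 12.0
```

The appeal of the approach is that you only need a tape measure: any room with straight walls can be split into triangles, and you never have to measure an angle.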

Now let me rant aside for a moment that the last time I wrote something about teaching, I ended up tuning out of some communities because I got tired of hearing someone complain that I cannot possibly have an opinion on teaching materials without having taught in academia. I have a feeling that this type of behaviour is connected with the hatred for academia that a number of us have. Just saying.

You may find it surprising that these random volumes of an encyclopedia my mother brought home when I could barely read would stay this long with me, but the truth is that I pretty much carried them along with me for many years. Indeed, they had two examples in the book that I nearly memorized, that were connected to each other. The first was a program that calculated the distance in days between two dates, explaining how the Gregorian calendar worked, including the rules for leap years around centuries. The second used this information to let you calculate a “biorhythm” that was sold as some ancient Greek theory but was clearly just a bunch of “mumbo-jumbo” as Adam Savage would say.
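The date-distance program boils down to the Gregorian leap year rule: every fourth year is a leap year, except for century years, unless they are divisible by 400. A minimal sketch in Python, spelled out by hand instead of leaning on the standard `datetime` module (the function names are made up, and this is not the book’s listing):

```python
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def is_leap_year(year):
    """Gregorian rule: divisible by 4, but centuries only if divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def day_number(year, month, day):
    """Count days from a fixed origin (1 January of year 1 is day 1)."""
    previous = year - 1
    # Full years before this one, plus one extra day per leap year among them.
    days = previous * 365 + previous // 4 - previous // 100 + previous // 400
    days += DAYS_BEFORE_MONTH[month - 1]
    if month > 2 and is_leap_year(year):
        days += 1  # 29 February has already happened this year
    return days + day


def days_between(start, end):
    """Distance in days between two (year, month, day) tuples."""
    return day_number(*end) - day_number(*start)


print(days_between((2000, 2, 28), (2000, 3, 1)))  # 2, since 2000 was a leap year
print(days_between((1900, 2, 28), (1900, 3, 1)))  # 1, since 1900 was not
```

The two example calls show the century rule at work: 1900 and 2000 are both divisible by 4 and by 100, but only 2000 is divisible by 400.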

The thing with this biorhythm idea, though, is that it’s relatively straightforward to implement: the way they describe it is that there are three sinusoidal functions that set up three “characteristics” on different period lengths, so you calculate the “age in days” and apply a simple mathematical formula, et voilà! You have some personalised insight that is worth nothing but some people believe in. I can’t tell for sure if I ever really believed in those, or if I was just playing along like people do with horoscopes. (One day I’ll write my whole rant on why I expect people may find horoscope sign traits to be believable. That day is not today.)
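To show just how little there is to it, here is a minimal Python sketch. The post does not give the period lengths; the 23, 28 and 33 day cycles below are the ones commonly quoted for “biorhythms”, and may not match what the encyclopedia used. The output is exactly as meaningless as the theory.

```python
import math
from datetime import date

# Commonly quoted cycle lengths in days (an assumption, see above).
CYCLES = {"physical": 23, "emotional": 28, "intellectual": 33}


def biorhythm(birth, today):
    """Return each cycle's value in [-1, 1] for the given age in days."""
    age_in_days = (today - birth).days
    return {
        name: math.sin(2 * math.pi * age_in_days / period)
        for name, period in CYCLES.items()
    }


# Arbitrary example dates.
for name, value in biorhythm(date(1990, 1, 1), date(2021, 5, 1)).items():
    print(f"{name:>12}: {value:+.2f}")
```

Since every implementation just evaluates three sine waves, two apps can only disagree if they count the age in days differently or assume different period lengths.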

So, having a basis of something to play along with, I pretty much reimplemented this same idea over, and over, and over again. It became my “go to” hello world example, and with enough time it allowed me to learn a bit more of different systems. For example, when I got my Pentium 133 with Windows 95, and one of the Italian magazines made Visual Basic 5 CCE available, I reimplemented the same for that. When the same magazine eventually included a free license of Borland C++ Builder 1.0, as I was learning C++, I reimplemented it there. When I started moving to Linux more of the time and I wanted to write something, I did that.

I even got someone complaining that my application didn’t match the biorhythm calculated with some other app, and I had to find a diplomatic way to point out that there’s nothing scientific about either of those, so why should they even expect two apps to agree with each other?

But now I’m digressing. The point I’m making is that I have, over the years, kept the lessons learned from those volumes with me, in different forms, and in different contexts. As I said, it wasn’t until a few years back that I realized that for most people, programming is not an art or a fun thing to do in their spare time, but it’s just a means to an end. They don’t care how beautiful, free, or well designed a certain tool is, if the tool works. But it also means that knowing how to write some level of software is empowering. It gives people power to build the tools they don’t have, or to modify what is already there but doesn’t quite work the way they want.

My wife trained as a finance admin, used to be an office manager, and has some experience with CAFM software (Computer Aided Facilities Management). Most CAFM suites allow extensions in Python or JavaScript, to implement workflows that would otherwise be manual and repetitive. This is the original reason she had to learn programming: even in her line of work, it is useful knowledge to have. It also comes with the effect of making it easier to understand spreadsheets and Excel, although I would say that there’s plenty of people who may be great at writing Python and C, but would be horrible Excel wranglers. Excel wrangling is its own set of skills and I submit to those who actually have them.

So Were Old Computers Better?

One of the often repeated lines is that old computers were better because either they were simpler to understand in one’s mind, or because they all provided a programming environment out of the box. Now, this is a particularly contentious point to me, because pretty much every Unix environment always had the same ability of providing a programming environment. But also, I think that the problem here is that there’s what I would call a “bundling of concerns”.

First of all, I definitely think that operating systems should come with programming and automation tools out of the box. But in fact that has (mostly) been the case since the time of Commodore 64 for me personally. On my sister’s computer, MS-DOS came with GW-BASIC first (4.01), and QBasic later (6.22). Windows 98 came with VBScript, and when I first got to Mac OS X it came with some ugly options, but some options nonetheless. The only operating system that didn’t have a programming environment for me was Windows 95, but as I said above, Visual Basic 5 CCE covered that need. It was even better with ActiveDesktop!

Now, as it turns out, even Microsoft appears to work to make it easier to code in Windows, with Visual Studio Code being free, Python being available in the Microsoft Store, and all those trimmings. So it’s hard to argue that there aren’t more opportunities to start programming now than there were in the early ’90s. What might be arguable is that nowadays you do not need to program to use a computer. You can use a computer perfectly fine without ever having learnt a programming language, and you don’t really need to know the difference between firmware and operating system, most of the time. The question becomes, whether you find this good, or bad.

And personally, I find it good. As I said, I find it natural that people are interested in using computers and software to do something, and not just for the experience of using a computer. In the same way I think most people would use a car to go to the places they need to go to, rather than just for the sake of driving a car. And in the same spirit of the car, there are people who enjoy the feeling of driving even when they don’t have a reason to, and there are people who find unnecessary things to be required when it comes to computers and technology.

I wish I found it surprising, but I just find it saddening that so many developers seem to be falling into the trap of thinking that just because they became creative by writing programs (or games, or whatever), the fact that computer users stopped having to learn programming means that they are less creative. John Scalzi clearly writes it better than me: there’s a lot of creativity in modern devices, even those that are attacked for being “passive consumption devices”. And a lot of that is not about programming in the first place.

What I definitely see is a pattern of repeating the behaviour of the generation that came before us, or maybe the one that came before them, I’m not sure. I see a number of parents (but thankfully by no means all of them) insisting that since they learnt their trade and their programming a certain way, their kids should have the same level of tools available, no more and no less. It saddens me, even sometimes angers me, because it feels so similar to the way my own father kept telling me I was wasting my time inside, and wanted me to go and play soccer as he did in his youth.

This is certainly not only my experience, because I have talked and compared stories with quite a few people over the years, and there’s definitely a huge number of geeks in particular who have been made fun of by their parents, and left scarred by that. And some of them are going to do the same to their kids, because they think their kids’ choice of hobbies is not as good as the ones we had in the good old days.

Listen, I said already in the past that I do not want to have children. Part of it has always been the fear of repeating the behaviour my father had with me. So of course I should not be the one to judge what others who do have kids do. But I do see a tendency from some, to rebuild the environment they grew up in, expecting that their kids would just pick up the same strange combination of geekiness they have.

At the same time I see a number of parents feeding the geekiness in their children with empowerment, giving them tools and where possible a leg up in life. Even this cold childfree heart warms up to see kids being encouraged to learn Scratch, or Minecraft.

What About All The Making, Then?

One of the constant refrains I hear is that older tools and apps were faster and more “creative”. I don’t think I have much in terms of qualifications to evaluate that. But I’m also thinking that for the longest time, creativity tools and apps were only free if you pirated them. This is obviously not to dismiss the importance of FLOSS solutions (otherwise why would I still be writing on the topic?) but the fact is that a lot of the FLOSS solutions for creativity appear to have a similar spirit to the computers in the ’80s: build the tools you want to be creative.

I’m absolutely sure that there will be people arguing that you can totally be creative with Gimp and Inkscape. I also heard a lot more professionals laughing in the face of such suggestions, given the lack of important features that tools like that have had in comparison with proprietary software for many years. They are not bad programs per se, but they do find their audience in a niche compared to Photoshop, Illustrator, or Affinity Designer. And it’s not to say that FLOSS tools can’t become that good. I have heard the very same professionals who sneered (and still sneer) at Inkscape, point out how Krita (which has a completely different target audience) is a fascinating tool.

But when we look back at the ’90s, not even many FLOSS users would consider Gimp a useful photo-editing tool. If you didn’t have the money for creativity tools, your options were most likely a pirate copy of Photoshop, or maybe, if you were lucky and an Italian magazine gave it away, a license for Macromedia xRes 2.0. Or maybe FreeHand. Or Micrografx Windows Draw!.

The thing is, a lot of free-but-limited tools online are actually the first time that a wide range of people have finally been able to be creative. Without having to be “selected” as a friend of Unix systems. Without having to pirate software to be able to afford it, and without having to pony up a significant investment for something that they may not be able to make good use of. So I honestly welcome that, when it comes to creativity.

Again: the fact that someone cannot reason around code, or the way that Inkscape or Blender work, does not mean that they are less creative, or less skilled. If you can’t see how people using other tools are being just as creative, you’re probably missing a lot of points I’m making.

But What About The Bloated Web?

I’ve been arguing for less bloat in… pretty much everything, for the past 17 years on blogs and other venues. I wrote tools to optimize (even micro-optimize in some cases) programs and libraries so that they perform better on tiny systems. I have worked on Gentoo Linux, which pretty much allows you to turn off everything you can possibly turn off so you can build the most minimalistic system you can think of. So I really don’t like bloat.

So is the web bloated? Yes, I’d say so. But not all of it is bloat, even when people complain about it. I see people suggesting that UTF-8 is bloat. That dynamic content is bloat. That emojis are bloat. Basically anything they don’t need directly is bloat.

So it’s clearly easy to see how your stereotypical 30-something US-born-and-raised, English-only-speaking “hacker” would think that an unstyled, white-on-black-background (or worse, green-on-black) website in ASCII would be the apotheosis of usable web. But that is definitely not what everyone would find perfect. People who speak languages needing more than ASCII exist, and are out there. Heck, people for whom the actual bloat from UTF-8 (vs UTF-16) is the wasteful optimization for ASCII are probably the majority of the world! People who cannot read on a black background exist, and they are even developers themselves at times (I’m one of them, which is why all my editors and terminals use light backgrounds: I get migraines from black backgrounds and dark themes).
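The UTF-8 versus UTF-16 point is easy to check for yourself. A minimal Python sketch follows; the sample strings are arbitrary, and `utf-16-le` is used so that the byte order mark doesn’t inflate the count.

```python
def encoded_sizes(text):
    """Return the (UTF-8, UTF-16) byte lengths of a string."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for sample in ["hello", "Привет", "こんにちは"]:
    utf8, utf16 = encoded_sizes(sample)
    print(f"{sample}: {utf8} bytes in UTF-8, {utf16} bytes in UTF-16")

# hello: 5 bytes in UTF-8, 10 bytes in UTF-16
# Привет: 12 bytes in UTF-8, 12 bytes in UTF-16
# こんにちは: 15 bytes in UTF-8, 10 bytes in UTF-16
```

ASCII text is half the size in UTF-8, Cyrillic breaks even, and Japanese kana take 50% more space than they would in UTF-16: the encoding is tuned for English speakers.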

Again, I’m not suggesting that everything is perfect and nothing needs to change. I’m actually suggesting that a lot needs to change, but not everything needs to change. So if you decide to tell me that Gmail is bloated and slow and use that as the only comparison to ’90s mail clients, I would point out to you that Gmail has tons of features that are meant to keep users from shooting themselves in the foot, as well as being a lot more reliable than Microsoft Outlook Express or Eudora (which I know has lots of loyal followers, but I could never get behind myself), and also that there are alternatives.

Let me beat this dead horse a bit more. Over on Twitter when this topic came up, I was given the example of ICQ vs Microsoft Teams. Now the first thing is, I don’t use Teams. I know that Teams is an Electron app, and I know that most Electron apps are annoyingly heavy and use a ton of resources. So, fair, I can live with calling them “bloated”. I can see why they chose this particular route, and disagree with it, but there is another important thing to note here: ICQ in 1998 is barely comparable with a tool like Teams, which is pretty much a corporate beast.

So instead, let’s try to compare something that is a bit closer: Telegram (which I’m already known to use, rather than anything that I would have a conflict of interest on). How fast is Telegram to launch on my PC? It’s pretty much a single click to start and it takes less than a second on the beast that is my Gamestation. It also takes less than a second on my phone. How long did ICQ take to load? I don’t remember, but quite a lot longer, because I remember seeing a splash screen, which may well have been timed to stay on the screen for a second or so because the product manager requested that, like it happened at one of my old jobs (true story!).

And in that, would ICQ provide the same features as Telegram? No, not really. First of all, it was just messages. Yes, it’s still instant messaging and in that it didn’t really change much, but it didn’t have the whole “send and receive pictures” we have on modern chat applications: you ended up having to do peer-to-peer transfers, and good luck with that. It also had pretty much *no* server-side support for anything, at least when I started using it in 1998: your contact list was entirely client-side, and even the “authorization” to add someone to your friend list was a simple local check. There were plenty of ways to avoid these checks, too. Back in the day, I got in touch with a columnist from the Italian The Games Machine, Claudio Todeschini (who I’m still in touch with, but because life is strange and we met in person in a completely different situation many, many years later); the next time I re-installed my computer, having forgotten to back up ICQ data, I didn’t have him in my contacts anymore, and unsure whether he would remember me, I actually used a cracked copy of ICQ to re-add him to my contacts.

Again, this was the norm back then. It was a more naive world, where we didn’t worry that much about harassment, we didn’t worry so much about SWATing, and everything was just, well, simpler. But that doesn’t mean it was good. It only meant that if you did worry about harassment, if someone was somehow trying to track you down, if the technician at your ISP was actually tapping your TCP sessions, they would be able to. ICQ was not encrypted for many years after I started using it, not even c2s, let alone e2e like Telegram secret chats (and other chat clients) are.

Someone joked about trying to compare running software on the same machine to see the performance fairly, but that is an absolute non-sequitur. Of course we use a lot more resources in absolute terms, compared to 1998! Back then I still had my Pentium 133MHz, with 48MiB of RAM (I upgraded!), a Creative 3D Blaster Banshee PCI (because no AGP slots, and the computer came with a Cirrus Logic that was notorious for not working well with Voodoo 2), and a Radio card (I really liked radio, ok?). Nowadays, my phone has an order of magnitude or two more resources, and you can find 8051s just as fast.

Old tech may be fascinating and easier to get into when it comes to learning how it all fits together, but usable modern tech is meant to make trade-offs in favour of the users more and more. That’s why we have UIs, that’s why we have touch inputs, that’s even why we have voice-controlled assistants, much as a number of tech enthusiasts appear to want to destroy them all.

Again, this feels like a number of people are yelling “Kids these days”, and repeating how “in their days” everything was better. But also, I fear there are a number of people who just don’t appreciate how a lot of the content you see on YouTube, particularly in the PC space of the ’90s and early ’00s, is not representative of what we experienced back then.

Let me shout out to two YouTubers that I find are doing it right: LGR and RetroSpector78. The former is very open to point out when he’s looking at a ludicrous build of some kind, that would never be affordable back in the day; the latter is always talking about what would be appropriate for the vintage and usage of a machine.

Just take all of the videos that use CF2IDE or SCSI2SD to replace “spinning rust” hard drives of yonder. This alone is such a speed boost on loading stuff that most people wouldn’t even imagine. If you were to try to load a program like Microsoft Works on a system that would be perfect for the time except for the storage, you would be experiencing a significantly different loading time from what it was back in the day.

And, by the way, I do explicitly mean Microsoft Works, not Office because, as Avery pointed out on Twitter, that was optimized for load speed — by starting a ton of processes early on, trading memory usage for startup speed. The reason why I say that is because, short of pirated copies of Office, most people in the ’90s that I know would be able to use at best Works, because it came pre-installed on their system.

So, What?

I like the retrocomputing trend, mostly. I love Foone’s threads, because one of the most important things he does is explain stuff. And I think that, if what you want is to learn how a computer works in detail, it’s definitely easier to do that with a relatively uncomplicated solution first, and build up to more modern systems. But at the same time, I think there are plenty of abstractions that don’t need to be explained if you don’t want to. This is the same reason why I don’t think that using C to teach programming and memory is a great idea: you need to know too many details that are not actually meant to be understood by newcomers.

I also think that understanding the techniques used in both designing, and writing software for, constrained systems such as the computers we had in the ’80s and ’90s does add to the profession as a whole. Figuring out which trade off was and was not possible at the time is one step, finding and possibly addressing some of the bugs is another. And finally there is the point we’re getting to a lot lately: we can now build replacement components with tools that are open to everyone!

And you know what? I do miss some of the constrained systems, because I have personal nostalgia for them. I did get myself a Commodore 64 a couple of years ago, and I loved the fact that, in 2021, I can get the stuff I could have never afforded (or even didn’t exist) back when I was using it: fast loaders, SD2IEC, a power supply that wouldn’t be useful as a bludgeoning instrument, and a SCART cable to a nice and sharp image, rather than the fuzzy one when using the RF input I had to.

I have been toying with the idea of trying to build some constrained systems myself. I think it’s a nice stretch for something I can do, but with the clear note that it’s mostly art, and not something that is meant to be consumed widely. It’s like Birch Books to me.

And finally, if you only take a single thing away from this post, it is that you should always remember that a usable “bloated” option will always win over a slim option that nobody but a small niche of people can use.

After The Streams — Conclusion From My Pawsome Players Experiment

A few weeks ago I announced my intention to take part in the Cats Protection fundraiser Pawsome Players. I followed through with seven daily streams on Twitch (which you can find archived on YouTube). I thought I would at least write some words about the experience, and to draw some lines out of what worked, and what didn’t, and what to expect in the future.

But before I drop into dissecting the stream, I wanted to thank those who joined me and donated. We reached £290 worth of donations for Cats Protection, which is no small feat. Thank you, all!

Motivations

There are two separate motivations to look at when talking about this. There’s my motivation for having a fundraiser for Cats Protection, and then the motivations for me doing streams at all, and those need to be separated right away.

As for the choice of charity – both my wife and I love cats and kittens, we’re childfree cat people. The week happened to culminate in my wife’s birthday and so in a way it was part of my present for her. In addition to that, I’m honestly scared for the kittens that were adopted at the beginning of the lockdown and might now be left abandoned as the lockdown eases.

While adopting a kitten is an awesome thing for humans to do, it is also a commitment. I am afraid for those who might not be able to take this commitment fully to heart, and might find themselves abandoning their furry family member once travel resumes and they are no longer stuck at home for months on end.

I also think that Cats Protection, like most (though not all) software non-profit organizations, is a perfectly reasonable charity to receive disposable funds. Not to diminish the importance and effort of fundraisers and donations to bigger, important causes, but it does raise my eyebrow when I see that the NHS needs charitable contributions to be funded — that’s a task that I expect the government taking my tax money should be looking at!

Then there’s the motivation for me doing livestreams at all — it’s not like I’m a particularly entertaining host or that I have ever considered a career in entertainment. But 2020 was weird, particularly when changing employer, and it became significantly more important to be able to communicate across a microphone, a camera and a screen the type of information I would usually have communicated in a meeting room with a large whiteboard and a few colour markers. So I have started looking at ways to convey more information that doesn’t otherwise fit written form, because it’s either extemporaneous, or requires visual feedback.

When I decided to try the first livestream I actually used a real whiteboard, and then I tried this with Microsoft’s Whiteboard. I have also considered the idea of going for a more complex video production by recording a presentation, but I was actually hoping for a more interactive session with Q&A and comments. Unfortunately, it looks like only a few people ever appeared in the chatrooms, and most of the time they were people who I am already in contact with outside of the streams.

What I explicitly don’t care for, in these streams, is to become a “professional” streamer. This might have been different many years ago — after all, this very blog was for a long time my main claim to fame, and I have been doing a lot of work behind the scenes to make sure that it would give a positive impression to people, and it involved also quite a bit of investment not just in time but in money, too.

There’s a number of things that I know already I would be doing differently if I was trying to make FLOSS development streaming a bigger part of my image — starting with either setting up or hiring a multiplicator service that would stream the same content onto more than just Twitch. Some of those would definitely be easier to pull off nowadays with a full-time job (cash in hand helps), but they would be eating into my family life to a degree I’m no longer finding acceptable.

I will probably do more livestreams in the upcoming months. I think there’s a lot of space for me to grow when it comes to providing information in a live stream. But why would I want to? Well, the reason is similar to the reason why this blog still exists: I have a lot of things to say — not just in the matter of reminding myself how to do things I want to do, but also a trove of experience collected vastly by making mistakes and slamming head-first into walls repeatedly – and into rakes, many many rakes – which I enjoy sharing with the wider world.

Finally (and I know I said there are two motivations), there’s a subtlety: when working on something while streaming, I’m focusing on the task at hand. Since people are figuratively looking over my shoulder, I don’t get to check on chats (and Twitter, Facebook, NewsBlur), I don’t get to watch a YouTube video in the background and get distracted by something, and I don’t get to just look at shopping websites. Which means that I can get to complete some open source hacking, at least timeboxed for the stream.

Tangible Results

Looking back at what I proposed I’d be doing, and what I really ended up doing, I can’t say I’m particularly happy about the results. It took me significantly longer to do some things that I expected would take me no time whatsoever, and I didn’t end up doing any of the things I meant to be doing with my electronics project. But on the other hand, I did manage some results.

Besides the already noted £290 collected for Cats Protection (again, thank you all, and in particular Luke!), I fully completed the reverse engineering of the GlucoMen areo glucometer that I reviewed last week. I think about an hour of the stream was dedicated to me just poking around trying to figure out what checksum algorithm it used (answer: CRC-8-Maxim as used in 1wire) — together with the other streams and some offline work, I would say that it took about six hours to completely reverse engineer that meter into a usable glucometerutils driver, which is not a terrible result after all.
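For readers who have not met this checksum before, here is a minimal bitwise sketch of CRC-8/MAXIM, the variant used by Dallas/Maxim 1-Wire devices. It is not the code from the glucometerutils driver, and how the meter frames its messages (which bytes are covered, where the checksum sits) is not described in this post; the function name `crc8_maxim` is made up for illustration.

```python
def crc8_maxim(data: bytes) -> int:
    """Bitwise CRC-8/MAXIM: polynomial 0x31 in reflected form (0x8C), initial value 0x00."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Reflected algorithm: test the lowest bit and shift right.
            if crc & 0x01:
                crc = (crc >> 1) ^ 0x8C
            else:
                crc >>= 1
    return crc


# The standard check string for CRC catalogues is b"123456789".
print(hex(crc8_maxim(b"123456789")))  # 0xa1
```

A handy property when guessing at checksums: with this variant, running the CRC over a message followed by its own checksum byte always gives zero, which makes it quick to test captured packets against a candidate algorithm.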

What about unpaper? I faffed around a bit to get the last few bits of Meson working — and then I took on a fight with Travis CI which resulted in me just replacing the whole thing with GitHub Actions (and incidentally correcting the Meson docs). I think this is also a good result to a point, but I need to spend more time before I make a new release that uses non-deprecated ffmpeg APIs — or hope that one of my former project-mates feels for me and helps.

Tests are there, but they are less than optimal. And I only scratched the surface of what could be integrated into Meson. I think that if I sat down with the folks who know the internals in a chat I might be able to draw some ideas that could help not just me but others… but it turns out that involves me spending time in chat rooms, and it’s not something that can be focused on a specific time slot a week. I guess that is one use where mailing lists are still a good approach, although that’s no longer that common after all. GitHub issues, pull requests and projects might be a better approach, but the signal-to-noise ratio is too low in many cases, particularly when half the comments are either pile-on or “Hey can you get to work on this?”. I don’t have a good answer for this.

The Home Assistant stream ended up being a total mess. Okay, on one half of it I managed to sync (and subsequently get merged) the pull requests to support bound CGG1 sensors into ESPHome. But when I tried to set up the custom component to be released I realized that first, I have no idea how to make a Home Assistant custom component repository – there’s a few guidelines if you plan to get your component into HACS (but I wasn’t planning to), and the rest of the docs suggest you may want to submit it for inclusion (which I cannot do because it’s a web scraper!) – and second, the REUSE tool is broken on Windows, despite my best efforts last year to spread its usage.

The funny thing is that it appears to be broken because it started depending on python-debian, which mostly reasonably didn’t expect to have to support non-Unix systems, and thus imported the pwd module unconditionally. The problem is already fixed on their upstream repository, but there hasn’t been a release of the package in four months and so the problem is still there.
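To make the failure mode concrete: `pwd` is a Unix-only module in the Python standard library, so a bare `import pwd` at the top of a file is enough to make the whole package unimportable on Windows. The sketch below shows the general shape of a guarded import. It is not the actual python-debian fix (which is not shown here), and `current_username` with its environment-variable fallback is a hypothetical helper invented for this example.

```python
import os

try:
    import pwd  # Unix-only; this import raises ImportError on Windows
except ImportError:
    pwd = None


def current_username() -> str:
    """Return the current user's login name, falling back when pwd is unavailable."""
    if pwd is not None:
        try:
            return pwd.getpwuid(os.getuid()).pw_name
        except KeyError:
            pass  # no passwd entry for this uid, as can happen in some containers
    # Hypothetical fallback for systems without a usable pwd database.
    return os.environ.get("USERNAME") or os.environ.get("USER") or "unknown"
```

The point is only that the import failure is handled where it happens, so the rest of the module keeps working on platforms that never needed `pwd` in the first place.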

So I guess the only thing that worked well enough throughout this is that I can reverse engineer devices in public. And I’m not particularly good at explaining that, but I guess it’s something I can keep doing. Unfortunately it’s getting harder to find devices that are not either already well covered, or otherwise resistant to the type of passive reverse engineering I’m an expert in. If you happen to have some that you think might be a worthy puzzle, I’m all ears.

Production and Support

I have not paid too much attention to production. Except for the one thing: I got myself a decent microphone because I heard my voice in one of the previous streams and I cringed. Having worked too many years in real-time audio and video streaming, I’m peculiar about things like that.

Prices of decent microphones, often referred to as “podcasting” microphones when you look around, skyrocketed during the first lockdown and don’t appear to have come quite down yet. You can find what I call “AliExpress special” USB microphones that look like fancy studio mics on Amazon at affordable prices, but they pretty much only look the part, not being comparable in terms of specs — they might be just as tinny as your average webcam mic.

If you look at “good” known brands, you usually find them in two configurations: “ready to use” USB microphones, and XLR microphones — the latter being the choice of more “professional” environments, but not (usually) directly connected to a computer… but there’s definitely a wide market of USB capture cards and they are not that much more expensive when adding it all together. The best thing about the “discrete” setup (with an XLR microphone and a USB capture card/soundcard) is that you can replace them separately, or even combine more of them at a lower cost.

In my case, I already owned a Subzero SZ-MIX06USB mixer with USB connection. I bought it last year to be able to bring in the sound from the ~two~ three computers in my office (Gamestation, dayjob workstation, and NUC) into the same set of speakers, and it comes with two XLR inputs. So, yes, it turned out that XLR was a better choice for me then. The other nice thing about using a mixer here, is that I can control some of the levels on the analog side — because I have a personal dislike of too-low frequencies, so I have done a bit of tweaking of the capture to suit my own taste. I told you I’m weird when it comes to streaming.

Also let me be clear: unless you’re doing it (semi-)professionally, I would say that investing more than £60 would be a terrible idea. I got the microphone not only to use for the livestream, but also to take a few of the meetings (those that don’t go through the Portal), and I already had the mixer/capture card. And even then I was a bit annoyed by the general price situation.

It also would have helped immensely if I didn’t have an extremely squeaky chair. To be honest, now that I know it’s there, I find it unnerving. Unfortunately just adding WD40 from below didn’t help — most of the videos and suggestions I found on how to handle the squeaks of this model (it’s an Ikea Markus chair — it’s very common) require unscrewing most of the body to get to the “gearbox” under the seat. I guess that’s going to be one of the tasks I need to handle soon — and it’s probably worth it given that this chair already went through two moves!

So hardware aside, how’s the situation with the software? Unfortunately, feng is no longer useful for this. And as I was going through options last year I ended up going for Streamlabs OBS for the “It vastly works out of the box” option. Honestly, I should probably replace it with OBS Studio, since I’m not using any of Streamlabs’ features, and I might as well stick to the original source.

As I said above, I’m not planning to take on streaming as a professional image — if I did, I probably would have also invested in licensing some background music or an “opening theme”. And I probably would have set up the stream backgrounds differently — right now I’m just changing the background pictures based on what I shot myself.

Conclusions

It was a neat experiment — but I don’t think I’ll do this again, at least not in this form.

Among other things, I think that doing one hour of stream is sub-optimal — it takes so long to set up and remind people about the chat and donations, and by the time I finished providing context, I was already a quarter of the hour in. I think two to three hours is a better time — I would probably go for three hours with breaks (which would have been easier during the Pawsome Players events, since I could have used the provided videos to take breaks).

Overall, I think that for this to work it needs a bigger, wider audience. If I was in the same professional space I was ten years ago, with today’s situation, I would probably be having all kinds of Patreon subscriptions, with the blog being syndicated on Planet Gentoo, and me actually involved in a project… then I think it would have made perfect sense. But given it’s “2021 me” moving in “2021 world”… I doubt there are enough people out there who care about what goes through my mind.

Glucometer Review: GlucoMen Areo

Two weeks ago I reviewed a GlucoRx Q meter, and while I was doing that I ended up down a rabbithole that I did not expect. The GlucoRx Nexus and Q are both manufactured by the same company, TaiDoc Technology – a Taiwanese medical device manufacturer that manufactures and sells both personal medical devices such as glucometers and hospital equipment. They clearly manufacture “white label” meters, given that the Nexus (TD-4277) is available under a number of different brands and names — and in particular when looking at that I found it as the “GlucoMen Nexus”.

The reason why that caught my eye is that it’s marketed by the Italian pharmaceutical company Menarini — and I’m Italian so I knew the name. So when I added the information on the Q, I thought I would go and look on whether they also sold that under the GlucoMen brand — they didn’t, but I found another rathole to look at.

It turns out that the GlucoMen brand in the UK is managed by a subsidiary of Menarini called A. Menarini Diagnostics, and they sell not just your usual blood-based glucometers, but also a CGM system (though from what I can see it’s similar to the Dexcom I didn’t like). They also allowed me to order a free meter (the GlucoMen Areo that I’m going to review here), together with the free USB cable to use with it to download the data to a computer.

The fact that this meter required a separate cable hinted to me that it’s not just another TaiDoc rebrand — as I said in the previous post, TaiDoc is the first manufacturer that I find re-using their exact protocol across different devices, which suggested to me that any other modern model from them would also use the same, and since they are using a Silicon Labs 8051 platform with native USB, it sounded unlikely they would need a separate cable.

Indeed, when the meter arrived it was clear that it’s not a TaiDoc device — it’s very different and all the markings on it suggest that Menarini is manufacturing it themselves (though, also in Taiwan). And it’s… interesting.

The first thing that I noted is that the carrying pouch was significantly higher quality than I’m used to. No netting to hold stuff in, but instead a Velcro-held pouch, and enough space to hold their pricking pen with. And on the inside, in addition to the logo of the company, space to write (or attach a label of) name, phone number, address and email. This is the first time in all my years with diabetes that I see such a “posh” pouch. Which turned out not to be entirely too surprising once I noticed that the pouch has the logo of the Italian accessories designer/manufacturer Tucano.

Going instead to look at the meter, what came to my attention quickly is that this meter is the first non-Chinese meter I find that (allegedly) has a “touch free” ejection of the testing strips. The shape and construction of the device roughly reminds me of the basic Tamagotchi from my early teens — to the point that when I push the button to eject the strip I’m afraid I’m going to destroy the LCD panel in the front. Note that unlike the Chinese devices, which have a lever that physically pushes the strip out of the holder, in this meter the “Push” button only appears to “let go” of the strip, which you can tip into the trash, but does not physically dislodge it at all.

The cable sent to me is another of the common 2.5mm TRS serial adapters — this one using a Silicon Labs CP2104-compatible chip on board (probably in the mould of the USB plug). It plugs at the bottom of the device, in a fashion pretty much identical to the OneTouch Ultra2 devices. Not surprising given I think they were the most common in Italy back in the day.

In addition to the USB/serial connectivity, the device is meant to speak NFC to a phone. I have not figured out how that is supposed to work, to be honest. It seems to be meant mostly to integrate with their CGM solution, and since I don’t have one (and I’m not interested in testing it right now), I don’t expect I’ll figure that out any time soon. Also NFC snooping is not my cup of tea and I’ll gladly leave that to someone else who actually knows how to do that.

Aesthetics continue with the design of the testing strips, that are significantly larger than most other meters I have at hand right now (this is not always a bad thing — particularly for older people, larger strips are easier to use), with a very thin drop placement at the top, and a white front. I’m not trying to play the stereotype of “Italian company cares about style more than substance”, but I have seen enough glucometers by now to say that Menarini definitely had a designer go through this with an eye at fitting everything together.

In terms of usability, the device is pretty straightforward — the display is a standard LCD (so standard I’m sure I saw the same exact segments before), very bright and easily readable. The amount of blood needed in the strip is actually much smaller than you would expect, but also not as little as other meters I have used in the past. This became very apparent during the last of my reverse engineering streams, when I lost three (or four) strips to “Err3” (too little blood), and it reminded me of how many strips I used to lose to not having drawn enough blood when I started using a meter.

In terms of documentation and usability of the markers function there’s something to say there, too. The device supports at most one marker per reading, out of four: before meal, after meal, exercise and “check mark” — the check mark is documented in the manual as being pretty much a freeform option. The way you mark these is by pressing (not holding) the power button when you’re looking at the strip result — the manual says to hold the button until the marker flashes, but if you hold it for more than a split second it actually turns off the device, which is not what I expected.

In a strange turn of events, this is also the first meter I saw using a fish (and fish bones) to represent the before (and after) meal. Nearly everything else I have at hand uses an apple (and an apple core), since that’s fruit, and thus sugars. I don’t have an issue on the option per se, but I can imagine this does confuse people at times.

The software is… painfully complex. It seems designed more for medical professionals than home use, which probably explains why the cable is not included by default. It also supports most GlucoMen devices, though it appears to install a long list of drivers for USB-to-serial adapters, which suggests each cable comes with a different serial adapter.

I have actually fully reverse engineered the protocol, live on stream during my Cats Protection Pawsome Players week — you can see the live streams archived on YouTube, but I’ll also post (probably later on) a summary of my discovery. It’s fully supported now on glucometerutils though. The interesting part there is that I found how the original software has a bug: it can only set the time some of the times, literally, because of the way it provides the checksum to the device. My implementation doesn’t suffer from that bug.

Testing is Like Onions: They Have Layers, And They Make You Cry

As a Free Software developer, and one that has worked in a number of separate projects as well as totally different lines of work, I find myself having nuanced and varied opinions on a bunch of topics, which sometimes don’t quite fit into the “common knowledge” shared on videos, blog posts or even university courses.

One such opinion relates to testing software in general. I have written lots about it, and I have ranted about it more recently as I was investigating a crash in unpaper live on-stream. Testing is undoubtedly one of the most useful techniques for developers and software engineers to build “properly solid” software. It’s also a technique that, despite a lot of writing about it, I find is nearly impossible to properly teach without first hand experience.

I want to start this by saying that I don’t believe there is a universal truth about testing. I don’t think I know everything there is to know about testing, and I speak almost exclusively from experience — experience that I acquired in now over ten years working in different spaces within the same industry, in sometimes less than optimal ways, and that has convinced me at times that I held the Truth (with capital T), just to crush my expectations a few months later.

So the first thing that I want you all to know, if you intend on starting down the path of caring more about testing, is to be flexible. Unless your job is literally responsible for someone’s life (medical, safety, self-driving), testing is not a goal in and by itself. It rather is a means to an end: building something to be reliable. If you’re working on a corporate project, your employer is much less likely to care that your code is formally verifiable, and more likely to care that your software is as bug-free as possible so that they can reap the benefits of ongoing revenue without incurring maintenance costs.

An aside here: I have heard a few too many times people “joking” about the fact that proprietary, commercial software developers introduce bugs intentionally so that they can sell you an update. I don’t believe this is the case, not just because I worked for at least a couple of those, but most importantly because software that doesn’t include bugs generally makes them more money. It’s easier to sell new features (or a re-skinned UI) — or sometimes not even that, but just keep changing the name of the software.

In the Free Software world, testing and correctness are often praised, and since you don’t have to deal with product managers and products overall, it sounds like this shouldn’t be an issue — but the kernel of truth there is that there’s still a tradeoff to be had. If you take tests as a dogmatic “they need to be there and they need to be complete”, then you will eventually end up with a very well tested codebase that is too slow to change when the environment around it changes. Or maybe you’ll end up with maintainers that are too tired to deal with it at all. Or maybe you’ll self-select for developers who think that any problem caused by the software is actually a mistake in the way it’s used, since the tests wouldn’t lie. Again, this is not a certainty, but it’s a chance it can happen.

With this in mind, let me go down the route of explaining what I find important in testing overall.

Premise and preambles

I’m going to describe what I refer to as the layers of testing. Before I do that, I want you to understand the premise of layering tests. As I said above, my point of view is that testing is a technique to build safe, reliable systems. But, whether you consider it in salary (and thus hard cash) in businesses or time (thus “indirect” cash) in FLOSS projects, testing has a cost, and nobody really wants to build something safely in an expensive way, unless they’re doing it for fun or for the art.

Since performative software engineering is not my cup of tea, and my experience is almost exclusively in “industry” (rather than “academic”) setting, I’m going to ignore the case where you want to spend as much time as possible to do something for the sake of doing something, and instead expect that if you’re reading further, you’re interested in the underlying assumption that any technique that helps is meant to help you produce something “more cheaply” — that is the same premise as most Computer-Aided Software Engineering tools out there.

Some of the costs I’m about to talk about are priced in hard cash, others are a bit more vague — this is particularly the case at the two extremes of the scale: small amateur FLOSS projects rarely end up paying for tools or services (particularly when they are proprietary), so they don’t have a budget to worry about. In a similar fashion, when you’re working for a huge multinational corporation that literally designs its own servers, it’s unlikely that testing ends up having a visible monetary cost to the engineers. So I’ll try to explain, but you might find that the metrics I’m describing make no sense to you. If so, I apologize, and might try harder next time, feel free to let me know in a comment.

I’m adding another assumption here: testing is a technique that allows changes to be shipped safely. We want to ship faster, because time is money, and we want to do it while wasting as few resources as possible. These are going to be keywords I’m going to refer back to a few times, and I’m choosing them carefully — my current and former colleagues probably understand well how these fit together, but none of these are specific to an environment.

Changes might take a lot of different forms: it might be a change to the code of an application (patch, diff, changelist, …) that needs to be integrated (submitted, merged, landed, …), or it might be a new build of an application, with a new compiler, or new settings, or new dependencies, or it might be a change in the environment of the application. Because of this, shipping also takes a lot of different shapes: you may use it to refer to publishing your change to your own branch of a repository, to the main repository, to a source release, or directly to users.

Speed is also relative, because it depends on what the change is about and what we mean by shipping. If you’re talking about the time it takes you to publish your proposed change, you wouldn’t want to consider a couple of days as a valid answer — but if you’re talking about delivering a new firmware version to all of your users, you may accept even a week’s delay as long as it’s done safely. And the same goes for cost (since it’s sometimes the same as time): you wouldn’t consider hiring a QA person to test each patch you write for a week — but it makes more sense if you have a whole new version of a complex application.

Stages and Layers

Testing has layers, like onions and orcs, and in my experience these layers are a direct result of the number of different definitions we can attach to the same set of words. A rough way to look at it is to consider the (rough) stages that are involved in most complex software projects: someone makes a change to the source code, someone else reviews it, it gets integrated into the project’s source code, then a person that might be one of the two already involved decides to call for a new release cut, and they eventually deliver it to their users. At each of these stages, there’s testing involved, and it’s always slightly different, both in terms of what it does, and of which tradeoffs are considered acceptable.

I just want to publish my patch!

The first, innermost layer, I think of when it comes to testing is the testing involved in me being able to publish my change — sometimes also referred to as sending it for review. Code review is another useful technique if used well, but I would posit it’s only useful if it focuses on discussing approaches, techniques, design, and so on – rather than style and nitpicks – which also means I would want to be able to send changes for discussion early: the cost of rejecting a sub-optimal change, or at least requesting further edits to it, is proportional to the amount of time you need to spend to get the change out for review.

So what you want at this stage is fast, cheap tests that don’t require specific resources to be run. This is the place of type-checking tools, linters, and pure, limited unit tests: tests that take a specific input, and expect the output to be either always the same or within well-established parameters. This is also where my first stone in the shoe needs to drop.

The term “change-detector test” is not widely used in public discourse, but it was a handy shorthand in my previous bubble. It refers to tests written in a way that is so tightly coupled with the original function, that you cannot change the original function (even maintaining the API contract) without changing the test. These are an antipattern for most cases — there’s a few cases in which you _really_ want to make sure that if you change anything in the implementation, you go and change the test and explicitly state that you’re okay with changing the measured approach, such as if you mean to have a constant-time calculation.
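To make the difference concrete, here is a minimal Python sketch. Everything in it is made up for illustration: two hypothetical implementations of the same contract (normalise_v1 and normalise_v2), one check that only looks at behaviour, and one change-detector check that insists on how the result is computed.

```python
import re
from unittest import mock


def normalise_v1(raw: str) -> str:
    """Collapse runs of whitespace and title-case the result."""
    return " ".join(raw.split()).title()


def normalise_v2(raw: str) -> str:
    """Same contract as normalise_v1, implemented with a regular expression."""
    return re.sub(r"\s+", " ", raw.strip()).title()


def behaviour_check(impl) -> bool:
    """Checks the contract only: specific input, expected output."""
    return impl("  ada   lovelace ") == "Ada Lovelace"


def change_detector_check(impl) -> bool:
    """Checks *how* the result is computed: it demands a call to str.split()."""
    raw = mock.MagicMock(spec=str)
    raw.split.return_value = ["ada", "lovelace"]
    try:
        impl(raw)
        raw.split.assert_called_once_with()
    except Exception:
        # Any implementation that does not go through raw.split() ends up here.
        return False
    return True
```

Both implementations satisfy behaviour_check, but only normalise_v1 satisfies change_detector_check: swapping one for the other keeps the API contract intact and still forces a test change.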

There are also the all-mocks tests — I have seen these in Python for the most part, but they are not exclusive to it, since any language that has easy mocking and patching facilities can lead to this outcome — and for languages that lack those, overactive dependency injection can give similar results. These tests are set up in such a way that, no matter what the implementation of the interface under test is, it’s going to return you exactly what you set up in the mocks. They are, in my experience, a general waste of time, because they add nothing over not testing the function at all.
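As a sketch of what I mean, assuming a hypothetical Converter class (the names are invented for this example), an all-mocks check in Python can look like the following. The check replaces the very object it claims to test, so it passes even against an implementation that is plainly broken.

```python
from unittest import mock


class Converter:
    """Hypothetical class under test: converts amounts using fixed rates."""

    def __init__(self, rates):
        self._rates = rates

    def convert(self, amount, currency):
        return amount * self._rates[currency]


class BrokenConverter(Converter):
    """Deliberately wrong implementation, used to show what the checks catch."""

    def convert(self, amount, currency):
        return 0.0


def all_mocks_check(converter_cls) -> bool:
    # The "object under test" is itself a mock, so the assertion only
    # verifies the value that was configured on the line before it.
    converter = mock.create_autospec(converter_cls, instance=True)
    converter.convert.return_value = 20.0
    return converter.convert(10.0, "EUR") == 20.0


def behaviour_check(converter_cls) -> bool:
    # Exercises the real implementation with real (if tiny) input data.
    return converter_cls({"EUR": 2.0}).convert(10.0, "EUR") == 20.0
```

all_mocks_check returns True for both classes, which is exactly the problem: it contributes coverage numbers without providing any signal. behaviour_check is the one that tells the two apart.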

So why are people even writing these types of tests? Well, let me be a bit blasphemous here, and call out one of the reasons I have seen used to justify this setup: coverage metrics. Coverage metrics are a way to evaluate whether tests have been written that “cover” the whole of the program. The concept is designed so that you strive to exercise all of the conditional parts of your software during testing, so the goal is to have 100% of the source code “covered”.

Unfortunately, while the concept is a great idea, the execution is often dogmatic, with a straight ratio of expected coverage for every source file. The “incremental coverage” metric is a similar concept that suggests that you don’t want to ever reduce the coverage of tests. Again, a very useful metric to get an idea if the changes are unintentionally losing coverage, but not something that I would consider giving a strict order to.

This is not to mean that coverage metrics are not useful, or that it’s okay to not exercise parts of a program through the testing cycle — I just think that coverage metrics in the innermost layer are disingenuous and sometimes actively harmful, by introducing all-mocks and change-detector tests. I’ll get to where I think they are useful later.

Ideally, I would say that you don’t want this layer of tests to take more than a couple of minutes, with five being on the very high margin. Again, this falls back on the cost of asking for changes: if going back to make a “trivial” change would require another round of tests consuming half an hour, there’s an increased chance that the author would insist on making that change later, when they’ll be making some other change instead.

As I said earlier, there’s also matters of trade-offs. If the unit testing is such that it doesn’t require particular resources, and can run relatively quickly through some automated system, the cost to the author is reduced, so that a longer runtime is compensated by not having to remember to run the tests and report the results.

Looks Good To Me — Make sure it doesn’t break anything

There is a second layer of testing that fits on top of the first one, once the change is reviewed and approved, ready to be merged or landed. Since ideally your change does not have defects and you want to just make sure of it, you are going to be running this layer of testing once per change you want to apply.

In case of a number of related changes, it’s not uncommon to run this test once per “bundle” (stack, patchset, … terminology changes all the time), so that you only care that the whole stack works together, although I wouldn’t recommend it. Running one more layer of tests on top of the changes makes it easier to ensure they are independent enough that one of them can be reverted (rolled back, unlanded) safely (or at least a bit more safely).

This layer of tests is what is often called “integration” testing, although that term is still too ambiguous to me. At this layer, I would be caring to make sure that the module I’m changing still exposes an interface and a behaviour consistent with the expectation from the consumer modules, and still consumes data as provided by its upstream interfaces. Here I would avoid mocks unless strictly required, and rather prefer “fakes” — with the caveat that sometimes you want to use the same patching techniques as used with mocks, particularly if your interface is not well suited for dependency injection.

As long as these tests are made asynchronous and reliable, they can take much longer than the pre-review unit tests. I have experienced environments in which the testing after approval and before landing takes over half an hour, and it’s not that frustrating… as long as the tests don’t fail for reasons outside of your control. This usually comes down to being able to have confidence in sequencing solutions and the results of the tests: nothing is more frustrating than waiting for two hours to land a change just to be told “Sorry, someone else landed another change in the meantime that affects the same tests, you need to restart your run.”

Since the tests take longer, this layer has more leeway in what it can exercise. I personally would strictly consider network dependencies off-limits: as I said above you want to have confidence in the result, and you don’t want your change to fail to merge because someone was running an update on the network service you rely upon, dropping your session.

So instead, you look for fakes that can implement just enough of the interaction to provide you with signal while still being under your control. To make an example, consider an interface that takes some input, processes it and then serializes some data into a networked datastore: the first layer unit test would focus on making sure that the input processing is correct, and that the resulting structure contains the expected data given a certain input; this second layer of tests would instead ask to serialize the structure and write it to the datastore… except that instead of the real datastore dependency, you mock or inject a fake one.
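A minimal sketch of that example in Python follows. The Datastore interface, the FakeDatastore class, and the process and store_reading functions are all hypothetical names I’m using for illustration; a real service would have its own API, and ideally its own maintained fake.

```python
import json
from typing import Protocol


class Datastore(Protocol):
    """Hypothetical interface shared by the networked datastore and its fake."""

    def put(self, key: str, value: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class FakeDatastore:
    """In-memory fake: just enough behaviour to give signal, no network."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        if not isinstance(value, bytes):
            # Mimic the kind of error a real backend would raise.
            raise TypeError("values must be bytes")
        self._blobs[key] = value

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def process(raw: str) -> dict:
    """First-layer territory: pure input processing."""
    name, _, reading = raw.partition("=")
    return {"name": name.strip(), "reading": int(reading)}


def store_reading(raw: str, store: Datastore) -> None:
    """Second-layer territory: serialize the structure and write it out."""
    record = process(raw)
    store.put(record["name"], json.dumps(record).encode("utf-8"))
```

The unit test for process needs no datastore at all, while the second-layer test calls store_reading with a FakeDatastore and reads the record back. Error paths are exercised by feeding invalid input rather than by configuring a mock to raise.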

Depending on the project and the environment, this may be easier said than done, of course. In big enterprises it isn’t unexpected for a team providing a networked service to also maintain a fake implementation of it. Or at least maintain an abstraction that can be used both with the real distributed implementation, and with a local, minimal version. In the case of a datastore, it would depend on how it’s implemented in the first place: if it’s a distributed filesystem, its interface might just be suitable to use both with the network path and with a local temporary path; if it’s a SQL database, it might have an alternative interface using SQLite.

For FLOSS projects this is… not always an easy option. And this gets even worse when dealing with hardware. For my glucometerutils project, I wouldn’t be able to use fake meters — they are devices that I’m accessing, after all, without the blessing of their original manufacturer. On the other hand, if one of them was interested in having good support for their device they could provide a fake, software implementation of it, that the tool can send commands to and explore the results of.

This layer can then verify that your code is not just working, but it’s working with the established interfaces of its upstreams. And here is where I think coverage metrics are more useful. You no longer need to mock all the error conditions upstream is going to give you for invalid input — you can provide that invalid input and make sure that the error handling is actually covered in your tests.

Because the world is made of trade offs, there’s more trade offs to be made here. While it’s possible to run this layer of tests for a longer time than the inner layer, it’s still often not a good idea to run every possible affected test, particularly when working in a giant monorepo, and on core libraries. In these situations an often used trade off has most changes going through a subset of tests – declared as part of the component being changed – with the optional execution of every affected test. It relies on manually curated test selection, as well as a comprehensive dependency tracking, but I can attest that it scales significantly better than running every possibly affected test all the time.
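The selection logic can be sketched in a few lines of Python. This is a toy model under my own assumptions: the reverse dependency graph and the per-component test declarations are plain dictionaries, and the function names are invented; real build systems track this in their own metadata.

```python
from collections import deque


def affected_targets(changed: str, reverse_deps: dict) -> set:
    """Return the changed target plus everything that transitively depends on it."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependant in reverse_deps.get(current, ()):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen


def select_tests(changed: str, reverse_deps: dict, declared_tests: dict,
                 run_everything: bool = False) -> set:
    """Default: only the tests declared by the changed component.

    With run_everything=True, the tests of every affected target are included.
    """
    if not run_everything:
        return set(declared_tests.get(changed, ()))
    tests = set()
    for target in affected_targets(changed, reverse_deps):
        tests.update(declared_tests.get(target, ()))
    return tests
```

For a core library at the bottom of a monorepo, the default selection stays small no matter how many consumers exist, while the opt-in mode grows with the size of the reverse dependency closure.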

Did we all play well together?

One layer up, and this is what I call Integration Testing. In this layer, different components can (and should) be tested together. This usually means that instead of using fakes, you’re involving networked services, and… well, you may actually have flakes if you are not resilient to network issues.

Integration testing is not just about testing your application, but it’s also testing that the environment around it works along with it. This brings up an interesting set of problems when it comes to ownership. Who owns the testing? Well, in most FLOSS projects the answer is that the maintainers of a project own the testing of their project, and their project only. Most projects don’t really go out of their way to try and figure out if the changes to their main branch cause issues to their consumers, although a few, at least when they are aware that the changes may break downstream consumers, might give it a good thought.

In bigger organizations, this is where things become political, particularly when monorepos are involved — that’s because it’s not unreasonable for downstream users to always run their integration tests against the latest available version of the upstream service, which is more likely to bump into changes and bugs of the upstream service than the system under actual test (at least after the first generation of bugs and inconsistencies is flattened out).

As you probably noticed by now, going up the layers also means going up in cost and time. Running an integration test with actual backends is no exception to this. You also introduce a flakiness trade-off — you could have an integration test that is always completely independent between runs, but to do so you may need to wait for a full bring-up of a test environment at each run; or you could accept some level of flakes, and just reuse a single test environment setup. Again, this is a matter of trade-offs.

The main trade-off to be aware of is the frequency of certain types of mistakes over others. The fastest tests (which in Python I’d say should be type checking rather than “actual testing”) should be covering mainly the easy-to-make mistakes (e.g. bytes vs str), while the first layer of testing should cover the interfaces that are the easiest to get wrong. Each layer of tests takes more time and more resources than the one below, and so it should be run less often: you don’t want to run the full integration tests on drafts, but you also may not be able to afford running them on each submitted change. So maybe you batch changes to test, and reduce the scope of a failure to within a few dozen changes.

But what if it does fail, and you don’t know which one of the dozen broke it? Well, that’s something you need to get an answer for yourself. In my experience, what makes it easy at this point is not allowing further code changes to be landed until the culprit change is found, and only using revisions that did pass integration testing as valid “cutting points” for releases. And if your batch is small enough, it’s much faster to run a bisection search between the previous run and the current one.
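The bisection itself is simple enough to sketch in Python. The assumptions are mine: the previous run (no changes from the batch applied) passed, the current run (all of them applied) failed, a single change is responsible, and passes_with is a hypothetical callback that runs the integration tests with only the first n changes of the batch applied.

```python
from typing import Callable, Sequence


def find_culprit(changes: Sequence[str], passes_with: Callable[[int], bool]) -> str:
    """Return the first change in the batch that makes the tests fail.

    passes_with(n) must report whether the tests pass with the first n
    changes applied. Assumes passes_with(0) is True (the previous run) and
    passes_with(len(changes)) is False (the current run).
    """
    low, high = 0, len(changes)  # invariant: low passes, high fails
    while high - low > 1:
        mid = (low + high) // 2
        if passes_with(mid):
            low = mid
        else:
            high = mid
    return changes[high - 1]
```

With a batch of a dozen changes this needs at most four test runs rather than twelve, which is why keeping the batch small and freezing landings while searching pays off.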

If It Builds, Ship It!

At this point, you may think that testing is done: the code is submitted, it passed integration testing, and you’re ready to build a release. That may again consist of widely different actions: tag the repository, build a tarball of sources, build an executable binary, build a Docker image, …

But whatever comes here, there’s a phase that I will refer to as qualifying a release (or cut, or tag, or whatever else). And in a similar fashion to what I did in Gentoo, it’s not just a matter of making sure that it builds (although that’s part of it, and that by itself should be part of the integration tests): it also needs to be tested.

From my experience here, the biggest concern at this stage is making sure that the “release mode” of an application works just as well as the “test mode”. This is particularly the case with C and other similar languages in which optimizations can lead to significantly different code being executed than in non-optimized code; this is, after all, how I had to re-work unpaper tests. But it might also be that the environments used to build for integration testing and for the final releases are different, and because of that the results differ.

Again, this will take longer — although this time it’s likely that the balance of time spent would be on the build side rather than the execution time: optimizing a complex piece of software into a final released binary can be intensive. This is the reason why I would expect that test and release environments wouldn’t be quite the same, and the reason why you need a separate round of testing when you “cut” a release somehow.

Rollin’

That’s not the last round of “testing” that is involved in a full, end-to-end, testing view: when a release is cut, it needs to be deployed – rolled out, published, … – and that in general needs some verification. That’s because even though all of the tests might have passed perfectly fine, they never hit their actual place in a production environment.

This might sound biased towards distributed systems, such as cloud offerings and big organizations like my current and previous employers, but you have the same in a number of smaller environments too: you may have tested something in the staging environment as part of release testing, but are you absolutely certain that the databases running the production environment are not ever so slightly different? Maybe it’s a different user that typed in the schema creation queries, or maybe the hostname scheme between the two is such that there’s an unexpected character in the latter that crashes your application at startup.

This layer of testing is often referred to as healthchecks, but the term has some baggage so I wouldn’t stay too attached to it. In either case, while often these are not considered tests per se, but rather part of monitoring, I still consider them part of the testing layers. That is also because, if a system is sufficiently complex and critical, you may implement them exactly as part of testing, by feeding it a number of expected requests and observing the results.
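As a rough sketch of that last idea in Python, assuming the deployed system can be reached through some callable handler (a stand-in I’m inventing here for whatever client the real service would need), a probe-based healthcheck could look like this.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Probe:
    """One expected request and the response it should produce."""

    request: str
    expected: str


def run_healthcheck(handler: Callable[[str], str], probes: Iterable[Probe]) -> list:
    """Send every probe to the handler and return a list of failure descriptions."""
    failures = []
    for probe in probes:
        try:
            actual = handler(probe.request)
        except Exception as error:
            # A crash counts as a failed probe, not as a failed healthcheck run.
            failures.append(f"{probe.request}: raised {error!r}")
            continue
        if actual != probe.expected:
            failures.append(
                f"{probe.request}: expected {probe.expected!r}, got {actual!r}"
            )
    return failures
```

An empty list means the rollout looks healthy from the point of view of these probes; anything else is a reason to stop or roll back before more users hit the new release.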

Final Thoughts

Testing is a complicated matter, and I’m not promising I gave you any absolute truth that will change your life or your professional point of view. But I hope this idea of “layering” testing, and understanding that different interactions can only be tested at different layers, will give you something to go by.

Glucometer Notes: GlucoRx Q

This article is going to be spending some time to talk about the meter, the manufacturer, and my reverse engineering. The more “proper” review of the device will be at the end, look for Review as title.

So despite having had no progress in months with my reverse engineering efforts started last year, I have not been ignoring my pastime of acquiring and reverse engineering the protocols of glucometers. And since I felt a bit bored, I went onto AliExpress Amazon UK and searched for something new to look at. Unfortunately, as usual, Amazon is becoming more and more a front for drop-ship sellers from AliExpress and similar sources, so most of the results are Sinocare. Eventually, I found the GlucoRx Q, which looked interesting, given that I have already reverse engineered the GlucoRx Nexus.

Let’s start with a few words about the brand, because that’s one of those “not-so-secret secrets” that is always fun to remind people of: GlucoRx does not actually design, or manufacture, these devices or the strips they use. Instead, they “rebadge” white label glucometers. The Nexus meter I previously looked at was also marketed by the Italian company Menarini and the German (if I understand that correctly) Aktivmed, but was actually manufactured by TaiDoc, a Taiwanese company, as the TD-4277. I say this is not so secret because… it’s not a secret. The name of TaiDoc, and the original model number are printed on the label at the bottom of the device.

Now, some manufacturers doing this kind of white label rebadging don’t really have “loyalty” to a single manufacturer, so when I saw that the Q required different strips from the Nexus, I thought it would be a different manufacturer this time, which brought up my hopes that I would have a fun reverse engineering project on my hands, but those hopes were disappointed very quickly, as the device said it’s a TaiDoc TD-4235B.

A quick search on Wikidata turned out to be even more interesting than I expected, and showed that GlucoRx markets more of the TaiDoc devices too, including the Nexus Voice TD-4280. Interesting that the company does not have an entity at the time of writing, and that even the retracted article names TaiDoc twice, but GlucoRx 45 times. To make a comparison with computers, it’s like writing an article about a Surface Book laptop and talking about the CPU as if it was Microsoft’s.

Anyway, even though the manufacturer was the same, I was still hoping to have some fun reverse engineering it. That hope was also disappointed: it took me longer to set up Windows 10 in a virtual machine than it took me to make glucometerutils download the data from the meter. It looks like TaiDoc has a fairly stable protocol, which honestly surprised me, as a lot of the manufacturers appear to just try to make it harder to support their devices.

Indeed this meter also shows up with a CP2110-compatible HID endpoint, which meant I could reuse my already-written chatter script to extract the back-and-forth between the Windows app and the device, and confirm that it was pretty much the same as the Nexus. The only differences were the model number (which is still issued in little-endian BCD), and a couple of unknown bytes that weren’t as constant as I thought they were. I also updated the documentation.
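To illustrate what “little-endian BCD” means in practice, here is a small Python sketch. It assumes the model number travels as two bytes, least significant byte first, with each nibble holding one decimal digit; the function name and the exact layout are my assumptions for the example, not a description of the documented protocol.

```python
import struct


def decode_model(raw: bytes) -> str:
    """Decode a two-byte little-endian BCD model number into a TD-xxxx string."""
    # "<H" reads an unsigned 16-bit integer, least significant byte first.
    (value,) = struct.unpack("<H", raw)
    # In BCD every nibble is one decimal digit, so the hexadecimal
    # representation of the value is the decimal model number itself.
    digits = f"{value:04x}"
    if not digits.isdecimal():
        raise ValueError(f"not a valid BCD value: {raw.hex()}")
    return f"TD-{digits}"
```

Under these assumptions the bytes 0x35 0x42 decode to TD-4235 and 0x77 0x42 to TD-4277, so the same parsing code can serve both meters.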

Why did I say “CP2110-compatible” instead of just CP2110? Well, here’s the thing: the GlucoRx Q shows up on the kernel logs (and in Windows hardware notifications) as “Silicon Laboratories C8051F34x Development Board”. Sounds like someone forgot to flash in the magic strings, and that pretty much “broke the magic” of which platform these devices are based on. Again, not the biggest secret, but it’s always interesting.

As the name might already have given away, the Silicon Labs C8051F34x is an 8-bit microcontroller based on the 8051. Yes, the same architecture I used for Birch Books, and for which I complained about the lack of good FLOSS support (since there doesn’t seem to be any institutional money to improve). It appears that these MCUs don’t just include the 8051 core but also a whole suite of components that do make them very versatile for the use on glucometers, namely fast and precise Analog-to-Digital Converters (ADCs). It also appears to have an integrated USB-to-serial through the same HID protocol as the CP2110.

So, yeah, I’m considering doing one run of the Birch Books controller based on this particular MCU out of curiosity, because they come in a package that is still hand-solderable and include already USB support. But that’s a project for another time.

So putting the reverse engineering (or rather, the lack of need for it) aside, let’s take a quick look at the meter as a meter.

Review

There is not much to say about this meter, because it’s your average “cheap” meter that you can find on online stores and pharmacies. I’m still surprised that most people don’t just get a free meter from one of the big names (in Italy, Ireland, and UK they are usually free — the manufacturers make their money on the strips), but this is not uncommon.

The GlucoRx Q is a fairly comfortable meter — unlike the Nexus, it’s “pill-shaped”, reminding me a lot of the Contour Next One. Unlike the Contour, this meter is not backlit, which means it’s not usable in dark places, but it also has a significantly bigger display.

The size does not compare entirely favourably with the FreeStyle Libre, part of the reason for which is that it runs off a single AAA battery — which makes it easy to replace, but puts some constraints on the size. On the bright side, the compartment door is captive to the main body so you don’t risk losing it if you were to change the battery on a moving vehicle, for instance.

The fitting of the strips is nice and solid, but I have to say getting blood onto them was quite a bit harder than with other meters, including the already mentioned Sinocare. Unlike other meters, there’s no lever to eject the strip without touching it, which makes me wonder once again if it’s a cultural reason for most of the Chinese meters to have it.

As usual for these reviews, I can’t really give an opinion on the accuracy: despite having spent many years looking at glucometers, I haven’t figured out a general way to test these for accuracy. Most meters appear to have a calibration solution, but that’s not meant to be compatible between devices, so I have no way to compare them to each other.

I don’t see any particular reason for getting or avoiding this particular device, to be honest. It seems to just be working fine, but at the same time I get other meters for free, and the strips are covered by NHS for me and all the diabetics — if anyone has any other reason on why to prefer this meter, I’d love to hear about it.

Service Announcement: Pawsome Players Streaming Week

You may remember I have been irregularly streaming on Twitch some of my FLOSS work, which focused almost exclusively on unpaper over the past few months. Now, unrelated to this, Cats Protection, the British non-profit whose primary goal is to rehome cats in need, launched their Pawsome Players initiative, aiming at raising funds through streaming: video games primarily, but not just.

With this in mind, I decided to join the fun, and will be streaming for the whole week at least an hour every day, and work on more FLOSS projects. You can find the full schedule (as well as donate to the campaign) on Tiltify, and if you want to get reminded of a particular night, you can join the events on the blog’s Facebook page.

In addition to wrapping up the Meson conversion of Unpaper, I’m planning to do a little bit of work on quite a few more other projects:

  • I have a new glucometer I want to reverse engineer, and with that comes an opportunity to see my way of working through this type of work; I’m not as entertaining and deep as Hector, but if you have never looked over the shoulder of a “black box” reverse engineer, I think it might be interesting. The code I’ll be working on is likely usbmon-tools rather than glucometerutils, but there’s a chance that I’ll get so far ahead I’ll actually implement the code.
  • Speaking of reverse engineering, I have a few adapters I designed (and got printed) for my Saleae Logic Pro 16. I have not released the schematics for those yet, but I now have the work approvals to. I should make a repository for them and release them, I’ll do that on stream!
  • I want to make some design changes to my Birch Books, which I’ll discuss on stream. It’s going to be a bit more PCB “designing” (I use quotes here, because I’m far from a designer, and more of a “cobbler together”) which is likely going to be scary for those who do know what they are doing.

I’m also open to the idea of doing some live code-reviews — I did lots of those when working at Google, and while for those I had a lot of specific standards to appeal to, a number of suggestions are nearly universal, and I have done this before where I was pointed at some Python project and gave some general advice of what I see. I’d be willing to take a random project and see what I can notice, if the author is willing!

Also, bonus points for anyone who guesses where the name of the fundraising page comes from.

So I hope I’ll hear from you on the stream chat (over on Twitch), and that you’ll join me in supporting Cats Protection to help find every kitty a forever home (me and my wife would love to be one of those homes, but it’s not easy when renting in London), and reach the £1985 target.

Senior Engineering: Give Them Time To Find Their Own Stretch

This blog post is brought to you by a random dream I had in December.

As I reflect more on what it means for me to be considered a senior engineer, I find myself picking up answers from my own dreams as well. In a dream this past December, I felt myself being at the office (which, given the situation, is definitely a dream state) but also in particular I found that I was hosting a number of people from my old high school.

As it turns out it’s well possible that the reason why I dreamt of my old high school is because I was talking with my wife about how I would like there to be an organization that would reach out in schools such as that one. But that’s not what the dream was about. And it probably was helped along by having seen a Boston Legal episode about high schools the night before.

In the dream, I was being asked by one of the visitors what I would recommend them to do, so that more of their students would be successful, and work at big companies. Small aside: there were three of us from my high school, and my year, working in Google when I joined – as I understand one of them was in New York and then left, but I was never really in touch with him – the other guy was actually in Dublin the same as me for a while, and I knew him well enough that I visited him when he moved away.

What I answered was something along the lines of: just don’t fill them up with homework that takes up their time. They won’t need the homework. They need to find something they can run with. Give them time to find their own stretch.

This is likely part of my personal, subjective experience. I generally hated homework from the beginning, and ignored most of it most of the time, if I could get away with it (and sometimes even when I couldn’t). Except when the homework was something that I could build upon myself. So while on most of the days I would be coming home from school and jump on Ultima OnLine right after lunch, other times I would be toying with expanding on what my programming teacher gave me or, as I told already, with writing a bad 8086 emulator.

But it reminded me of the situation at work, as well — “stretch” being a common word used to refer to the work undertaken by tech employees outside of their comfort zone or level, to “reach” their next goal or level. This plays especially into big tech companies, where promotions are given to those who already act as the next level they’re meant to reach.

While thankfully I only experienced this particular problem first hand a few times, and navigated most of them away, this is not an uncommon situation. My previous employer uses quarterly OKR planning to set up the agenda for each quarter, and particularly in my organization, the resulting OKRs tended to over-promise, by overcommitting the whole team by design. That meant taking on enough high-priority tasks (P0 as they are usually referred to) to take the whole time of many engineers just to virtually keep the lights on.

Most of this is a management problem to solve, particularly when it came to setting expectations with partner teams about the availability of the engineers who could be working on the projects, but there’s a number of things that senior engineers should do, in my opinion, to prevent this from burning junior engineers.

When there’s too many “vertical” stakeholders, I still subscribe to the concept of diagonal contributions – work on something that brings both you, and many if not all of your verticals, in a better place, even when you could bring a single vertical to a much better place ignoring all the others – but for that to work out well, you need to make the relationship explicit, having buy-in from the supported teams.

The other important point is just to make sure to “manage upwards”, as I’ve been repeatedly told to do in the previous bubble, making sure that, if you have juniors in your team, they are given the time for training, improving, collaborating. That means being able to steer your manager away from overcommitting those who need more time to ramp up. It’s not a hypothetical — with my last manager at my previous role, I had to explicitly say a few times “No, I’m not going to delegate this to Junior Person, because they are already swamped, let them breathe.”

As I said previously, I think that it’s part and parcel of a senior engineer’s job to let others do the work, and not to be the one concentrating the effort. So how do you reconcile the two requirements? I’m not suggesting it’s easy, and it definitely varies situation by situation. In the particular case I was talking about, the team was extremely overcommitted, and while I could have left a task for Junior Person to work on for a quarter or maybe two, it would have become an urgent task afterwards, as it was blocking a migration from another team. Since it would have taken me a fraction of the time, I found it more important for Junior Person to find something that inspired them, rather than something that our manager decided to assign.

Aside: a lot can be written about the dynamics of organizations where managers get to say “I know they said the deadline is Q4, but nobody believes that, we’ll work on it in Q1 next year.” The technical repercussion of this type of thinking is that migrations drag along a lot longer than intended and planned, and supporting the interim state makes them even more expensive on both the drivers and receivers of the migration itself. Personally, when I was told that Q4 was a deadline, I always tried to work towards migrating as many of the simple cases as possible by Q3. That allowed me to identify cases that wouldn’t be simple, make requests to the migration drivers if needed, and leave Q4 available for the complex cases that needed a lot more work. If you don’t do that, you risk assuming that something is a simple case up until the deadline, and then find out it’s another quarter’s worth of project work to address it.

Of course, a lot of it comes down to incentives, as always. My understanding of the state of it before I left my previous company is that the engineers are not evaluated directly on their OKR completion, but managers are evaluated on their team’s. Making that clear and explicit was itself a significant life improvement I would say — particularly because this didn’t use to be the case, and I was not alone in having seen the completion rate as being a fantasy number that didn’t really do much — much has been written by others on OKR and planning, and I’m no expert at that, so I won’t go into details of how I think about this altogether.

Keeping this in mind, another factor in providing stretch to more junior engineers is to give them goals that are flexible enough for them to run with something they find interesting, leaving you as the senior person with the task of making sure the direction is the right one. To give an example, this means that instead of suggesting one specific language for implementing a tool, you should leave it up to the engineer working on it to choose one (within the language choices the team has accepted, clearly).

It also means not requiring specific solutions to problems, when the solution is not fully fleshed out already. Particularly when looking at problems that apply to heterogeneous systems, with different teams maintaining components, it’s easy to fall into the trap of expecting the solution to be additive – write a new component, define a new abstraction layer, introduce a compatibility library – while ignoring subtractive solutions: remove the feature that’s used once every two years, delete the old compatibility layer that is only used by one service that hasn’t been ported yet. Focusing on what the problem to be solved is, rather than how it needs to be solved, can give plenty of room to stretch.

At the very least this worked for me: one of the tasks I was given in my first team was to migrate between two different implementations of a common provider. While I did do that to “stop the fire”, I also followed up by implementing a new abstraction on top of it, so I could stop copy-pasting the same code in ten different applications. I maintained that abstraction until I left the company, and it went from something used within my team only, to the suggested solution to that class of problems.

The other part that I want to make sure people consider is that deciding that what you were assigned is not important, and that there are better things to do, is not a bad thing. Making it harder to point out “Hey, I was assigned (or picked up) this goal, but after talking with stakeholders directly there’s no need to work on it” is just going to lead to burnout.

To go back to the recurring example of my last team at the previous employer, I found myself having to justify wanting to drop one of the goals from my OKR selection – complicated by the fact that OKR completion was put on my performance plan – despite the fact that the area expert had already signed off on it. The reason? We were already a month into the quarter, and it seemed too late in the game to change our mind. Except the particular goal was not the highest priority, and the first month of the quarter was pretty much dedicated full time to one that was. Once I started poking at the details of this lower priority goal, I realized it would not gain anything for anyone – it was trading technical debt in one place for technical debt in another – while we had a more compelling problem on a different project altogether.

Again, all of this does not lead to a cookie-cutter approach — even senior engineers joining a new space might need to be told what to do so that they are not just wasting air for a while. But if you’re selecting tasks for others to work on, make sure that they get an opportunity to extend on it. Don’t give them a fully-detailed “just write the code for this very detailed specification” if you can avoid it, but rather tell them “solve this problem, here’s the tools, feel free to ask for details”.

And make sure you are around to be asked questions, and to coach, and to show what they may not know. Again from personal experience, if someone comes to you saying that they’d like advice on how to write design docs that pass arbitrarily complicated review processes, the answer “Just write more of them” is not useful. Particularly when the process involves frustrating, time-consuming synchronous reviews with people wanting you to know the answer to all their “What if?” questions without having heard them before.

In short, if you want your juniors to grow and take your place – and in my opinion, you should want that – then you should make sure not to manage their time to the detail. Let them get passionate about something they had to fix, and own seeing it fixed.

Flashing ESPHome on White Label Smart Plugs

You may remember that I bought a smart plug a few years ago, for our Christmas tree, and had been wondering what the heck do you use a smart plug for. Well, a few months ago I found out another use case for it that I had not thought about to that point: the subwoofer.

When I moved to Dublin, nearly eight years ago now, I bought a new TV, an AV receiver, and a set of speakers, and with the latter came a subwoofer. I don’t actually like subwoofers that much, or at least I didn’t, because the low-frequency sounds give (or gave) me headaches, and so while it does sound majestic to watch a good movie with the subwoofer on, I rarely do that.

On the other hand, it was still preferable to certain noise coming from the neighbours, particularly if we ended up not being in the same room as the device itself. Which meant we started using the subwoofer more — but unlike the AV receiver, that can be controlled with a network endpoint, the subwoofer has a physical switch to turn on and off… a perfect use case for setting this up with a smart plug!

Since I didn’t want to reuse the original plug for the Christmas tree (among other things, it’s stashed away with the Christmas tree), I ended up going online and ordering a new one — and I selected a TP-Link, that I thought would work just the same as my previous one… except it didn’t. Turns out TP-Link has two separate Smart Home brands, and while they share the same account management, they use different apps and have different integration support. The Christmas tree uses a Kasa-branded plug, while the one I ordered was a Tapo-branded plug — the latter does not work with Home Assistant, which was a bit annoying, while the former worked well, when I used it.

When TP-Link then also got in the news for having “fixed” a vulnerability that allowed Home Assistant to control the Kasa plugs on the local network, I got annoyed enough that I decided to do something about it. (Although it’s good to note that nowadays Home Assistant works fine with the “fixed” Kasa plugs, too.)

Once again I asked for help from Srdjan, who has more patience than me for looking up random IoT projects out there, and he suggested the way forward… which turned out a bit more tortuous than expected, so let me try to recount it here, showing shortcuts as much as I can at the time of writing.

The Hardware

Things start a bit iffy to begin with. I say that because the hardware I ended up getting myself was a Gosund SP111 from Amazon, after checking reviews that they were still shipping an older version of the hardware that can be converted with tuya-convert (I’ll get back to the software in a moment). There are plenty of warnings about the new models, but in a very similar fashion to the problem with CGG1 sensors, there’s no way to know beforehand what version of the firmware you’re going to find.

But in particular, the SP111 are not Gosund engineering’s own work. Tuya is a company that builds “white label” goods for many different brands, and that’s how these plugs were made. You can find them under a number of different brand names, but they all share the same firmware and, critically, the same pairing and update keys.

One of the best parts about these Tuya/Gosund plugs is that they are tiny, in the sense of taking very little space. They are less than half the size of my first Kasa-branded TP-Link smart plug, and even when compared with the newer Tapo-branded one they are quite a bit slimmer. This makes it easy to add them to places that are a bit on the constrained side.

The Firmware

I did not buy these devices to run with the original firmware. Unlike the CGG1 and pretty much everything else I’m running at home, this time I went straight for the “I want to run ESPHome on them.” The reason for that was, on one hand, that I’m annoyed and burnt by the two TP-Link devices, and on the other, that the price of Hue-compatible smart plugs, which would have been my default alternative, was annoyingly high.

Of the various things important to me, Home Assistant support has grown from a “meh, sure” to a “yeah I want that”, in part because I did manage to set up scripts for quite a few tools at home through it. One of my original concerns (wanting to still be able to control most of these features by voice) is taken care of by signing up for Nabu Casa, which provides integration for both Google Assistant and Alexa, so even with the plug running a custom, and maybe janky, firmware, getting it to work with the rest of the house would be a breeze.

There are other options for open source firmware for the SP111, as well as other smart plugs. Tasmota is one of the most commonly used for this, and there’s another called ESPurna. I have not spent time investigating them, because I already have one ESPHome device running, and it was easier to lower my cognitive load by using something I already kinda know. After all, the plan is to replace the Kasa plug once we take out the Christmas tree, which would remove not one but two integrations from Google and Amazon (and one from Home Assistant).

All of these options can be flashed the first time around with tuya-convert, even though the procedure ended up not being totally clear to me, although in part that was caused by my… somewhat difficult set up. This was actually part of the requirements. Most smart home devices appear to be built around ESP8266 or ESP32, which means you can, with enough convincing, flash them with ESPHome (I’m still convincing my acrylic lamp circuit board), but quite a few require you to have physical access to the serial port. I wanted something I could flash without having to take it apart.

The way this works, in very rough terms, is that the Tuya-based devices can be tricked into connecting to a local open network, and from there with the right protocol, they can be convinced to dump their firmware and to update it with an arbitrary firmware binary. The tuya-convert repository includes pretty much all the things you need to set this up, neatly packaged in a nearly user-friendly way. I say nearly, because as it turns out there’s so much that can go wrong with it, that the frustration is real.

The Process

Part 1: WiFi

First of all, you need a machine that has a WiFi adapter that supports Access Point mode (AP mode/hostapd mode, those are the keywords to look for it). This is very annoying to know for sure, because manufacturers tend to use the same model number across hardware revisions (that may entirely change the chipset) and countries — after all, Matthew ended up turning to Xbox One controller adapters! (And as it turns out, he says they should actually support that mode, with a limited range.)
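If you want a quick check on a card you already have: on Linux, the iw tool lists the interface modes a card reports, and AP needs to be among them. Here is a minimal sketch in Python, assuming iw is installed. The function names are mine, and the parsing only looks at the “Supported interface modes” block of the iw list output, so take it as a rough filter rather than a guarantee that hostapd will be happy.

```python
import subprocess


def supports_ap_mode(iw_list_output: str) -> bool:
    """Return True if a 'Supported interface modes' block lists plain AP."""
    in_modes = False
    for line in iw_list_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Supported interface modes"):
            in_modes = True
            continue
        if in_modes:
            if stripped.startswith("* "):
                # Entries look like "* managed", "* AP", "* AP/VLAN".
                if stripped[2:].strip() == "AP":
                    return True
            else:
                # The first non-bullet line ends the modes block.
                in_modes = False
    return False


def read_iw_list() -> str:
    """Run `iw list` and return its output (assumes iw is on the PATH)."""
    result = subprocess.run(
        ["iw", "list"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the machine you plan to use, supports_ap_mode(read_iw_list()) should come back True before you bother with anything else.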

The usual “go-to” for this is to use a laptop which also has an Ethernet port. Unfortunately I don’t have one, and in particular, I don’t have a Linux laptop anymore. I tried using a couple of Live Distros to set this up, but for one reason or another they were a complete bust (I couldn’t even type on Ubuntu’s Terminal, don’t even ask.)

In my case, I have a NUC on top of my desk, and that’s my usual Linux machine (the one I’m typing on right now!) so I could have used that… except I did disable the WiFi in the firmware when I got it, since I wired it with a cable, and I didn’t feel the need to keep it enabled. This appears to have been a blessing in disguise for a while, or I would have been frustrated at one of the openSUSE updates in December, when a kernel bug caused the system to become unusable with the WiFi on. Which is what happened when I turned the WiFi on in the firmware.

Since I don’t like waiting, and I thought it would generally be a good idea to have at least one spare USB WiFi dongle at home (it would have turned useful at least once before), I went and ordered one on Amazon that people suggested might work. Except it probably got a hardware revision in the middle, and the one I received wasn’t suitable — or, some of the reports say that it depends on the firmware loaded on it; I really didn’t care to debug that too much once I got to that stage.

Fast forward a few weeks, the kernel bug is fixed, so I tried again. The tuya-convert script uses Docker to set up everything, so it sounded like just installing Docker and docker-compose on my openSUSE installation would be enough, right? Well, no. Somehow the interaction of Docker and KVM virtual machines had side effects on the networking, and when I tried I both lost connectivity to Home Assistant (at least over IPv6), and the tuya-convert script kept terminating by itself without providing any useful information.

So instead, I decided to make my life easier and more difficult at the same time.

Part 1.1: WiFi In A Virtual Machine

I didn’t want Docker to make a mess of my networking setup. I also wasn’t quite sure of what tuya-convert would be doing on my machine (yes, it’s open source, but hey I don’t have time to audit all of it). So instead of trying to keep running this within my normal openSUSE install, I decided to run this in a virtual machine.

Of course I need WiFi in the VM, and as I said earlier, I couldn’t just pass through the USB dongle, because it wouldn’t work with hostapd. But modern computers support PCI pass-through, when IOMMU is enabled. My NUC’s WiFi supports hostapd, and it’s sitting unused, since I connect to the network over a cable.

The annoying part was that, for performance reasons, IOMMU is disabled by default, at least for Intel CPUs, so you have to add intel_iommu=on to the kernel command line for the option of passing through PCI devices to KVM virtual machines to be available. This thread has more information, but you probably don’t need all of it, as that focuses on passing through graphics cards, which is a much more complicated topic.
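After rebooting, it’s worth confirming that the parameter actually made it onto the running kernel’s command line, which Linux exposes in /proc/cmdline. A small sketch, with helper names of my own choosing; how you add the parameter in the first place depends on your bootloader, and that part is not covered here.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str = "intel_iommu=on") -> bool:
    """Return True if `param` appears as a whole word in the command line."""
    return param in cmdline.split()


def running_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the current kernel was booted with."""
    return Path(path).read_text()
```

Calling has_kernel_param(running_kernel_cmdline()) on the host tells you whether the reboot picked up the change. It only checks the parameter, not whether the firmware has the virtualization options enabled.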

The next problem was what operating system to run in the VM itself. At first I tried using the LiveDVD of openSUSE — but that didn’t work out: the Docker setup that tuya-convert uses is pretty much installing a good chunk of Ubuntu, and it runs out of memory quickly, and when it does it throws a lot of I/O errors from the btrfs loaded into memory. Oops.

Missing a VM image of openSUSE Tumbleweed, I decided to try their image for JeOS instead, which is a stripped down version meant to be used in virtualized environments. This seemed to fit like a glove at first, but then my simplicity plans got foiled by not realizing that usually virtualized environments don’t care for WiFi. Although the utter lack of NetworkManager or any other WiFi tooling turned out to be handy to make sure that nothing tried to steal the WiFi away from tuya-convert.

In addition to changing the kernel package with a version that cares about WiFi drivers, you need to install the right firmware package for the card. After that, at least the first part is nearly taken care of — you will most likely need a few more tools, such as Git and your editor of choice. And of course, Docker and docker-compose.

And then, do yourself a favour, and turn off firewalld entirely in your virtual machine. Maybe I should have said earlier “Don’t let the VM be published to the Internet at large”, but I hope that if you get to try passing through a WiFi device, you know better than to do that anyway. It is not obvious that the firewall is going to get in your way later, when you set out to run tuya-convert, and it’ll make the whole setup fail silently in the hardest way possible to debug.

Indeed, when I looked for my issues I found tons of posts, issues, blogs all complaining about the same symptom I had, which was all caused by having a firewall in place. The tuya-convert script does a lot of things to set up stuff, but it can’t easily take down a firewall, and that is a biggie.

Indeed, and I’ll repeat that later, at some point the instructions tell you to connect some other device to their network and suggest otherwise it might not be working. My impression is that this is done because if it doesn’t work, you shouldn’t try taking the next steps yet. But the problem is that there is no note anywhere to help you if it doesn’t work — and the reason for it failing is likely the firewall stopping the DHCP server from receiving the requests. Oops.
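Since the failure is silent, a pre-flight check is cheap insurance. This is a sketch that asks systemd whether the firewalld unit is active, assuming a systemd-based guest like the openSUSE one I used. The function names are hypothetical, and the query function is injectable so the logic can be exercised without systemctl around.

```python
import subprocess
from typing import Callable


def systemctl_is_active(unit: str) -> str:
    """Return the state printed by `systemctl is-active <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip()


def firewall_is_running(
    query: Callable[[str], str] = systemctl_is_active,
    unit: str = "firewalld",
) -> bool:
    """Return True if the firewall unit reports itself as active."""
    return query(unit) == "active"
```

If firewall_is_running() returns True inside the VM, stop there and take the firewall down before starting tuya-convert. Note that this only looks at firewalld; other firewall setups would need their own check.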

Part 2: The Firmware Blob

ESPHome configurations are… sometimes very personal. I have found one for the SP111 on the Home Assistant forums and adapted it, but… I don’t really feel like recommending that one. So I’m afraid I won’t take responsibility for how you configure your ESPHome firmware for the plug.

Also, once you have ESPHome on the device, changing the config is nearly trivial, from the Home Assistant integration, so I feel it’s important to have something working at first, and then worry about perfecting it.

I think someone will be confused here about why I am jumping to configuring the firmware blob before we get to convert the device to use it. The reason is that if you have the binary file (either built locally or generated with the Home Assistant integration and downloaded) and you put it into tuya-convert/files/, you will be able to directly flash that version, without going through the intermediate step of using one of the bundled firmwares just to be able to update to an arbitrary firmware. But to do that, it needs to happen before you complete the Docker setup.

So, find yourself a working config for the device on the forums (and if you find one that is maintained and templated, so that one can just drop it in and just configure the parameters, please let me know), and generate your ESPHome firmware from there.

Also note that the firmware itself identifies the specific device. This means you cannot flash more than one device with the same firmware or you’ll have quite the headache to sort them out afterwards. Not saying it isn’t possible, but I just found it easier to make the firmware for the devices I was going to flash, and then load each one. As usual, my favourite tool to remember what is what would be my label maker, so that I don’t mix up which one I flashed with which binary.

Part 3: The Conversion

Okay so here’s the inventory of what you should have by this point before we move on to the actual conversion:

  • a virtual machine with a passed-through WiFi card that is supported by hostapd;
  • an operating system in the VM with the drivers for the WiFi, Docker, docker-compose, and no firewall;
  • a checkout of tuya-convert repository;
  • one or more ESPHome firmware binary files (one per device you want to flash), in the files/ directory of the checkout.
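The last item is the easiest one to get wrong, so here is a sketch of a check for it. It assumes a naming convention of my own, one <device name>.bin per plug in the files/ directory; the device names in the demo are made up, and the demo runs against a temporary directory standing in for the real checkout.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_firmware(files_dir: Path, device_names: Iterable[str]) -> List[str]:
    """Return the device names that have no <name>.bin in files_dir."""
    return [
        name for name in device_names
        if not (files_dir / f"{name}.bin").is_file()
    ]


# Demo against a throwaway directory standing in for tuya-convert/files/.
with tempfile.TemporaryDirectory() as tmp:
    files_dir = Path(tmp)
    (files_dir / "plug_subwoofer.bin").write_bytes(b"\x00")
    demo_missing = missing_firmware(files_dir, ["plug_subwoofer", "plug_tree"])

print(demo_missing)  # ['plug_tree']
```

Pointing it at the real files/ directory with the names on your labels, before building the Docker image, should give you an empty list.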

Only at this point can you go and follow the instructions of tuya-convert: create the Docker image, set up docker-compose, and run the image. The firmware files need to be added before creating the Docker image, because docker-compose does not bind the external files/ directory at all.

Once the software starts, it’ll ask you to connect another device that is not the plug to the WiFi. I’m not entirely sure if it’s just for diagnostics, but in either case, you should be able to connect to the network — the password should be flashmeifyoucan, although I don’t think I’ve seen that documented anywhere except when googling around for other people having had issues with their process.

If you try this from your phone, you should be prompted to log in to the WiFi network through a captive portal; the portal is just a single page telling you that the setup is completed. If your phone gets stuck in the “Obtaining IP Address” phase, like mine did, make sure you really took down the firewall. This got me stuck for a while because I thought that Docker itself controlled the whole firewall settings, but that does not appear to be the case.

Final Thoughts

I guess that this guide is not going to be very useful, with the new versions of the SP111 not being supported by tuya-convert (and not clear if it can be supported), but since I have two plugs still unused, it helps me to have written down the process to avoid getting myself stuck again.

The plugs appear to have configurable sensors for voltage, amperage, and total wattage used, and the configuration of those is why I’m not comfortable sharing the config I’m using: I took someone’s from a forum post, but I don’t quite agree with some of the choices made, and some of the values appear fairly pointless to me.

Voltage monitoring would have been a useful piece of information when I was back in Italy: those who read the blog a long time ago might remember that the power company over there didn’t really have any decent power available. Over here it feels like it’s very stable, so I doubt we’ll notice anything useful with these.

Having Smart Home devices that don’t rely on cloud services is much more comfortable than otherwise. I do like the idea of being able to just ask one of the voice assistants to turn off the subwoofer while I’m playing Fallout 76, for sure — but it’s one thing to have the convenience, and another to depend on it to control it. And as I said some time ago, I disagree with the assertion that there cannot be a secure and safe IoT Smart Home (and yes, “secure” and “safe” are two separate concepts).

As for smart plugs in particular? I’m still not entirely sold, but I can see that there definitely are devices where trying to bring the smart in the device is unlikely to help. Not that many, though: it’s still a problem to find something that cannot be served better by more fine-grained control. In the case of the subwoofer, most of the controls (volume, cross-over, phase) are manual knobs on the back of the device. Would it have made sense to have a “smart subwoofer” that can tweak all of those values from the Home Assistant interface? I would argue yes, but at the same time, I can see that in this case an expense of £10 for a smart plug beats the idea of replacing the subwoofer entirely.

I honestly have doubts about the Christmas tree lights as well. Not that I expect to be able to control them with an app, but the “controller” for them seems to be fairly standard, so I do expect if I search AliExpress for some “smart” controller for those I will probably find something — the question is whether I would find something I can use locally without depending on an external cloud service from an unknown Chinese brand. So maybe I’ll go back to one of my oldest attempts at electronics (13 years ago!) and see what I can find.

By the way, if you’re curious what else I am currently planning to use these smart plugs on… I’m playing with the idea of changing my Birch Books to use 12V LEDs – originally meant for Gunpla and similar models – and I was thinking that instead of leaving it always-on, I can just connect it with the rest of the routines that we use to turn the “living” items on and off.

Kind Software

This post sprouts in part from a comment in my previous disclaim of support to FSFE, but it’s a standalone post, which is not related to my feelings towards FSFE (which I already covered elsewhere). It should also not be a surprise to long time followers, since I’m going to cover arguments that I have already covered, for better or worse, in the past.

I have not been very active as a Free Software developer in the past few years, for reasons I already spoke about, but that does not mean I stopped believing in the cause or turned away from it. At the same time, I have never been a fundamentalist, and so when people ask me about “Freedom 0”, I’m torn, as I don’t think I quite agree on what Freedom 0 consists of.

On the Free Software Foundation website, Freedom 0 is defined as

The freedom to run the program as you wish, for any purpose (freedom 0).

At the same time, a whole lot of fundamentalists seem to me to try their best to not allow the users to run the programs as they wish. We wouldn’t, otherwise, be having purity tests and crusades against closed-source components that users may want to actually use, and we wouldn’t have absurdist solutions for firmware that involve shoving binary blobs under the carpet, and just not letting the user ever update them.

The way in which I disagree with both formulation and interpretation of this statement, is that I think that software should, first of all, be usable for its intended purpose, and that software that isn’t… isn’t really worth discussing.

In the case of Free Software, I think that, before any licensing and usage concern, we should be concerned about providing value to the users. As I said, not a novel idea for me. This means that software that is built with the sole idea of showing Free Software supremacy is not useful software for me to focus on. Operating systems, smart home solutions, hardware, … all of these fields need users to have long-term support, and those users will not be developers, or even contributors!

So with this in mind, I want to take a page out of the literal Susan Calman book, and talk about Kind Software, as an extension of Free Software. Kind Software is software that is meant for the user to use and to keep the user as its first priority. I know that a number of people would make this to be a perfect overlap and contrast, considering all Free Software as Kind Software, and all proprietary software as not Kind Software… but the truth is that it is significantly more nuanced than that.

Even keeping aside the amount of Free Software that is “dual-use” and that can be used by attackers just as much as defenders – and that might sometimes have a bit too much of a bias towards the attackers – you don’t need to look much further than the old joke about how “Unix is user friendly, it’s just very selective of who its friends are”. Kind software wouldn’t be selective — the user use-cases are paramount, any software that would be saying “You don’t do that with {software}, because it is against my philosophy” would by my definition not be Kind Software.

Although, obviously, this brings us back to the paradox of tolerance, which is why I don’t think I’d be able to lead a Kind Software movement, and why I don’t think that the solution to any of this has to do with licenses, or codes of ethics. After all, different people have different ideas of what is ethical and what isn’t, and sometimes you need to make a choice by yourself, without fighting an uphill battle so that everyone who doesn’t agree with you is labelled an enemy. (Though, if you think that nazis are okay people, you’re definitely not a friend of mine.)

What this tells me is that I can define my own rules for what I consider “Kind Software”, but I doubt I can define them for the general case. And in my case, I have a mixture of Free Software and proprietary software in the list, because I would always select the tools that first get their job done, and second are flexible enough for people to adapt. Free Software makes the latter much easier, but too often it fails at the former, and the value of a software that can be easily modified, but doesn’t do what I need is… none.

There is more than that of course. I have ranted before about the ethical concerns with selling routers, and I’ve actually been vocal as a supporter of laws requiring businesses to have their network equipment set up by a professional, although with a matching relaxation of the requirements to be considered a professional. So while I am a strong believer in the importance of OpenWRT, I do think that trying to suggest it as a solution for general final users is unkind, at least for the moment.

On the other side of the room, Home Assistant to me looks like a great project, and a kind one too. The way they handled the recent security issues (in January, which pretty much just happened as I’m writing this) is definitely part of it: they warned users wherever they could, and made sure to introduce safeties so that further bugs in components that they don’t even support wouldn’t introduce this very same problem again. And most importantly, they are not there to tell you how to use your gadgets, they are there to integrate with whatever it is possible to integrate with.

This is, by the way, the main part of the reason why I don’t like self-hosting solutions, and why I would categorically consider software needing to be self-hosted as unkind: it puts the burden of it not being abused on the users themselves, and unless their job is literally to look after hosted services, it’s unlikely that they will be doing a good job — and that’s without discussing the fact that they’d likely be using time that they meant to be spending on something else just to keep the system running.

And speaking of proprietary, yet kind, software: I have already spoken about Abbott’s LibreLink and the fact that my diabetes team at the hospital is able to observe my glucose levels remotely, in pretty much real-time. This is obviously a proprietary solution, and not a bug-free one at that, and I’m also upset they locked it in, but it is also a kind one. The various tools that don’t seem to care about the expiration dates, that think that they can provide a good answer without knowing the full extent of the algorithm involved, and that insist it’s okay to not wait for the science… well, they don’t sound kind to me: they do not just allow access to personal data, which would be okay, but they present data that might not be right for people to take clinical decisions and… yeah that’s just scary to me.

Again, that’s a personal view on this. I know that some people are happy to try open-source medical device designs on themselves, or be part of multi-year studies for those. But I don’t think it’s kind to expect others to do the same.

Unfortunately, I don’t really have a good call to action here, except to tell Free Software developers to remember to be kind as well. And to think of the implications of the software they write. Sometimes, just because we’re able to throw something out there, doesn’t mean it’s the kind thing to do.