Oh Gosh, Trying to Find a New Email Provider in 2020

In the year 2020, I decided to move away from my GSuite account (née Google Apps for Business), which allowed me to use Gmail for my personal domain, and which I have used for the past ten years or so. It’s not that I have a problem with Gmail (I’ve worked at Google for nearly seven years now, why would it be a problem?) or that I think the service is not up to scratch (as this experience is proving to me, I’d argue it’s still the best service you can rely upon for small and medium businesses — which is the area I focused on when I ran my own company). It’s just that I’m not a business, and the features that GSuite provides over the free Gmail no longer make up for the services I’m missing.

But I still wanted to be able to use my own domain for my mail, rather than going back to the standard Gmail domain. So I decided to look around, and migrate my mail to another paid, reliable, and possibly better solution. Alas, the results after a week of looking and playing around are not particularly impressive to me.

First of all I discarded, without even looking at it, the option of self-hosting my mail. I don’t have the time, nor the experience, nor the will to deal with my own email hosting. It’s a minefield of issues and risks, and I don’t intend to accept them. So if you’re about to suggest this, feel free to not comment. I’m not going to entertain those suggestions anyway.

I ended up looking at what people have been suggesting on Twitter a few times and evaluated two options: ProtonMail and FastMail. I ended up finding both lacking. And I think I’m a bit more upset with the former than the latter, for reasons I’ll get to in this (much longer than usual) blog post.

My requirements for a replacement solution were a reliable webmail interface with desktop notifications, a working Android app, and security at login. I was not particularly interested in ProtonMail’s encrypt-and-sign-everything approach, but I could live with it. What I wanted was something that wouldn’t risk letting everyone in with just a password, so 2FA was a must for me. I was also hoping to find something that would make it easy to deal with git send-email, but I accepted right away that nothing would be anywhere close to the solution we found with Gmail and GSuite (more on that later).

Bad 2FA Options For All

So I started by looking at the second-factor authentication options for the two providers. Google being the earliest adopter of the U2F standard means, of course, that this is what I’ve been using, and what I’d love to keep using once I replace it. But of the two providers I was considering, only FastMail explicitly stated that it supports U2F. I was told that ProtonMail expects to add support for it this year, but I couldn’t even tell that from their website.

So I tried FastMail first, which has a 30-day free trial. To set up the U2F device, you need to provide a phone number as a recovery option — which gets used for SMS OTP. I don’t like SMS OTP because it’s not really secure (in some countries taking over a phone number is easier than taking over an email address), and because it’s not reliable the moment you don’t have mobile network service. It’s easy to mistake “no access to mobile network” for “no access to Internet” and say that it doesn’t really matter, but there are plenty of places where I would be able to reach the Internet and not receive SMS: planes, tube platforms, the office when I arrived in London, …

But surely U2F is enough, so why am I even bothering to complain about SMS OTP, given that you can disable it once the U2F security key is added? Well, it turns out that when I tried to log in on the Android app, I was just sent an SMS with an OTP to log myself in. Indeed, after I removed the phone number backup option, the Android app threw me a lovely error of «U2F is your only two-step verification method, but this is not supported here.» On Android, which can act as a U2F token.

As I found out afterwards, you can add a TOTP app as well, which solves the issue of logging in on Android without mobile network service, but by that point I had already started looking at ProtonMail, because it was not the best first impression to start with.

ProtonMail and the Bridge of Destiny

ProtonMail does not provide standard IMAP/SMTP access, because of encryption (that’s the best reason I could get from the documentation; I’m not at all sure what it was about, but honestly, that’s as far as I care to look into it). If you want to use a “normal” mail agent like Thunderbird, you need to use a piece of software, accessible to paying customers only, that acts as a “bridge”. As far as I can tell after using it, it appears to be mostly a way to handle the authentication rather than the encryption per se. Indeed, you log into the Bridge software with username, password and OTP, and then it provides localhost-only endpoints for IMAP4 and SMTP, with a generated local password. Neat.

Except it’s only available in beta for Linux, so I ended up running it on Windows at first.

This is an interesting approach. Gmail implemented, many years ago, an extension to IMAP (and SMTP) that allows using OAuth 2 for logins. This effectively delegates the login action to a browser, rather than executing it inline in the protocol, and as such it makes it possible to request OTPs, or even use U2F. Thunderbird on Windows works very well with this and even supports U2F out of the box.
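For context, Gmail’s implementation of this is the SASL XOAUTH2 mechanism, and the client side of it is tiny — the hard part is the provider actually accepting it. Here’s a rough sketch of what a login looks like from Python’s imaplib, assuming you have already obtained an OAuth2 access token through the browser flow (the token-acquisition part is what actually needs the provider’s and client’s support):

import imaplib

def oauth2_imap_login(user: str, access_token: str) -> imaplib.IMAP4_SSL:
    # SASL XOAUTH2 initial client response:
    #   "user=" {user} "\x01auth=Bearer " {token} "\x01\x01"
    auth_string = f"user={user}\x01auth=Bearer {access_token}\x01\x01"
    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.authenticate("XOAUTH2", lambda _: auth_string.encode())
    return imap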

Sidenote: Thunderbird seems to have something silly going on. When you add a new account to it, it has a drop-down box to let you select the authentication method (or go for “Autodetect”). Unfortunately, the drop-down does not have the OAuth2 option at all. Even if you select imap.gmail.com as the server — I know hardcoding is bad, but not allowing it at all sounds worse. But if you cheat and give it 12345 as password, and select password authentication just to go through with adding the account, then you can select OAuth 2 as authentication type and it all works out.

Anyway, neither ProtonMail nor FastMail appear to have implemented this authentication method, despite the fact that, if I understood correctly, it’s supported out of the box by Thunderbird, Apple’s Mail, and a bunch of other mail clients. Indeed, if you want to use IMAP/SMTP with FastMail, they only appear to give you the option of application-specific passwords, which is a shame.

So why did I need IMAP access to begin with? Well, I wanted to import all my mail from Gmail into ProtonMail, and I thought the easiest way to do so would be to go through Thunderbird and manually copy the folders I needed. That turned out to be a mistake: Thunderbird crashed while trying to copy some of the content over, and I effectively spent more time waiting for it to index anything than instructing it on what to do.

Luckily there are alternative options for this.

Important Importing Tooling

ProtonMail provides another piece of software, in addition to the Bridge, to paying customers: an Import Tool. This allows you to log in to another IMAP server and copy over the content. I decided to use that to copy my Gmail content to ProtonMail.

First of all, the tool does not support OAuth2 authentication. To be able to access Gmail or GSuite mailboxes, it needs to use an Application-Specific Password. Annoying but not a dealbreaker for me, since I’m not enrolled in the Advanced Protection Program, which among other things disables “less-secure apps” (i.e. those apps using Application-Specific Passwords). I generated one, logged in, and selected the labels I wanted to copy over, then went to bed, a little (but not much) concerned about the 52-and-counting messages that it said it was failing to import.

I woke up to the tool reporting only 32% of around fifty thousand messages imported. I paused, then resumed, the import, hoping to get it unstuck, and left to play Pokémon with my wife, coming back to a computer stuck at exactly the same point. I tried stopping and closing the Import Tool, but that didn’t work either: it hung. I tried rebooting Windows, and it refused to, because my C: drive was full. Huh?

When I went to look into it, I found a 436GB text file: the log from the software. Since the file was too big to open with nearly anything on my computer, I used good old type, and besides the initial part, which possibly contained useful information, most of the file repeated the same error message about not being able to parse a MIME type, with no message ID or subject attached. Not useful. I had to delete the file, since my system was rejecting writes because the drive was full, but it also does not bode well for the way the importer is written: clearly there’s no retry limit on some action, no log coalescing, and no safety check to go “Hang on, am I DoSing the local system?”

I went looking for tools I could use to sync IMAP servers manually. I found isync/mbsync, which, as a slight annoyance, is written in C and needs to be built, so it’s not easy to run on Windows, where I do have the ProtonMail Bridge — but not something I couldn’t overcome. When I was looking at the website, it said to check the README for workarounds needed with certain servers. Unfortunately, at the time of writing, the Compatibility section of that document refers to “M$ Exchange” — which in 2020 is a very silly, juvenile, and annoying way to refer to what is possibly still the largest enterprise mail server out there. Yes, I am judging a project by its README the way you judge a book by its cover, but I would expect that a project unable to call Microsoft by its name in this day and age is unlikely to have added support for OAuth2 authentication or any of the many extensions that Gmail provides for efficient listing of messages.
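For reference, the core of what I actually needed is small enough to sketch with Python’s imaplib — this ignores flags, dates, deduplication, and error handling, and the hostnames, port, and passwords are placeholders (the ProtonMail side would have to go through the Bridge’s localhost endpoint):

import imaplib

def copy_folder(src: imaplib.IMAP4, dst: imaplib.IMAP4,
                src_folder: str, dst_folder: str) -> None:
    # Copy every message in src_folder on one server into dst_folder on another.
    src.select(src_folder, readonly=True)
    _, data = src.search(None, "ALL")
    for num in data[0].split():
        _, msg_data = src.fetch(num, "(RFC822)")
        raw_message = msg_data[0][1]
        dst.append(dst_folder, None, None, raw_message)

src = imaplib.IMAP4_SSL("imap.gmail.com")
src.login("me@example.com", "application-specific-password")
dst = imaplib.IMAP4("127.0.0.1", 1143)  # assumed local Bridge IMAP endpoint
dst.login("me@example.com", "bridge-generated-password")
copy_folder(src, dst, "INBOX", "INBOX")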

I turned to FastMail to see how they implement this: importing Gmail or GSuite content is done directly on their servers. They require you to provide OAuth2 access to all your email (but then again, if you’re planning to use them as your service provider, you kind of are doing that already). It does not allow you to choose which labels you want to import: it’ll clone everything, even your trash/bin folder — around 180k messages in my case. It took a while, and it showed the funny result of saying «175,784 of 172,368 messages imported.» Bonus points to FastMail for actually sending the completion notice as an email, so that it gets fetched accordingly.

A side effect of FastMail doing the imports server side is that there’s no way for you to transfer ProtonMail boxes to FastMail, or any other equivalent server with server-side import: the Bridge needs to run on your local system for you to authenticate. It’s effectively an additional lock-in.

Instead of insisting on self-hosting options, I honestly feel that the FLOSS activists should maybe invest a little more thought and time on providing ways for average users with average needs to migrate their content, avoiding the lock-in. Because even if the perfect self-hosting email solution is out there, right now trying to migrate to it would be an absolute nightmare and nobody will bother, preferring to stick to their perfectly-working locked-in cloud provider.

Missing Features Mayhem

At that point I was a bit annoyed, but I had no urgency to move the old email away, for now at least. So instead I went on to check how ProtonMail worked as my primary mail interface. I changed the MX records, set up the various verification methods, and waited. One of the nice things about migrating mail providers is that you end up realizing just how many mailing lists and other subscriptions you keep receiving that you previously just filed away with filters.

I removed a bunch of subscriptions to open source mailing lists for projects I am no longer directly involved in, and unlikely to go back to, and then I started looking at other newsletters and promotions. For at least one of them, I thought I would probably be better served by NewsBlur‘s newsletter-to-RSS interface. As documented in the service itself, the recommended way to use this is to create a filter that takes the incoming newsletter and forwards it to your NewsBlur alias.

And here’s the first ProtonMail feature I found missing: there’s no way to set up forwarding filters. This is more than a bit annoying: there was mail coming to my address that I used to forward to my mother (mostly bills related to her house, before I set up a separate domain with multiple aliases that point at our two addresses), and there still are a few messages that come only to me, which I forward to my wife, where using our other alias addresses is not feasible for various reasons.

But it’s not just forwarding that is missing. When I looked into ProtonMail’s filter system I found it very lacking. You can’t filter based on an arbitrary header. You cannot filter based on a List-Id! Despite the webmail being able to tell that an email came through a mailing list, and providing an explicit Unsubscribe button based on the headers, it neither has a “Filter messages like these” option like Gmail’s, nor a way to select the header manually. And that is a lot more annoying.

FastMail, by comparison, provides much more detailed rules support, including the ability to write rules directly in the Sieve language, and it allows forward-and-delete of email as well, which is exactly what the NewsBlur integration needs (although, to note, while you can see the interface for doing that, trial accounts can’t set up forwarding rules!). And yes, the “Add Rule from Message” flow defaults to the list identifier of the message. Also, to one-up even Gmail on this, you can set those rules from the mobile app as well — and if you think this is not that big of a deal, just think of how much more likely you are to have spare time for this kind of boring task while waiting for your train (if you commute by train, that is).

In terms of features, it seems like FastMail has the clear upper hand. Even ignoring the calendar it provides, it supports the modern “Snooze” concept, letting mail show up later in the day or the week (which is great when, say, you don’t want the unread email about your job interviews showing up in your inbox at the office), and it even has the ability to permanently delete messages in certain folders after a certain number of days — just like gmaillabelpurge! I think this last feature is the one that made me realize I really just need to use FastMail.

Sending It All Out

As I said earlier, even before trying to decide which one of the two providers to try, I gave up on the idea of being able to use either of them with git send-email to send kernel patches and similar. Neither of them supports OAuth2 authentication, and I was told there’s no way to set up a “send-only” environment.

My solution to this was to bite the bullet and deal with a real(ish) sendmail implementation again, by using a script that would connect over SSH to one of my servers, and use the postfix instance there (noting that I’m trying to cut down on having to run my own servers). I briefly considered using my HTPC for that, but then I realized that it would require me to put my home IP addresses in the SPF records for my domain, and I didn’t really want to publicise those as much.

But it turned out the information I found was incorrect. FastMail does support SMTP-only Application-Specific Passwords! This is an awesomely secure feature that not even Gmail has right now, and it makes it a breeze to configure Git for it: the worst that can happen if the password leaks is that someone can spoof your email address until you figure it out. That does not mean it’s safe to share that password around, but it does make it much less risky to keep the password on, say, your laptop.
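Before wiring the password into git’s sendemail settings, I find it useful to sanity-check a credential like this with a few lines of Python. The hostname and port below are assumptions based on FastMail’s documented client settings, and the addresses are placeholders:

import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "me@example.com"
msg["To"] = "test-recipient@example.org"
msg["Subject"] = "SMTP-only app password test"
msg.set_content("If this arrives, the send-only credential works.")

# SMTP-only application-specific password: it can send mail, but it cannot
# read the mailbox, which is what limits the damage if it ever leaks.
with smtplib.SMTP_SSL("smtp.fastmail.com", 465) as smtp:
    smtp.login("me@example.com", "app-specific-password")
    smtp.send_message(msg)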

I would even venture that this is safer than the sendgmail approach I linked above, as that one requires a token with full mail access, which can easily be abused by an attacker.

Conclusion

So at the end of this whole odyssey, I decided to stick with FastMail.

ProtonMail sounds good on paper, but it gives me the impression that it’s overengineered in implementation, and not thought out enough in feature design. I cannot otherwise see how so many basic features (forwarding filters, send-only protocol support, C-S-c to add a CC line) would be missing. And I’m very surprised about the security angle of the whole service.

FastMail does have some rough edges, particularly in their webapp. Small things, like being able to right-click to get a context menu, would be nice. U2F support is clearly lacking: having it work in their Android app would be a huge step forward for me. And I should point out that FastMail has a much friendlier way to test its service, as the 30-day free trial includes nearly all of the functionality and enough space to test an import of the data from a ten-year-old Gmail account.

FreeStyle Libre 2: Notes From The Deep Dive

As I wrote last week, I’ve started playing with Ghidra to dive into the FreeStyle Libre 2 software, to try and figure out how to speak the encrypted protocol, which is what stands in the way of accessing the Libre 2 device the way we already access the Libre 1.

I’m not an expert when it comes to binary reverse engineering — most of the work I’ve done around reverse engineering has been on protocols that are not otherwise encrypted. But as I said in the previous post, the binary still included a lot of debug logs. In particular, the logs included the name of the class, and the name of the method, which made it fairly easy to track down quite a bit of information on how the software works, as well as the way the protocols work.

I also got lucky and found a second implementation of their software protocol — at least a partial one. You see, there are two pieces of software that can communicate with the old Libre system: the desktop software, which appears to be available in Germany, Australia, and a few other countries, and the “driver” for LibreView, a service that allows GPs, consultants, and hospitals to remotely access the blood sugar readings of their patients. (I should write about it later.) While the main app is a single, mostly statically linked Qt graphical app, the “driver” is composed of a number of DLL modules, which makes it much easier to read.

Unfortunately it does not appear to support the Libre 2 and its encryption, but it does help in figuring out other details around the rest of the transport protocol, since it’s much better logged and provides a clearer view of the function structure — it seems like the two packages actually come from the same codebase, as a number of classes share the same name between the two implementations.

The interesting part is trying to figure out what the various codenames mean. I found the names Orpheus and Apollo in the desktop app, and I assumed the former was the Libre and the latter the Libre 2, because the encryption is implemented only on the Apollo branch of the hierarchy, in particular in a class called ApolloCryptoLib. But then again, in the “driver” I found the codenames Apollo and Athena — and since the software says it supports the “Libre Pro” (which as far as I know is the US-only version that was released a few years ago), I’m wholly confused on what’s what now.

But as I said, the software does have parallel C++ class hierarchies, implementing lower-level and higher-level access controls for the two codenames. And because the logs include the class name, it looks like most functions are instantiated twice (which is why I found it easier to figure out the flow of the non-crypto part from the “driver” module). A lot of the work I’m doing is manual vtable decoding, since there are virtual methods all around.

What also became very apparent is that my hunch was right: the Libre 2 system uses basically the same higher-level protocol as the Libre 1. Indeed, I can confirm not only that the text commands sent are the same (and the expected responses are the same as well), but also that the binary protocol is parsed in the same way. So the only obstacle between glucometerutils and the Libre 2 is the encryption. Indeed, it seems like all three devices use the same protocol, which is called either Shazam, AAP, or ATP — it’s not quite clear given the different naming conventions in the code, but it’s still pretty obvious that they share the same protocol: not just the HID transport, but also the way higher-level commands are defined.

Now, about the encryption: what I found from looking at the software is that there are two sets of keys in use. The first is used in the “authentication” phase, which is effectively a challenge-response between the device and the software, based on the serial number of the device, and the other is used in the encrypted communication. This was fairly easy to spot, because one of the classes in the code is named ApolloCryptoLib, and it includes functions with names like Encrypt, Decrypt, and GenerateKeys.

Also, one important note: the patch (sensor) serial number is not used for the encryption of the reader’s access. This is something that comes up time and time again: at least a few people have been telling me on Twitter that the Libre 2 sensors (or patches, as Abbott calls them) are also encrypted, and that they clearly use the same protocol as the reader. But that’s not the case at all. Indeed, the same encryption happens even when no patch was ever initialized, and the information on the patches is fetched from the reader only as the last part of the initialization.

Another important piece of information I found in the code is that the encryption uses separate keys for encryption and MAC. This means that there’s an actual encrypted transport layer, similar to TLS — though not similar enough to make me particularly worried about the key material involved.

With the code at hand, I also managed to confirm my original basic assumptions about the initialization using sub-commands, where the same message type is sent with follow-up bytes containing information on the command. The confirmation came from a log message calling the first byte in the command… subcmd. The following diagram is my current best understanding of the initialization flow:

Initialization sequence for the FreeStyle Libre 2 encryption protocol.

Unfortunately, most of the functions I have found related to the encryption (and to the binary protocol, at least in the standalone app) ended up being quite complicated to read. At first I thought this was a side effect of some obfuscation system, but I’m no longer sure. It might be an effect of the compile/decompile cycle, but at least in Ghidra these appear as huge switch blocks, with what is effectively a state machine jumping around, even for the simplest of methods.

I took a function that is (hopefully) the least likely to get Abbott upset at me for reposting it. It’s a simple function: it takes an integer and returns an integer. I called it int titfortat(int) because it took me a while to figure out what it was meant to do. It turns out to normalize the input to either 0, 1, or -1 — the latter being an error condition. It has an invocation of INT3 (a debugger trap), and it has the whole state machine construct I’ve seen in most of the other functions. What I found is that it’s used to set a variable based on whether the generated keys are used for the authentication or the session.

The main blocker for me right now in figuring out how the encryption works is that there appears to be an array of 21 different objects, each of which comes with what looks like a vtable, only partially implemented. It does not conform to the way Visual C++ builds objects, so maybe it’s a statically linked encryption library, or something different altogether. The functions I can reach from those objects are clearly cryptography-related: they include tables for SHA1 and SHA2 at least.

The way the objects are used is also a bit confusing: an initialization function appears to assign to each pointer in the array the value returned by a different function — but each of those functions appears to just return the value of a (different) global. Wherever the vtable-like structure is not fully implemented, it points at code that simply returns an error constant. And when the code calls those objects, if an error is returned it skips the object and goes to the next.

On the other hand, this exercise is giving me a lot of insight into the overall HID transport, as well as the protocol inside it. For example, I finally found the answer to which checksum the binary messages include! It’s a modified CRC32, except that it’s calculated 4 bits at a time instead of the usual 8, and thus requires a shortened lookup table (16 entries instead of 256) — and if you think that this is completely pointless, I tend to agree with you. I also found that some of the sub-commands for the ATP protocol include an extra byte before the actual sub-command identifier. I’m not sure how it is interpreted yet, and it does not seem to be a checksum, as it is identical for different payloads.
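To illustrate what a nibble-at-a-time CRC32 looks like, here’s a minimal sketch in Python. The polynomial, initial value, and final XOR below are those of the standard reflected CRC-32; the only thing established here is that Abbott’s variant is a “modified CRC32” processed 4 bits at a time, so treat those parameters as placeholders:

def make_nibble_table(poly: int = 0xEDB88320) -> list:
    # One entry per nibble value, each computed with four shift-and-xor rounds.
    table = []
    for nibble in range(16):
        crc = nibble
        for _ in range(4):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table

NIBBLE_TABLE = make_nibble_table()

def crc32_nibblewise(data: bytes, crc: int = 0xFFFFFFFF) -> int:
    for byte in data:
        crc = (crc >> 4) ^ NIBBLE_TABLE[(crc ^ byte) & 0xF]         # low nibble
        crc = (crc >> 4) ^ NIBBLE_TABLE[(crc ^ (byte >> 4)) & 0xF]  # high nibble
    return crc ^ 0xFFFFFFFF

With standard parameters this produces exactly the same result as the usual byte-wise CRC32, just with a 16-entry table — presumably the appeal is saving a few hundred bytes of lookup table on a memory-constrained device.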

Anyway, this is clearly not enough information yet to proceed with implementing a driver, but it might be just enough to start improving support for the binary protocol (ATP), in case the Libre 2 turns out not to understand the normal text commands. Which I find very unlikely, but we’ll have to see.

IPv6 in 2020 — Nope, still dreamland

It’s that time of the year: lots of my friends and acquaintances went to FOSDEM, which is great, and at least one complained about something not working over IPv6, which prompted me to share once again my rant about the newcomer-unfriendly default network of a conference that is otherwise very friendly to new people. Which then prompted the knee-jerk reaction of people who expect systems to work in isolation, calling me a hater and insulting me. Not everybody, mind you — on Twitter I did have a valid and polite conversation with two people, and while it’s clear we disagree on this point, insults were not thrown. Less polite people got blocked, because I have no time to argue with those who can’t see anyone else’s viewpoint.

So, why am I insisting that IPv6 is still not ready in 2020? Well, let’s see. A couple of years ago, I pointed out how nearly all of the websites that people actually use, except for the big social networks, are missing IPv6. As far as I can tell, nothing whatsoever has changed for those websites in the intervening two years. Even a number of websites that are hosted by CDNs like Akamai (which does support IPv6!), or on service providers like Heroku, are not served over IPv6. So once again, if you’re a random home user, you don’t really care about IPv6, except maybe for Netflix.

Should the Internet providers be worried, what with IPv4 exhaustion getting worse and worse? I’d expect them to be, because as Thomas said on Twitter, the pain is only going to increase. But it clearly has not reached the point where any of the ISPs, except a few “niche” ones like Andrews & Arnold, provide their own website over IPv6 — the exception appears to be Free, who, if I understood correctly, is one of the biggest providers in France, and does publish AAAA records for their website. They are clearly in the minority right now.

Even mobile phone providers, whom everyone and their dog appear to always use as the example of consumer IPv6-only networks, don’t seem to care — at least in Europe. It looks like AT&T and T-Mobile US do serve their websites over IPv6.

But the consumer side is not the only reason why I insist that in 2020, IPv6 is still fantasy. Hosting providers don’t seem to have understood IPv6 either. Let’s put aside for a moment that Automattic does not have an IPv6 network (not even outbound), and let’s look at one of the providers I’ve been using for the past few years: Scaleway. Scaleway (owned by Iliad, the same group as Online.net) charges you extra for IPv4. It does, though, provide you with free IPv6. It does not, as far as I understand, provide you with multiple IPv6 addresses per server, which is annoying but workable.

But here’s a quote from a maintenance email they sent a few weeks ago:

During this maintenance, your server will be powered off, then powered on on another physical server. This operation will cause a downtime of a few minutes to an hour, depending on the size of your local storage. The public IPv4 will not change at migration, but the private IPv4 and the IPv6 will be modified due to technical limitations.

Scaleway email, 2020-01-28. Emphasis theirs.

So not only is the only stable address the servers can keep the IPv4 one (which, as I said, is a paid extra), but they cannot even tell you beforehand which IPv6 address your server will get. Indeed, I decided at that point that the right thing to do was to just stop publishing AAAA records for my websites, as clearly I can’t rely on Scaleway to keep them stable over time. A shame, I would say, but that’s my problem: nobody is taking IPv6 seriously right now but a few network geeks.

But network geeks also appear to like UniFi. And honestly, I do too. It has worked fairly well for me, most of the time (except for the woes of updating MongoDB), and it does mostly support IPv6. I have a full IPv6 setup at home with UniFi and Hyperoptic. But at the same time, the dashboard is focused only on IPv4, everywhere. A few weeks ago it looked like my IPv6 network had a sad (I only noticed because I was trying to reach one of my local machines by its AAAA hostname), and I had no way to confirm it: I eventually just rebooted the gateway, and then it worked fine (and since I have a public IPv4, Hyperoptic gives me a stable IPv6 prefix, so I didn’t have to worry about that), but even then I couldn’t figure out from its UIs whether the gateway had any IPv6 connectivity at all.

I’m told OpenWrt has gotten better about this: you’re no longer required to reverse engineer the source to figure out how to configure a relay. But at the same time, I’m fairly sure these are, again, niche products. Virgin Media Ireland’s default router supported IPv6 — to a point. But I have yet to see any Italian ISP providing even the most basic DS-Lite by default.

Again, I’m not hating on the protocol, or denying the need to move to the new network in the short term. But I am saying that network folks need to start looking outside of their bubble, and try to find the reasons why nothing appears to be moving, year after year. You can’t blame it on the users not caring: they don’t want to geek out on which version of the Internet Protocol they are using, they want to have a working connection. And you can’t really expect them to understand the limits of CGNs — 64k connections might sound ludicrously few to a network person, but to your average user it sounds like plenty: they are only looking at one website at a time! (Try explaining to someone who has no idea how HTTP works that you can get possibly thousands of connections per tab.)

Leveling up my reverse engineering: time for Ghidra

In my quest to figure out how to download data from the Abbott FreeStyle Libre 2, I decided that figuring it out just by looking at the captures was a dead end. While my first few captures had me convinced the reader would keep sending the same challenge, so that I could at least replay it, that turned out to be wrong, and it excluded most of the simplest (and silliest) encryption schemes.

As Pierre suggested to me on Twitter, the easiest way to find the answer would be to analyze the binary itself. This sounded like a hugely daunting task, as I don’t speak fluent Intel assembly, and in particular I don’t speak fluent Windows interfaces. The closest I got to this in the past was the reverse engineering of the Verio, in which I ran the software on top of WinDbg and discovered, to my relief, that they not only kept some of the logging on, but also left in a whole lot of debug logs that made it fairly easy to reverse the whole protocol.

But when the options are learning enough about cryptography and cryptanalysis to break the encoding, or spending time learning how to reverse engineer a Windows binary — yeah, I thought the latter would suit me better. It’s not like I haven’t dabbled in reversing my own binaries, or built my own (terrible) 8086 simulator in high school because I wanted to be able to see how the registers would be affected and didn’t want to wait for my turn on the clunky EEPROM system we used in the lab (this whole thing is probably a story for another day).

Also, since the last time I considered reversing a binary (when I was looking at my laptop’s keyboard), there has been a huge development: the NSA released Ghidra. For those who have not heard of it, Ghidra is a tool to reverse engineer binaries that includes a decompiler and a full-blown UI. And it’s open source. Given that the previous best option for this was IDA Pro, with thousands of dollars of licenses expected, this opens a huge number of doors.

So I spent a weekend in which I had some spare time trying my best to reverse the actual code coming from the official app for the Libre 2 — I’ll provide more detail on that once I understand it better, and once I know which parts are safe to share and which would probably get me in trouble. In general, I did manage to find out quite a bit more about the software, the devices, and the protocol — if nothing else, because Abbott left a bunch of debug logging disabled, but built in (and no, this time the WinDbg trick didn’t work, because they seem to have added an explicit anti-debugger exception — although I guess I could learn to defeat that while I’m at it).

Because I was at first a bit skeptical about my ability to do anything at all with this, I have also been running it in an Ubuntu VM, but honestly I’m considering switching back to my normal desktop, because something in the default Ubuntu shell appears to mess with Java, and I can’t even run the VM at the right screen size. I have also considered running this in a Hyper-V virtual machine on my Gamestation, but the problem with that appears to be graphics acceleration: installing OpenSUSE onto it was very fast, but trying to use it was terribly sluggish. I guess the VM option is a bit nicer in the sense that I can just save its state when powering off the computer, as I did to add the second SSD to the NUC.

After spending the weekend on it, making some kind of progress, and printing out some of the code to read on paper in front of the TV with pen and marker, well… I think I like the idea of this, but it’ll take me quite a while, alone, to come up with enough of a description that it can be implemented cleanroom. I’ll share more details on that later. For the most part, I felt like I was cooking for the first time something I had only ever seen made on the Great British Bake Off — because I kept reading the reports that other (much more experienced) people wrote and published, particularly about reversing router firmware.

I also, for once, found a good reason to use YouTube for tutorials. This video by MalwareTech (yes, the guy who got arrested after shutting WannaCry down by chance) was a huge help in figuring out features I didn’t even know I wanted, including the “Assume Register” option. Having someone who knows what he’s doing explore a tool I don’t know was very helpful, and indeed it felt like Adam Savage describing his workshop tools — a great way to learn about stuff you didn’t know you needed.

My hope is that by adding this tool to my toolbox – like Adam Savage indeed says in his Every Tool’s A Hammer (hugely recommended reading, by the way) – I’ll be able to use it not just to solve the mystery of the Libre 2’s encryption, but also that of the nebulous Libre 1 binary protocol, which I never figured out (there are a few breadcrumbs I already found during my Ghidra weekend). And maybe even to figure out the protocol of one of the gaming mice I have at home, which I never completed either.

Of course all of this assumes I have a lot more free time than I have had for the past few years. But, you know, it’s something that I might have ideas about.

Also, as a side note: while doing the work to figure out which address belongs to what, and particularly figuring out the jumps through vtables and arrays of global objects (yeah, that seems to be something they are doing), I found myself needing to do a lot of hexadecimal calculations. And while I can do conversions from decimal to binary in my head fairly easily, hex is a bit too much for me. I have been using the Python interactive interpreter for that, but that’s just too cumbersome. Instead, I decided to get myself a good old physical calculator — not least because the Android calculator can’t do hex, and it seems like there’s a lack of “mid range” calculators: you get TI-80 emulators fairly easily, but most of the simplest calculators don’t have hex. Or they do, but they are terrible at it.
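For the record, this is the kind of throwaway arithmetic I mean — the interpreter handles it fine, it’s just tedious to keep a terminal around for it (the numbers here are made up for illustration):

>>> hex(0x140012a40 + 0x68)
'0x140012aa8'
>>> int("7f8", 16) * 4
8160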

I looked on Amazon for the cheapest scientific calculator on which I could see the letters A–F, and ordered a Casio fx-83GT X — that was a mistake. When it arrived, I realized that I hadn’t paid attention to finding one with a hex key on it. The fx-83GT does indeed have the A–F inputs — but they are used only for defining variables, and the calculator does not appear to have any way to convert to hexadecimal or to do hexadecimal-based operations. Oops.

Instead I ordered a Sharp WriteView EL-W531, which supports hex just fine. It has slightly smaller, but more satisfying, keys, but it’s yet another black gadget on my table (the Casio is light blue). I’ll probably end up looking out for a cute sticker to put on it to see it when I close it for storage.

And I decided to keep the Casio as well — not just because it’s handy to have a calculator at home when doing paperwork, even with all the computers around, but also because it might be interesting to see how much of the firmware is downloadable, and whether someone has tried flashing a different model’s firmware onto it, to expand its capabilities: I can’t believe the A-F keys are there just for the sake of variables, my guess is that they are there because the same board/case is used by a higher model that does support hex, and I’d expect that the only thing that makes it behave one way or the other is the firmware — or even just flags in it!

At any rate, expect more information about the Libre 2 later on this month or next. And if I decide to spend more time on the Casio as well, you’ll see the notes here on the blog. But for now I think I want to get at least some of my projects closer to completion.

Windows Backup Solutions: trying out Acronis True Image Backup 2020

One of my computers is my Gamestation, which, to be honest, has not run a game in many months now. It runs Windows out of necessity, but also because honestly sometimes I just need something that works out of the box. The main use of that computer nowadays is Lightroom and Photoshop for my photography hobby.

Because of the photography usage, backups are a huge concern to me (particularly after movers stole my previous gamestation), and so I have been using a Windows tool called FastGlacier to store a copy of most of the important stuff on the Amazon Glacier service, in addition to letting Windows 10 do its FileHistory magic on an external hard drive. Not a cheap option, but (I thought) a safe and stable one. Unfortunately the software appears to no longer be developed, and with one of the more recent Windows 10 updates it stopped working (and since I had set it up as a scheduled operation, it failed silently, which is the worst thing that can happen!)

My original plan for last week (at the time of writing) was to work on pictures, as I have shots from trips over three years ago that I still have not gone through, rather than working on reverse engineering. But when I noticed the missing backups, I decided to put that on hold until the backup problem was solved. The first problem was finding a backup solution that would actually work, and that wouldn’t cost an arm and a leg. The second problem was that, of course, most of the people I know are tinkerers who like Rube Goldberg solutions such as using rclone on Windows with the task scheduler (no thanks, that’s how I failed the Glacier backups).

I didn’t have particularly high requirements: I wanted a backup solution that would do both local and cloud backups — because Microsoft has been reducing the feature set of their FileHistory solution, and relying on it alone feels a bit flaky — and the ability to store more than a couple of terabytes in the cloud (I have over 1TB of RAW shots!), even at a premium. I was not too picky on price, as I know features and storage are expensive. And I wanted something that would just work out of the box. A few review reads later, I found myself trying Acronis True Image Backup. A week later, I regret it.

I guess my best lesson learnt from this is that Daniel is right, and it’s not just about VPNs: most review sites seem to score higher the software they get more money from via affiliate links (you’ll notice that in this blog post there won’t be any!). So while a number of sites had great words for Acronis’s software, I found it sufficiently lacking that I’m ranting about it here.

So what’s going on with the Acronis software? First of all, while it does support both “full image” and “selected folders” modes, you definitely need to be aware that the backup is not usable as-is: you need the software to recover the data. Which is why it comes with bootable media, “survival kits”, and similar amenities. This is not a huge deal to me, but it’s still a bit annoying, when FileHistory used to allow direct access to the files. It also locks you into accessing the backup with their software, although Acronis makes the restore option available even after you let your subscription expire, which is at least honest.

The next thing that became clear to me was that the speed of cloud backups is not Acronis’s strongest suit. The original estimate for backing up the 2.2TB of data that I expected to back up was on the mark at nearly six days. To be fair to Acronis, the process went extremely smoothly: it never got stuck, looped, crashed, or slowed down. The estimate was very accurate, and indeed, running it for about 144 hours was enough to have the full data backed up. Their backup status page also shows the average speed of the process, which matched my estimate while the backup was running: 50Mbps.

The speed is the first source of my regret. 50Mbps is not terribly slow, and for most people this might be enough to saturate their Internet uplink. But not for me. At home, my line is provided by Hyperoptic: a 1Gbps line that can sustain at least 900Mbps upload. So seeing the backup bottlenecked by this was more than a bit annoying. And as far as I can tell, there’s no documentation of this limit on the Acronis website at the time of writing.

When I complained on Twitter about this, it was mostly in frustration at having to wait, and I still considered the 50Mbps speed at least reasonable (although I would have considered paying a premium for faster uploads!). The replies I got from support, though, got me more upset than before. Their Twitter support people insisted that the problem was with my ISP and sent me to their knowledgebase article on using the “Acronis Cloud Connection Verification Tool” — except that following the instructions showed I was supposed to be using their “EU4” datacenter, for which there is no tool. I was then advised to file a ticket about it. Since then, I appear to have moved back to “EU3” — maybe EU4 was not ready yet.

The reply to the ticket was even more of an absurdist mess. Besides a lot of words to explain “speed is not our fault, your ISP may be limiting your upload” (fair, but I had already noted to them that I knew that was not the case), one of the steps they ask you to follow is to go to one of their speedtest apps — which returns a 504 error from nginx, oops! Oh yeah, and you need to upload the logs via FTP. In 2020. Maybe I should call up Foone to help. (Windows 10, as it happens, still supports FTP write access via File Explorer, but it’s not very discoverable.)

Both support people also kept reminding me that the backup is incremental, so after the first cloud backup, everything else should be a relatively small amount of data to copy. Except I’m not sold on that either: 128GB of data (which is the amount of pictures I came back from Budapest with) would still take nearly six hours to back up.

When I finally managed to get a reply that was not directly from a support script, they told me to run the speedtest against a different datacenter, EU2. As it turns out, this is their “Germany” datacenter. This was very clear from tracerouting the IP addresses of the two hosts: EU3 is connected directly to LINX, while EU2 goes back to AMS, then FRA (Frankfurt). The speedtest came out fairly reasonable (around 250Mbps download, 220Mbps upload), so I shared the data they requested in the ticket… and then wondered.

Since you can’t change the datacenter you back up to once you’ve started a backup, I tried something different: I used their “Archive” feature and tried to archive a multi-gigabyte file, but to their Germany datacenter, rather than the United Kingdom one (against their recommendation to «select the country that is nearest to your current location»). Instead of a 50Mbps peak, I got a 90Mbps peak, with a sustained rate of 67Mbps. Now, this is still not particularly impressive, but it would have cut the six days down to three, and the five hours to around two. And clearly it sounds like their EU3 datacenter is… not good.

Anyway, let’s move on and look at local backups, which Acronis is supposed to take care of by itself. For this one I at first wanted to use the full-image backup, rather than selecting folders as I did for the cloud copy, since it would be much cheaper, and I have a 9TB external hard drive anyway… and when you do that, Acronis also suggests creating what they call the “Acronis Survival Kit” — which basically means making the external hard drive bootable, so that you can start up and restore the image straight from it.

The first time I tried setting it up that way, it formatted the drive, but it didn’t even manage to get Windows to mount the new filesystem. I got an error message linking me to a knowledgebase article that… did not exist. This was more than a bit annoying, but I decided to run a full SMART check on the drive to be safe (no errors found), and then tried again after a reboot. This time it finally seemed to work, but here’s where things got even hairier.

You see, I had been planning to use my 9TB external drive for the backup. A full image of my system was estimated at 2.6TB. But after the Acronis Survival Kit got created, the amount of space available for the backup on that disk was… 2TB. Why? It turned out that the Kit creation caused the disk to be repartitioned as MBR, rather than the more modern GPT. And with MBR you can’t have a (boot) partition bigger than 2TB, since its 32-bit sector addresses top out at 2³² × 512 bytes. Which means that the creation of the Survival Kit silently cut my available space to nearly a fifth!

The reply from Acronis on Twitter? According to them, my Windows 10 was started in “BIOS mode”. Except it wasn’t: it’s set up with UEFI and Secure Boot. And unfortunately it doesn’t seem like there’s an easy way to figure out why the Acronis software thinks otherwise. But worse than that, the knowledgebase article says that I should have gotten a warning, which I never did.

So what is it going to be at the end of the day? I tested the restore from Acronis Cloud, and it works fine. Acronis has been in business for many years, so I don’t expect them to disappear next year, which means the likelihood of me losing access to these backups is fairly low. I think I may just stick with them for the time being, and hope that someone in the Acronis engineering or product management teams reads this feedback, thinks about that speed issue, and maybe starts considering the idea of asking support people to refrain from engaging with other engineers on Twitter with fairly ridiculous scripts.

But to paraphrase a recent video by Techmoan, these are the type of imperfections (particularly the mis-detected “BIOS booting” and the phantom warning) that I could excuse in a £50 software package, but that are much harder to excuse in a £150/yr subscription!

Any suggestions for good alternatives would be welcome, particularly before next year, when I might reconsider whether this was good enough for me, or whether a new service is needed. Suggestions that involve scripts, NAS, rclone, task scheduling, or self-hosted software will be marked as spam.

USB capturing in 2020

The vast majority of the glucometer devices whose protocols I reverse engineer use USB to connect to a computer. You could say that all of those I have successfully reversed up to now are USB based. Over the years, the way I capture USB packets to figure out a protocol has changed significantly, starting from proprietary Windows-based sniffers, and more recently involving my own open source trace tools. The evolution of the process was not always intentional — in the case of USBlyzer, the software was pretty much dead a few years after I started using it, the author refused to document the file format, and by then even my sacrificial laptop was not powerful enough to keep running all the tools I needed.

I feel I’m close to another step in the evolution of my process, and once again it’s not because I’m looking to improve the process so much as the process no longer working with modern tools. Let me start by explaining what the situation is, because there are two nearly separate issues at play here.

The first issue is that either openSUSE or the kernel changed the way debugfs is handled. For those who have not looked at this before, debugfs is what lives in /sys/kernel/debug, and it provides the more modern interface for usbmon access; the old method via /dev/usbmonX is deprecated, and Wireshark will not even show the ability to capture USB packets without debugfs. Previously, I was able to manually change the ownership of the usbmon debugfs paths to my user and start Wireshark as that user to do the capturing, but as of January 2020, that does not seem to be possible anymore: the debugfs mount is accessible only to root.

Using Wireshark as root is generally considered a really bad idea, because it has a huge attack surface, in particular when doing network captures, where the input is literally at the discretion of external actors. It’s a tiny bit safer when capturing USB, because even when the device is fairly unknown, the traffic is not as controllable, so I would have flinched, but not terribly, at using Wireshark as root — except that I can’t sudo wireshark and have it paint on X. So the remaining alternative is tshark, a terminal utility that implements the same basics as Wireshark.

Unfortunately, here’s the second problem: the last time I ran a lot of captures was when I was working on the Beurer glucometer (which I still haven’t gotten back to, because Linux 5.5 is still unreleased at the time of writing, and that’s the first version that won’t go into a reset loop with the device), and I was doing that work from my laptop — and that’s relevant. While the laptop’s keyboard and touchpad are USB, the ports are connected to a different bus internally. Since usbmon interfaces are per bus, that made it very handy: I only needed to capture on the “ports” bus, and no matter how much or what I typed, it wouldn’t interfere with my captures at all.

You can probably see where this is going: I’m now using a NUC on my desk, with an external keyboard and the Elecom trackball (because I did manage to hurt my wrist while working on the laptop, but that’s a story for another post). And now all the USB 2.0 ports are connected to the same bus. Capturing the bus means getting all the events for keypresses, mouse movements, and so on.

If you have some experience with tcpdump or tshark, you’d think this is an easy problem to solve: it’s not uncommon to have to capture network packets over an SSH connection, which you want to exclude from the capture itself. And the solution for that is to apply a capture filter, such as port not 22.

Unfortunately, it looks like libpcap (which means Wireshark and tshark) does not support capture filters on usbmon. The reasoning provided is that capture filters for network interfaces are implemented in BPF, and there’s no fallback for usbmon, which has no BPF capability in the kernel. I’m not sure about the decision, but there you go. You could also argue that adding BPF to usbmon would be interesting to avoid copying too much data out of the kernel, but that’s not something I have a particular interest in exploring right now.

So how do you handle this? The suggested option is to capture everything, then use Wireshark to select a subset of packets and save the capture again. This should allow you to end up with a limited capture that you can share without risking sharing what amounts to a keylog of your system. But it also made me think a bit more.

The pcapng format, which Wireshark stores usbmon captures in, is a fairly complicated one, because it can include a lot of different protocol information, and it has multiple typed blocks to store things like hardware interface descriptions. But for USB captures, there’s not much use in the format: not only are the Linux and Windows captures (the latter via usbpcap) different formats altogether, but the whole interface definition is, as far as I can tell, completely ignored. Instead, if you need a device descriptor, you need to scan the capture for a corresponding request (which usbmon-tools now does).

I’m now considering just providing a simpler format to store captured data with usbmon-tools: either a simple 1:1 conversion from pcapng, with each packet just size-prefixed, plus a tool to filter down the capture on the command line (because honestly, having to load Wireshark to cut down a capture is a pain), or a more complicated format that can store the descriptors separately, and maybe bundle/unbundle them across captures so that you can combine multiple fragments later. If I were in my bubble, I would be using Protocol Buffers, but that’s not particularly friendly to integrate into a Python module, as far as I can tell — particularly if you want to be able to use the tools straight out of a git clone.

I guess that since I’m already using construct, I could instead design my own simplistic format. Or maybe I could just bite the bullet, use base64-encoded bytearrays, and write the whole capture session out in JSON.
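To give an idea of how low-tech that last option would be, here’s a rough sketch of the JSON-with-base64 idea; the field names are made up for illustration and are not anything usbmon-tools actually defines today:

import base64
import json

def dump_session(packets, path):
    # Each packet's raw bytes are base64-encoded, so the whole session is
    # plain JSON — easy to filter or diff with any scripting language.
    session = [
        {
            "bus": pkt.bus,
            "device": pkt.device,
            "endpoint": pkt.endpoint,
            "payload": base64.b64encode(pkt.payload).decode("ascii"),
        }
        for pkt in packets
    ]
    with open(path, "w") as out:
        json.dump(session, out, indent=2)

def load_session(path):
    with open(path) as infile:
        return [
            (entry["bus"], entry["device"], entry["endpoint"],
             base64.b64decode(entry["payload"]))
            for entry in json.load(infile)
        ]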

As I said above, pcapng supports Windows and Linux captures differently: on Linux, the capture format is effectively the wire format of usbmon, while on Windows, it’s the format used by usbpcap. While I have not (yet, at the time of writing) added support to usbmon-tools for loading usbpcap captures, I don’t see why it shouldn’t work out that way. If I do manage to load usbpcap files, though, I would need a custom format to copy them into.

If anyone has a suggestion, I’m open to it. One thing I may try is to use Protocol Buffers, but check in the generated source files used to parse and serialize the objects.

FreeStyle Libre 2: encrypted protocol notes

When I had to write (in a hurry) my comments on Abbott’s DMCA notice to GitHub, I noted that I had not had a chance to get my hands on the Libre 2 system yet, because it’s not sold in the UK. After that, Benjamin from MillionFriends reached out to me to ask whether I would be interested in giving it a try and figuring out how the reader itself speaks with the software. I gladly obliged, and spent most of the time I had during a sick week getting an idea of where we are with this.

While I would be lying if I said that we’re now much closer to being able to download data from a Libre 2 reader, I can at least say that we have some idea of what’s going on right now. So let me introduce the problem for a second, because there’s a lot of information out there, and while some of it is quite authoritative (particularly as a few people have been reverse engineering the actual binaries), a lot of it is guesswork and supposition.

The Libre and Libre 2 systems are very similar — they both have sensors, and they both have readers. Technically, you could say that the mobile app (for Android or iOS) also makes up part of the system. The DMCA notice from Abbott that stirred up so much trouble, mentioned above, was related to modifications to the mobile application — but luckily for me, my projects and my interests lie quite far away from that. The sensors can be “read” by their respective reader devices, or with a phone with the correct app on it. When reading the sensors with the reader, the reader itself stores the historical data and a bunch more information. You can download that data from the reader onto a computer running Windows or macOS with the official software from Abbott (assuming you can download a version of it that works for you — I couldn’t find a good download page for this on the UK website, but I was provided with an EU version of the software as well).

For the Libre system, I had attempted reversing the protocol myself, but ultimately it was someone else who contributed the details of a usable protocol that I implemented in glucometerutils. For the Libre 2, the discussion already suggested it wouldn’t be the same protocol, and that the Libre 2 reader used encryption. As it turns out, Abbott really does not appear to appreciate customers having easy access to their data, or third parties building tools around their devices, and in the new system they started encrypting the communication between the sensors and the reader or app.

So what’s the state of the Libre 2? Well, one piece of good news is that the software I was given works with both the Libre and Libre 2 systems, so I could compare the USB captures from both, and that will (as I’ll show in a moment) help significantly. It also showed that the basics of the Abbott HID protocol were maintained: most of the handshake, the message types, and the maximum message size. Unfortunately it was clear right away that most of the messages exchanged were indeed encrypted, because the message length made no sense (anything with a declared length over 62 was clearly wrong).
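As a concrete example of that sanity check, here is a tiny sketch (not part of usbmon-tools, and the helper name is made up) that flags the reports whose declared length cannot be valid, and which are therefore almost certainly encrypted:

def looks_encrypted(report: bytes) -> bool:
    """Flag FreeStyle HID reports whose length byte cannot be valid.

    In the cleartext protocol, byte 0 is the message type and byte 1 the
    payload length, which can be at most 62 in a 64-byte report; anything
    larger means that byte is not really a length.
    """
    if len(report) != 64:
        raise ValueError("expected a full 64-byte HID report")
    return report[1] > 62


# One of the short cleartext replies from the captures (type 0x34, payload
# 0x16), reconstructed with its length byte, passes the check as expected.
short_reply = bytes([0x34, 0x01, 0x16]) + bytes(61)
assert not looks_encrypted(short_reply)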

Now, the good news is that I have built up enough interfaces in usbmon-tools to write a much more reusable extractor for the FreeStyle protocol, and to fold into it what we know of the encryption, so that others can use it too. Being able to release the extraction tools I write, instead of just using them myself and letting them rot, was my primary motivation behind building usbmon-tools, so you could say that target has been achieved.

So earlier I said that I was lucky the software works with both the Libre and Libre 2 readers. The reason that was lucky is that it shows the sequence of operations between the two is nearly the same, and that some of the messages are not actually encrypted on the Libre 2 either (namely, the keepalive messages). Here’s the handshake from my Libre 1 reader:

[ 04] H>>D 00000000: 

[ 34] H<<D 00000000: 16                                                . 

[ 0d] H>>D 00000000: 00 00 00 02                                       .... 

[ 05] H>>D 00000000: 

[ 15] H>>D 00000000: 

[ 06] H<<D 00000000: 4A 43 4D 56 30 32 31 2D  54 30 38 35 35 00        JCMV021-T0855. 

[ 35] H<<D 00000000: 32 2E 31 2E 32 00                                 2.1.2. 

[ 01] H>>D 00000000: 

[ 71] H<<D 00000000: 01                                                . 

[ 21] H>>D 00000000: 24 64 62 72 6E 75 6D 3F                           $dbrnum? 

[ 60] H<<D 00000000: 44 42 20 52 65 63 6F 72  64 20 4E 75 6D 62 65 72  DB Record Number
[ 60] H<<D 00000010: 20 3D 20 33 37 32 39 38  32 0D 0A 43 4B 53 4D 3A   = 372982..CKSM:
[ 60] H<<D 00000020: 30 30 30 30 30 37 36 31  0D 0A 43 4D 44 20 4F 4B  00000761..CMD OK
[ 60] H<<D 00000030: 0D 0A                                             .. 

[ 0a] H>>D 00000000: 00 00 37 C6 32 00 34                              ..7.2.4 

[ 0c] H<<D 00000000: 01 00 18 00                                       .... 

Funnily enough, while this matches the sequence that Xavier described for the Insulinx, and that I always reused for the other devices too, I found that most of this exchange is for the original software to figure out which device you connected. And since my tools require you to know which device you’re using, I actually cleaned up the FreeStyle support code in glucometerutils to shorten the initialization sequence.

To describe the sequence in prose: the software requests the serial number and software version of the reader (commands 0x05 and 0x15), then initializes (0x01) and immediately uses the text command $dbrnum? to find out how much data is stored on the device. Then it starts using the “binary mode” protocol that I started looking at years ago, but never understood.
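In a rough Python sketch, the shortened sequence looks more or less like this; the device object and its read/write helpers are hypothetical stand-ins, not the actual glucometerutils API:

def _report(message_type: int, payload: bytes = b"") -> bytes:
    """Build a 64-byte FreeStyle HID report: type byte, length byte, payload."""
    assert len(payload) <= 62
    return bytes([message_type, len(payload)]) + payload.ljust(62, b"\x00")


def initialize(device):
    """Shortened initialization, assuming `device` exchanges raw 64-byte reports."""
    device.write(_report(0x05))   # request serial number
    serial = device.read()        # 0x06 reply
    device.write(_report(0x15))   # request software version
    version = device.read()       # 0x35 reply
    device.write(_report(0x01))   # init command
    device.read()                 # 0x71 reply, payload 0x01
    # after this, text commands (type 0x21) such as $dbrnum? can be issued
    device.write(_report(0x21, b"$dbrnum?"))
    return serial, version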

For both systems, I captured the connection establishment, from when the device is connected to the Windows virtual machine to the moment when you can choose what to do. The software only requests a minimal amount of data, but it’s still quite useful for comparison. Indeed, you can see that after using the binary protocol to fetch… something, the software sends a few more text commands to confirm the details of the device:

[ 0b] H<<D 00000000: 12 3C AD 93 0A 18 00 00  00 00 00 9C 2D EA 00 00  .<..........-...
[ 0b] H<<D 00000010: 00 00 00 0E 00 00 00 00  00 00 00 7C 2A EA 00 00  ...........|*...
[ 0b] H<<D 00000020: 00 00 00 16 00 00 00 00  00 00 00 84 2D EA 00 21  ............-..!
[ 0b] H<<D 00000030: C2 25 43                                          .%C 

[ 21] H>>D 00000000: 24 70 61 74 63 68 3F                              $patch? 

[ 0d] H>>D 00000000: 3D 12 00 00                                       =... 

[ 60] H<<D 00000000: 4C 6F 67 20 45 6D 70 74  79 0D 0A 43 4B 53 4D 3A  Log Empty..CKSM:
[ 60] H<<D 00000010: 30 30 30 30 30 33 36 38  0D 0A 43 4D 44 20 4F 4B  00000368..CMD OK
[ 60] H<<D 00000020: 0D 0A                                             .. 

[ 21] H>>D 00000000: 24 73 6E 3F                                       $sn? 

[ 60] H<<D 00000000: 4A 43 4D 56 30 32 31 2D  54 30 38 35 35 0D 0A 43  JCMV021-T0855..C
[ 60] H<<D 00000010: 4B 53 4D 3A 30 30 30 30  30 33 32 44 0D 0A 43 4D  KSM:0000032D..CM
[ 60] H<<D 00000020: 44 20 4F 4B 0D 0A                                 D OK.. 

[ 21] H>>D 00000000: 24 73 77 76 65 72 3F                              $swver? 

[ 60] H<<D 00000000: 32 2E 31 2E 32 0D 0A 43  4B 53 4D 3A 30 30 30 30  2.1.2..CKSM:0000
[ 60] H<<D 00000010: 30 31 30 38 0D 0A 43 4D  44 20 4F 4B 0D 0A        0108..CMD OK.. 

My best guess on why it asks again for the serial number and software version is that the data returned during the handshake is only used to select which “driver” implementation to use, while these later requests fill in the descriptor shown to the user.
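As an aside, all the text replies above share the same framing: the reply body, then CKSM: followed by eight hex digits, then CMD OK, each line CRLF-terminated. The checksum is simply the byte sum of the body, including its trailing CRLF; you can verify that against the $swver? reply above. A minimal validator could look like this (a sketch, not the glucometerutils implementation):

import re

_TEXT_REPLY = re.compile(
    rb"^(?P<body>.*)CKSM:(?P<checksum>[0-9A-F]{8})\r\nCMD OK\r\n$", re.DOTALL)


def parse_text_reply(reply: bytes) -> bytes:
    """Validate the CKSM/CMD OK framing and return the reply body."""
    match = _TEXT_REPLY.match(reply)
    if not match:
        raise ValueError("reply malformed or command failed")
    body = match.group("body")
    if sum(body) != int(match.group("checksum"), 16):
        raise ValueError("checksum mismatch")
    return body


# From the $swver? exchange above: the bytes of "2.1.2\r\n" sum to 0x108.
assert parse_text_reply(b"2.1.2\r\nCKSM:00000108\r\nCMD OK\r\n") == b"2.1.2\r\n"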

If I look at the capture of the same actions with a Libre 2 system, the initialization is not quite the same:

[ 05] H>>D 00000000: 

[ 06] H<<D 00000000: 4D 41 47 5A 31 39 32 2D  4A 34 35 35 38 00        MAGZ192-J4558. 

[ 14] H>>D 00000000: 11                                                . 

[ 33] H<<D 00000000: 16 B1 79 F0 A1 D8 9C 6D  69 71 D9 1A C0 1A BC 7E  ..y....miq.....~ 

[ 14] H>>D 00000000: 17 6C C8 40 58 5B 3E 08  A5 40 7A C0 FE 35 91 66  .l.@X[>..@z..5.f
[ 14] H>>D 00000010: 2E 01 37 88 37 F5 94 71  79 BB                    ..7.7..qy. 

[ 33] H<<D 00000000: 18 C5 F6 DF 51 18 AB 93  9C 39 89 AC 01 DF 32 F0  ....Q....9....2.
[ 33] H<<D 00000010: 63 A8 80 99 54 4A 52 E8  96 3B 1B 44 E4 2A 6C 61  c...TJR..;.D.*la
[ 33] H<<D 00000020: 00 20                                             .  

[ 04] H>>D 00000000: 

[ 0d] H>>D 00000000: 00 00 00 02                                       .... 

[ 34] H<<D 00000000: 16                                                . 

[ 05] H>>D 00000000: 

[ 15] H>>D 00000000: 

[ 06] H<<D 00000000: 4D 41 47 5A 31 39 32 2D  4A 34 35 35 38 00        MAGZ192-J4558. 

[ 35] H<<D 00000000: 31 2E 30 2E 31 32 00                              1.0.12. 

[ 01] H>>D 00000000: 

[ 71] H<<D 00000000: 01                                                . 

[x21] H>>D 00000000: 66 C2 59 40 42 A5 09 07  28 45 34 F2 FB 2E EC B2  f.Y@B...(E4.....
[x21] H>>D 00000010: A0 BB 61 8D E9 EE 41 3E  FC 24 AD 61 FB F6 63 34  ..a...A>.$.a..c4
[x21] H>>D 00000020: 7B 7C 15 DB 93 EA 68 9F  9A A4 1E 2E 0E DE 8E A1  {|....h.........
[x21] H>>D 00000030: D6 A2 EA 53 45 2F A8 00  00 00 00 17 CF 84 64     ...SE/........d 

[x60] H<<D 00000000: 7D C1 67 28 0E 31 48 08  2C 99 88 04 DD E1 75 77  }.g(.1H.,.....uw
[x60] H<<D 00000010: 34 5A 88 CA 1F 6D 98 FD  79 42 D3 F2 4A FB C4 E8  4Z...m..yB..J...
[x60] H<<D 00000020: 75 C0 92 D5 92 CF BF 1D  F1 25 6A 78 7A F7 CE 70  u........%jxz..p
[x60] H<<D 00000030: C2 0F B9 A2 86 68 AA 00  00 00 00 F9 DE 0A AA     .....h......... 

[x0a] H>>D 00000000: 9B CA 7A AF 42 22 C6 F2  8F CA 0E 58 3F 43 9C AB  ..z.B".....X?C..
[x0a] H>>D 00000010: C7 4D 86 DF ED 07 ED F4  0B 99 D8 87 18 B5 8F 76  .M.............v
[x0a] H>>D 00000020: 69 50 4F 6C CE 86 CF E1  6D 9C A1 55 78 E0 AF DE  iPOl....m..Ux...
[x0a] H>>D 00000030: 80 C6 A0 51 38 32 8D 01  00 00 00 62 F3 67 2E     ...Q82.....b.g. 

[ 0c] H<<D 00000000: 01 00 18 00                                       .... 

[x0b] H<<D 00000000: 80 37 B7 71 7F 38 55 56  93 AC 89 65 11 F6 7F E6  .7.q.8UV...e....
[x0b] H<<D 00000010: 31 03 3E 15 48 7A 31 CC  24 AD 02 7A 09 62 FF 9C  1.>.Hz1.$..z.b..
[x0b] H<<D 00000020: D4 94 02 C9 5F FF F2 7B  3B AC F0 F7 99 1A 31 5A  ...._..{;.....1Z
[x0b] H<<D 00000030: 00 B8 7B B7 CD 4D D4 01  00 00 00 E2 D4 F1 13     ..{..M......... 

The 0x14/0x33 command/response pair is new — and it is clearly used to set up the encryption. Indeed, trying to send a text command without having issued these commands has the reader respond with a 0x33 reply that I interpret as a “missing encryption” error.

But you can also see that there’s a very similar structure to the commands: after the initialization, there’s an (encrypted) text command (0x21) and response (0x60), then there’s an encrypted binary command, and more encrypted binary responses. Funnily enough, that 0x0c response is not encrypted, and the sequence of responses of the same type is very similar between the Libre 1 and Libre 2 captures as well.

The similarities don’t stop here. Let’s look at the end of the capture:

[x0b] H<<D 00000000: A3 F6 2E 9D 4E 13 68 EB  7E 37 72 97 6C F9 7B D6  ....N.h.~7r.l.{.
[x0b] H<<D 00000010: 1F 7B FB 6A 15 A8 F9 5F  BD EC 87 BC CF 5E 16 96  .{.j..._.....^..
[x0b] H<<D 00000020: EB E7 D8 EC EF B5 00 D0  18 69 D5 48 B1 D0 06 A6  .........i.H....
[x0b] H<<D 00000030: 30 1E BB 9B 04 AC 93 DE  00 00 00 B6 A2 4D 23     0............M# 

[x21] H>>D 00000000: CB A5 D7 4A 6C 3A 44 AC  D7 14 47 16 15 40 15 12  ...Jl:D...G..@..
[x21] H>>D 00000010: 8B 7C AF 15 F1 28 D1 BE  5F 38 5A 4E ED 86 7D 20  .|...(.._8ZN..} 
[x21] H>>D 00000020: 1C BA 14 6F C9 05 BD 56  63 FB 3B 2C EC 9E 3B 03  ...o...Vc.;,..;.
[x21] H>>D 00000030: 50 B1 B4 D0 F6 02 92 14  00 00 00 CF FA C2 74     P.............t 

[ 0d] H>>D 00000000: DE 13 00 00                                       .... 

[x60] H<<D 00000000: CE 96 6D CD 86 27 B4 AC  D9 46 88 90 C0 E7 DB 4A  ..m..'...F.....J
[x60] H<<D 00000010: 8D CC 8E AA 5F 1B B6 11  4E A0 2B 08 C0 01 D5 D3  ...._...N.+.....
[x60] H<<D 00000020: 7A E9 8B C2 46 4C 42 B8  0C D7 52 FA E0 8F 58 32  z...FLB...R...X2
[x60] H<<D 00000030: DE 6C 71 3F BE 4E 9A DF  00 00 00 7E 38 C6 DB     .lq?.N.....~8.. 

[x60] H<<D 00000000: 11 06 1C D2 5A AC 1D 7E  E3 4C 68 B2 83 73 DF 47  ....Z..~.Lh..s.G
[x60] H<<D 00000010: 86 05 4E 81 99 EC 29 EA  D8 79 BA 26 1B 13 98 D8  ..N...)..y.&....
[x60] H<<D 00000020: 2D FA 49 4A DF DD F9 5E  2D 47 29 AB AE 0D 52 77  -.IJ...^-G)...Rw
[x60] H<<D 00000030: 2E EB 42 EC 7E CF BB E0  00 00 00 FE D4 DC 7E     ..B.~.........~ 

… Yeah many more encrypted messages …

[x60] H<<D 00000000: 53 FE E5 56 01 BB C2 A7  67 3E A6 AB DB 8E B7 13  S..V....g>......
[x60] H<<D 00000010: 6D F7 80 5C 06 23 09 3E  49 B4 A7 8B D3 61 92 C9  m..\.#.>I....a..
[x60] H<<D 00000020: 72 1D 5A 04 AE E3 3E 05  2E 1B C7 7C 42 2D F8 42  r.Z...>....|B-.B
[x60] H<<D 00000030: 37 88 7E 16 D9 34 8B E9  00 00 00 11 EE 42 05     7.~..4.......B. 

[x21] H>>D 00000000: 01 84 3F 02 36 1E A6 82  E2 C5 BF C2 40 78 B9 CD  ..?.6.......@x..
[x21] H>>D 00000010: E9 55 17 BE E9 16 8A 52  D2 D9 85 69 E4 D5 96 7A  .U.....R...i...z
[x21] H>>D 00000020: 55 6D DF 2E AF 96 36 53  64 C5 C7 D1 B6 6F 1A 1A  Um....6Sd....o..
[x21] H>>D 00000030: 4F 2F 25 FF 58 F4 EE 15  00 00 00 F6 9A 52 64     O/%.X........Rd 

[x60] H<<D 00000000: 19 F4 D4 F0 66 11 E3 CE  47 DE 82 87 22 48 3C 8D  ....f...G..."H<.
[x60] H<<D 00000010: BA 2D C0 37 12 25 CD AB  3A 58 C2 C4 01 88 60 21  .-.7.%..:X....`!
[x60] H<<D 00000020: 15 1E D1 EE F2 90 36 CA  B0 93 92 34 60 F5 89 E0  ......6....4`...
[x60] H<<D 00000030: 64 3C 20 39 BF 4C 98 EA  00 00 00 A1 CE C5 61     d< 9.L........a 

[x21] H>>D 00000000: D5 89 18 22 97 34 CB 6E  76 C5 5A 23 48 F4 5E C6  ...".4.nv.Z#H.^.
[x21] H>>D 00000010: 0E 11 0E C9 51 BD 40 D7  81 4A DF 8A 0B EF 28 82  ....Q.@..J....(.
[x21] H>>D 00000020: 1F 14 47 BC B8 B8 FA 44  59 7A 86 14 14 4B D7 0F  ..G....DYz...K..
[x21] H>>D 00000030: 37 48 CC 1F C5 A2 9E 16  00 00 00 00 A3 EE 69     7H............i 

[x60] H<<D 00000000: 62 33 4B 90 3B 68 3A D1  01 B1 15 4C 48 A1 6E 20  b3K.;h:....LH.n 
[x60] H<<D 00000010: 12 6F BC D5 50 33 9E C3  CC 35 4E C8 46 81 3E 6B  .o..P3...5N.F.>k
[x60] H<<D 00000020: 96 17 DF D5 8C 22 5C 3A  B7 52 C2 D9 37 71 B7 E2  ....."\:.R..7q..
[x60] H<<D 00000030: 5F C4 88 81 2A 91 65 EB  00 00 00 69 E2 A8 DE     _...*.e....i... 

These are once again text commands. In particular, one of them gets a response long enough to span multiple encrypted text responses (0x60). Given that the $patch? command on the Libre 1 suggested it’s a multi-record command, it might be that the Libre 2 actually has a long list of patches.

So my best guess is that, aside from the encryption, the Libre 2 and Libre 1 systems are actually pretty much the same. I expect that the only thing standing between us and being able to download the data out of a Libre 2 system is figuring out the encryption scheme, and whether we need to extract keys to do so. If keys do need to be extracted, that is something to proceed carefully with, because it’s probably going to be the angle Abbott uses to enforce their DMCA requests.

What do we know about the encryption setup, then? Well, I had a theory, but it got completely trashed. I still have some guesses that, for now, appear solid.

First of all, the new 0x14/0x33 command/reply: it is called multiple times by the software, and the reader uses the 0x33 response to tell you either that the encryption is missing or that it is wrong, so I’m assuming these commands are encryption related. And since there’s more than one meaning for these commands, it looks like the first byte of each selects a “sub-command”.

The 0x14,0x11 command appears to be the starting point for the encryption; the device responds with what appear to be 15 random bytes. Not 16! The first byte again appears to be a “typing” specification and is always 0x16, so you could say that the 0x14,0x11 command gets a 0x33,0x16 response. In the first three captures I got from the software, the device actually sent exactly the same bytes; then it started giving a different response each time I called it. So I guess it might be some kind of random value that needs entropy to be regenerated. Maybe it’s a nonce for the encryption?

The software then sends a 0x14,0x17 command, which at first seemed to have a number of constant bytes in fixed positions, but now I’m not so sure; I need to compare more captures to confirm whether that’s the case. Because of the length, there are at most 25 bytes sent to the device.

The 0x33,0x18 response comes back, and it includes 31 bytes, but the last two appear to be constant.

Also, if I compare the three captures, two of which received the same 0x33,0x16 response and one of which didn’t, there are many identical bytes between the two with the same response (but not all of them!), and very few shared with the third one. So it sounds like this is either a challenge-response that uses the provided nonces, or it actually uses that value for the key derivation.
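The comparison itself doesn’t need anything clever; this throwaway sketch is roughly what I mean, taking the raw 0x14/0x33 payloads already extracted from two captures (the extraction is assumed to have happened elsewhere, for example with the usbmon-tools based extractor mentioned above):

from typing import Iterable, List


def matching_positions(first: bytes, second: bytes) -> List[int]:
    """Return the byte positions at which the two payloads agree."""
    return [
        index
        for index, (byte_a, byte_b) in enumerate(zip(first, second))
        if byte_a == byte_b
    ]


def compare_handshakes(capture_a: Iterable[bytes], capture_b: Iterable[bytes]):
    """Print, message by message, how many bytes two handshakes share."""
    for message_a, message_b in zip(capture_a, capture_b):
        shared = matching_positions(message_a, message_b)
        print(f"sub-command 0x{message_a[0]:02x}: "
              f"{len(shared)} shared bytes at positions {shared}")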

If you’re interested in trying to figure out the possible encryption behind this, the three captures are available on GitHub. And if you find anything else that you want to share with the rest of the people looking at this, please let us know.

Unnecessary, but required

In the past year, I’ve had to learn quite a few different lessons, some harder than others, some more gratifying than others. One of the main (but far from the only) sources of these lessons was learning to live with someone else — save for my mother, and a few months with Luca, I had never really shared an apartment, a flat, or a house with someone else for more than a few days. But now that I’m happily married, there’s no going back to solitude. And it’s a feeling I’m really happy about, despite the occasional challenges it has brought to both of us.

One of the differences we realised early on is that we have different tolerances for chaos and trinkets. I’m not particularly organised when it comes to sorting out my stuff, but I’m also not a total slob — still, I don’t mind having items spread across three rooms, and I was never particularly well known for ironed t-shirts. My wife’s much less… chaotic, but at the same time she has fairly short patience for technology for the sake of technology.

This pretty much makes a dent in the number of random gadgets I end up buying just to try them out, because they might end up not being used, or even not being welcome if they somehow get in the way. I think my most impressive achievement has been making her accept that we have an electric cheese grater. I’m still trying to convince her it’s a good idea for me to disassemble the battery charger to replace the current plug-in adapter with a micro-USB port. Which is honestly not necessary at all: the plug is an AC-DC adapter, a europlug with one of those europlug-to-British screw-in adapters, which means that if we decide to leave London for the Continent, we won’t need to replace it — it would only become an issue if we moved to a different part of the world, and we can address it then.

But at the same time, this is the type of modification that in my eyes is… well, required. Why would I not turn my electric cheese grater into a USB-powered electric cheese grater?

This reminded me of what Adam Savage (of Mythbusters fame) says in his book Every Tool’s A Hammer (which, incidentally, is an awesome read that I would recommend to everyone with even a passing interest in creating stuff):

I often describe myself as a serial skill collector. I’ve had so many different jobs over my lifetime […] that my virtual tool chest is overflowing. Still I love learning new ways of thinking and organizing, new techniques, new ways of solving old problems. […] The skills I have, all of them, are simply arrows in my mental quiver, tools in my problem-solving tool chest, to achieve that thing. […] And I learned each of them specifically for that reason. […] Eventually, […] I came to realise this was the ONLY way I could successfully learn a skill—by doing something with it, by applying it in my real world.

Adam Savage, Every Tool’s A Hammer

This is pretty much my life. I have pretty clearly failed at learning things “academically”, lasting only a few weeks at the University of Venice, and instead built up my knowledge by working on different projects, both opensource and for customers, and by trying things out for myself. This has been a blessing and a curse at the same time: while it means I have been collecting a bunch of skills, just like Adam says above, for the most part they are superficial skills: I’ve only rarely had to deep-dive into a technology or a problem in my dayjob, and the amount of time I have to spend on side projects has been fairly low, and shrinking.

Long gone are the days when I could sit down to write a stupid IRC bot in Qt just because I could, and not just for the lack of time. It’s also because, for the most part, I keep telling myself it’s a bad idea to work on something low level when someone else already did it better than I possibly could — which is likely true, but it fails to meet my requirement of adding the skill to my repertoire. And that is by itself a career-limiting move, comparable to the bubble problem.

With these issues in mind, I’m definitely glad my wife is understanding of why I sometimes spend money, time, effort (or most likely, all three) just to get something done because I want to, and not because there’s much need for it. It’s unnecessary, but required for me to keep up to scratch. And being able to do that, without upsetting my partner despite the chaos it creates, is a significant privilege.

Another privilege is being able to afford the time, space, and money for all these projects. I think this is, for the most part, something that is not quite clear out there yet: being able to contribute to opensource, to write up tips and tricks, and to document how to do things are all privileges. And I think it’s important to share this privilege, even in the form of tips, tricks, videos, and blogs — which is why this blog still exists, and why, even with ever-shortening spare time, I try to write updates.

Whether it is Bigclive on YouTube, with his sometimes off-colour comments that make me uncomfortable, or Adam Savage’s own Tested, which can rely on a real, professional shop, or Micah’s most awesome electronics reverse engineering channel, or Foone’s Twitter feed, I am very glad for those who do their best to share knowledge — and I don’t really need to know why they are doing it. Even when it doesn’t really help me directly (because I can’t learn something if I don’t try it myself), I know it can help someone else. Or inspire someone else (or, in some cases, me) to go and try something that will make them learn more.

Abbott, the Libre 2, and the takedown

A few people today messaged and mentioned me on Twitter regarding the news that Abbott has requested the takedown of something related to their Libre 2. I gave a quick hot take on this on Twitter, but I guess it’s worth having something in long form to reference, since I’m sure this will be talked about a lot more, not least because of the ominous permalink chosen by Boing Boing (“they-literally-own-you”) and the fact that, game-of-telephone style, the news went from the original takedown to Reddit phrasing it as “Abbott asserts copyright on your data”, which is both silly and untrue.

So let’s start with a bit of background that most of the re-posters of this story probably don’t know much about. The Libre 2 is an upgrade to the FreeStyle Libre system that I wrote a lot about and that I use daily. It comes with both a reader device and support in the LibreLink app for Android and (on more recent iPhones) iOS. The main difference from the Libre system is that the sensors provide both NFC and BLE capabilities, so they can proactively notify of high or low blood sugar conditions, which the old NFC-only sensors cannot do, making the system more similar to CGM solutions like Dexcom‘s.

In both the Libre and Libre 2 systems, the sensors don’t report blood sugar values the way most classic glucometers do. Instead they report a number of “raw” values, including readings from a number of temperature sensors. There’s a great explanation of these from Pierre Vandevenne, here and here. To get a real blood sugar measurement, you need to apply an algorithm that Abbott keeps refining. The algorithm is what I usually refer to as the “secret sauce”, and it’s implemented in both the reader’s firmware and the LibreLink app itself.

Above I used the word “something” to refer to what was taken down. The reason I say that is that Boing Boing’s title straight up calls this a “tool” — but when you read the linked post from the affected person, it is described as “details of how to patch the LibreLink app”. Since I have not seen what the repository was before it was taken down, I have no idea which one to believe exactly. In either case, it looks like Abbott does not like people effectively leveraging their “secret sauce” in a different application; in particular, it does not look like we’re talking about something like glucometerutils, which implements the protocol “clean”, without deriving from the original software.

Indeed, Boing Boing seems to make the case that this is the equivalent of implementing a file format: «[…] just because Apple’s Pages can read Word docs, it doesn’t mean that Pages is a derivative of MS Office.» Except that it’s not as clear cut. If you implemented support for a format by copying the implementation code into your software, that would quite obviously make it a derivative work. In this case, if I am to believe the original report instead, the taken-down content was instructions to modify Abbott’s app — not a redistribution of it. Since I’m not a lawyer, I have no idea where that stands, but it’s clearly not as black-and-white as Boing Boing makes it appear.

As I said on Twitter, this does not affect either of my projects, since neither relies on the original software; they are instead descriptions of the protocols. They also don’t include any information on, or support for, the Libre 2, since the protocol appears to have changed. There’s an open issue with discussion, but it also appears that this time Abbott is using some encryption on the protocol. And that might be an interesting problem, as someone might have to get up close and personal with the code to figure that part out — but if that’s the case, we’re back to needing a clean-room design to implement it.

I also want to quote Pierre explicitly from the posts I linked above:

[…] in the Libre FRAM, what we are seeing is a real “raw” signal. While the measure of the glucose signal itself is fairly reliable, it is heavily post-processed by the Libre firmware. Specifically – and in no particular order – temperature compensation, delay compensation, de-noising… all play a role. That understanding and, to some extent, my MD training, led me to extreme caution and prevented me from releasing my “solution”, which I knew to be both incomplete and unable to handle some error conditions.

The main driver behind my decision was the well known “first do no harm” (primum non nocere) motto, an essential part of the Hippocratic Oath which I symbolically took. I still stick by it today. […]

[…]

Today, there are a lot of add-on devices that aim to transform the Libre into a full CGM. To be honest, in general, I do not like either the results they provide or their (in)convenience. None of those I have tried delivered results that would lead to an approval by a regulatory agency, none of them were stable for long periods of time. But, apparently, patients still feel they are helpful and there is now a thriving community that aims at improving them.

Pierre Vandevenne

While I have not sworn a Hippocratic Oath myself, I have similar concerns to Pierre: I have explicitly avoided documenting the sensors’ protocol, and I won’t be merging code that tries to read them directly, even if it is contributed.

And when it comes to copyright issues, I do weigh them fairly heavily: respecting licenses is the fundamental mechanism by which Free Software works at all. So I would rather someone provide me with a description of Abbott’s encryption protocol than with an implementation of it, where I might have to worry about a “poisonous tree.”

Working in a bubble, contributing outside of it

The holiday season is usually a great time for personal projects, particularly for people like me who don’t go back “home” to “the family” — quotes needed, since for me home is where I am (London) and where my family is (me and my wife.) Work tends to be more relaxed – even with the added pressure of completing the OKRs for the quarter, and defining those for the next – and given that there is no public transport running, the time saved on commuting also adds up, making this an ideal time to work on hobbies.

Unfortunately, this year I’m feeling pretty useless on this front, and I thought this feeling of uselessness is at least something I can write about for the dozen or so remaining readers of this blog, in an era of social media and YouTube videos. If this sounds very dismissive, it’s probably because that is the feeling of irrelevancy that took over me, and something that I should probably aim to overcome in 2020, one way or another.

If you are reading this post, it’s likely that you noticed my FLOSS contributions waning and pretty much disappearing over the past few years, except for my work around glucometerutils and the usbmon-tools package (which kind of derives from it.) I have contributed the odd patch to the Linux kernel, and more recently to some of the Python typing tooling, but those were really drive-by contributions made as I found time for them.

Given some of the more recent Twitter threads on Google’s policies around open source contributions, you may wonder if it is related to that, and the answer is “not really”. Early on, I was granted IARC approval to keep working on unpaper (which turned out to be possibly overkill), on the aforementioned glucometerutils, and on some code I wrote while reverse engineering my gaming mouse. More recently, I’ve leveraged the simplified patching policy, and was granted approval to release both usbmon-tools and tanuga (although the latter is only released as a skeleton right now.)

So I have all the options, and all the opportunities, to contribute to FLOSS projects while employed by a big multinational Internet company. Why don’t I do that more, then? I think the answer is that I work in a bubble for most of the day, and when I try to contribute something in my spare time, I find myself missing the support structure that the bubble gives me.

I want to make clear here that I’m not saying that everything is better in the bubble. Just that the bubble is soft and warm enough to make the world outside of it feel scary, sometimes annoying, but definitely more vast. And despite a number of sensible tools being available out there (and in many cases, better tools), it takes a significant investment to research the right way to do something, to the point that I suffer from CBA syndrome.

The basic concepts are not generally new: people have talked openly at conferences about the monorepo, my friend Dinah McNutt has spoken and written at length about Rapid, the release system we use internally and that drives the automated releases, and so on. If you’re even more interested in the topic, this March the book Software Engineering at Google will be released by O’Reilly. I have not read it myself, but I have interacted on and off with two of its curators and I’m sure it’s going to be worth its weight in gold.

Some of the tools are also being released, even if sometimes in modified forms. But even when they are, the amount of integration you get internally is lost when trying to use them outside. I have considered using Bazel for glucometerutils in the past — but in addition to being a fairly heavy dependency, there’s no easy way to reference most of the libraries that glucometerutils needs. At the end of the day, it was not worth using, even though it would have made my life easier by reducing the cognitive load of working on opensource projects in my personal time.

Possibly the main “support beam” of the bubble, though, is the opinionated platform, which can be seen from the outside in the form of the style guides, but which extends much further. To keep the examples related to glucometerutils: while its tests do use absl‘s parameterized class (see the sketch below), they are written in a completely different style than I would use at work, and they feel wrong when it comes to importing the local copy of the module under test. When I looked around to figure out what the best practice for writing tests in Python is, I found literally dozens of blog posts, StackOverflow answers, and testing framework docs that all gave slightly different answers. In the bubble you have (pretty much) one way to write a basic test — and while people can be creative even within those guidelines, creativity is usually frowned upon.
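For the curious, the absl style I’m referring to looks roughly like this; the function under test is a made-up stand-in, not actual glucometerutils code:

from absl.testing import absltest
from absl.testing import parameterized


def mgdl_to_mmoll(value: float) -> float:
    """Toy glucose unit conversion standing in for real library code."""
    return value / 18.0


class ConversionTest(parameterized.TestCase):

    @parameterized.parameters(
        (36, 2.0),
        (90, 5.0),
        (180, 10.0),
    )
    def test_mgdl_to_mmoll(self, mgdl, expected_mmoll):
        self.assertAlmostEqual(mgdl_to_mmoll(mgdl), expected_mmoll)


if __name__ == "__main__":
    absltest.main()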

The same is true for release engineering. As I noted and linked above, in the bubble all of the release grunt work is done by the Rapid tool — and for the most part it’s automated. While there’s definitely more than one way to configure the tool, at least you know which tool to use. And while different teams often have differing opinions on those configurations, you can at least find the opinion of your team, or of the closest team to you with an Opinion (with a capital O), and follow that — it might not be perfect for your use, but if it’s allowed it usually means it was reviewed and vouched for (or copy-pasted from something else that was.)

An inside joke in the Google bubble is that the documentation is always out of date and never to be trusted. Besides the unfairness of the joke to the great tech writers I have had the pleasure to work with, who are more than happy to make sure the documentation is not out of date (but need to know that’s the case, and most of them don’t find out until it’s too late), the truth is that at least we do have documentation for most processes and tools. The outside world has tons of documentation too, but some of it is out of date, and it’s very hard to tell which parts are still correct and valid.

Trying to figure out how to configure a CI/CD tool for a Python project on GitHub (or worse, trying to figure out how to make it release valid packages on PyPI!) still feels like going by early-2000s HOWTOs, where you hope that the three-year-old description of the XFree86 configuration file still matches the implementation (hint: it never did.) Lots of the tools are not easy to integrate, and opting into them takes energy (and sometimes money) — the end result being that, despite my releasing usbmon-tools nearly a year ago, you still need an unreleased dependency, as the fix I needed for it is not present in any released version, and I haven’t dared bother the author to ask for a new release yet.

It’s very possible that if I were not working in a bubble, all of these issues wouldn’t be big unknowns — probably, if I spent a couple of weeks reviewing the various options for CI/CD, I could come up with a good answer for setting up automated releases, and then I could go to the dependency’s author and say “Hey, can I set this up for you?” and that would solve my problem. But that is time I don’t really have when we’re talking about hobby projects. So I end up opening the editor in the Git repository I want to work on, adding a dozen lines or so of code towards something I want to do, figuring out that I’m missing the tool, library, interface, opinion, document, or procedure that I need, feeling drained, and closing the editor without having committed – let alone pushed – anything.