Investigating Chinese Acrylic Lamps

A couple of months ago I built an insulin reminder light, roughly hacking around what I would call an acrylic lamp. The name is a reference to the transparent acrylic (or is it polycarbonate?) shape that you fit on top, and that lights up from the LEDs it rests on. They are not a new thing at all, and even Techmoan looked at them three years ago. The relatively simple board inside looked fairly easy to hack around, and I thought it would make a good hack project to look more into them.

They are also not particularly expensive. You can go on AliExpress and get them for a few bucks each with so many different shape designs. There are different “bases” to choose from, too: the one I hacked the Pikachu on was a fairly simple design with a translucent base, and no remote control, although the board clearly showed space for a TSOP-style infrared decoder. So I ended up ordering four different variants, all of them without remotes because that part I didn’t particularly care for: one with a translucent base, one with a black base and no special features, one with two-colour shapes and LEDs, and one with self-changing LEDs and mains power only.

While waiting for those to turn up, I also found a decent deal on Amazon on four bases without the acrylic shapes on them for about £6 each. I took a punt and ordered them, which turned out to be a better deal than expected.

These bases appear to use the same board design, and the same remote control (although they shipped four remotes, too!), and you can see an image of it on the right. This is pretty much the same logic on the board as the one I hacked for my insulin reminder, although it has slightly different LEDs, which are not common anode in the package, but are still wired in a common-anode configuration.

For both the boards, the schema above is as good a reversing as I managed on my own. I did it on the white board, so there might be some differences in the older green one, particularly in the number of capacitors, but all of that is not important for what I’m getting to right now. I shortened the array to just four LEDs to show, but this goes on for all of the others too. The chip is definitely not a Microchip one, but it fits the pinout, so I kept that one, similarly to what I did for the fake candle. Although in this case there’s no crystal on the board, which suggests this is a different chip.

I kind of expected that all the remaining boards would be variations on the same idea, except for the multi-colour one, but I was surprised to find that only two of them shared the same board design (and even those took different approaches to connecting the IR decoder. Oh yeah, I didn’t select any of the remote-controlled lamps, but two of them came with IR decoders anyway!)

The first difference is due to the base itself: there are at least two types of board, depending on where the opening for the microUSB port sits in relation to the LEDs: either D-shaped (connector inline with the LEDs) or T-shaped (connector perpendicular to the LEDs). Another difference is in the placement of the IR decoder: on most of the bases, it’s at 90° from the plug, but in at least one of them it’s directly opposite.

Speaking of bases, the one that was the most different was the two-colour base: it’s quite a bit smaller, and round with a smooth finish, and the board was properly D-shaped and… different. While the LEDs were still common-anode and appeared wired together, each appears to have its own paired resistor (or two!), and the board itself is double-sided! That was a surprise! It also changes the basic design quite a bit more than I expected, including only having one Zener, and powering the microcontroller directly from 4.5V instead of using a 3V regulator.

It also lacks the transistor configuration that you’d find on the other models, which shouldn’t surprise, given how it needs to drive more than the usual three channels. Which actually had me wonder: how does it drive two sets of RGB LEDs with an 8-pin microcontroller? Theoretically, if you don’t have any inputs at all, you could do it: VDD and VSS take two pins, each set of LEDs take three pins for the three colour channels. But this board is designed to take an IR decoder for a remote control, which is an input, and it comes with a “button” (or rather, a piece of metal you can ground with your finger), which is another input. That means you only have four lines you can toggle!

At first I thought that the answer was to be found in the other six-pin chip on the left, but it turns out that’s not the case. That one is marked 8223LC, which appears to correspond to a “touch controller”, the Shouding SD8223L, and is related to the metal circlet that all of these bases use as input.

Instead, the answer became apparent when using the multimeter in continuity mode: since it provides a tiny bit of current, you can turn on an LED by placing the probes on its anode and cathode. Since the RGB cathodes of each LED package are already marked on the board, that’s not difficult to do, and while doing that I found their trick: the Blue cathodes are common to all 10 LEDs, rather than separate for the outer and inner groups, and more interestingly the Green cathodes are shorted to the anodes for the inner four LEDs. That means that only the outer LEDs have the full spectrum of colours available, and the only colour combination that makes the two groups independent is Green/Red.
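To make the trick a bit more concrete, here is a minimal sketch that enumerates what each group of LEDs can show. It assumes four drive lines (shared blue, outer green, outer red, inner red); the split of the red channel is my reading of the measurements rather than something marked on the board, and the line names are made up for illustration.

```python
from itertools import product

# Hypothetical names for the four lines the microcontroller can toggle.
LINES = ("blue_all", "green_outer", "red_outer", "red_inner")


def group_colours(blue_all, green_outer, red_outer, red_inner):
    """Return the (outer, inner) colour mix for a given state of the lines."""
    outer = "".join(
        name
        for name, on in (("R", red_outer), ("G", green_outer), ("B", blue_all))
        if on
    )
    # The inner LEDs have no usable green, and share blue with the outer ones.
    inner = "".join(name for name, on in (("R", red_inner), ("B", blue_all)) if on)
    return outer or "off", inner or "off"


if __name__ == "__main__":
    for state in product((False, True), repeat=len(LINES)):
        print(dict(zip(LINES, state)), "->", group_colours(*state))
```

Under these assumptions, the only state in which the two groups show two different single colours is green on the outside and red on the inside, which matches what I saw on the board.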

So why am I this interested in these particular lamps? Well, they seem to be a pretty decent candidate to do some “labor of love” hack – as bigclive would call it – with making them “Internet of Things” enabled: there’s enough space to fit an ESP32 inside, and with the right stuff you should be able to create a lamp that is ESPHome compatible — or run MicroPython on it, either to reimplement the insulin reminder logic, or something else entirely!

A size test print of my custom designed PCB.

Indeed, after taking a few measurements, I decided to try my hand at designing a replacement board that fits most of the bases I have: a D-shaped board, with the inline microUSB, has just enough space to put an ESP32 module on it, while keeping the components on the same side of the board like in the original models. And while the ESP32 would have enough output lines to control at least the two groups of LEDs without cheating, it wouldn’t have enough to address normal RGB LEDs individually… but that doesn’t need to stop a labor of love hack (or an art project): Adafruit NeoPixels are pretty much the same shape and size, and while they are a bit more expensive than regular RGB LEDs they can easily be addressed individually.
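As a rough idea of what individually addressed LEDs would look like from MicroPython, here is a small sketch. The frame calculation is plain Python; the hardware part uses the machine and neopixel modules that ship with the ESP32 port of MicroPython. The GPIO number and the helper names are placeholders, since the replacement board does not exist yet.

```python
try:
    # Only available when running under MicroPython on the board itself.
    import machine
    import neopixel
except ImportError:
    machine = None
    neopixel = None


def chase_frame(count, position, colour, background=(0, 0, 0)):
    """Return one (r, g, b) tuple per LED, with a single LED lit."""
    lit = position % count
    return [colour if index == lit else background for index in range(count)]


def show(frame, pin_number=5):
    """Push a frame to the LEDs. pin_number is a placeholder GPIO."""
    if neopixel is None:
        raise RuntimeError("this part needs MicroPython with the neopixel module")
    strip = neopixel.NeoPixel(machine.Pin(pin_number), len(frame))
    for index, colour in enumerate(frame):
        strip[index] = colour
    strip.write()


if __name__ == "__main__":
    # On a desktop Python this only prints the frame.
    print(chase_frame(10, 3, (255, 0, 0)))
```

Keeping the frame calculation separate from the hardware access means the animation logic can be tried out on a normal computer before flashing anything.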

Once I have working designs and code, I’ll be sharing, particularly in the hopes that others can improve on them. I have zero designing skills when it comes to graphics or 3D designing, but if I could, I would probably get to design my own base as well as the board: with the exception of the translucent ones, the bases are otherwise some very bland black cylinders, and they waste most of the space to allow 3×AAA batteries (which I don’t think would last for any amount of time). Instead, a 3D printed base, with hooks to hold it onto a wall (or a door) and a microUSB-charged rechargeable battery, would be a lovely replacement for the original ones. And if we have open design for the board, there’s pretty much no need to order and hope for a compatible base to arrive.

FreeStyle Libre 2 More Encryption Notes

Foreword: I know that I said I wouldn’t put reverse engineering projects as part of the Monday schedule, but I find myself with an imbalance between the two sets of posts, and I wanted to get this out sooner rather than later, in the hope someone else can make progress.

You may remember I have been working on the FreeStyle Libre 2 encrypted communication protocol for a few months. I have actually taken a break from my Ghidra deep dive while I tried sorting my future out – and failing, thanks to the lockdown – but I got back to this a couple of weeks ago, since my art project was completed, and I wanted to see if sleeping on it for a bit meant getting a clearer view of it.

Unfortunately, I don’t think I’m any closer to figuring out how to speak to Libre 2 readers. I did manage to find some more information about the protocol, including renaming one of the commands to match the debug logs in the application. I do have some more information about the encoding though, which I thought I would share with the world, hoping it will help the next person trying to get more details on this — and hoping that they would share it with the world as well.

While I don’t have a final answer on what encryption they use on the Libre 2, I do have at least some visualization of what’s going on in the exchange sequence.

There’s 15 bytes sent from the Libre 2 reader to the software. The first eight are the challenge, while the other seven look like a nonce of some kind, possibly an initialization vector, which is used in the encryption phase only.

To build the challenge response, another eight bytes are filled with random data returned by CryptGenRandom, which is a fairly low-level, and deprecated, API. This is curious, given that the software itself is using Qt for the UI, but makes more sense when you realise that they use the exact same code in the driver used for uploading to the LibreView service, which is not Qt based. It also likely explains why the encryption is not using the QtCryptography framework at all.

This challenge response is then encrypted with a key — there are two sets of keys: Authorization keys are used only for this challenge phase, and Session keys are used to handle the rest of the communication. Each set includes an Encryption and a MAC key. The Authorization keys are both seeded with just the serial number of the device in ASCII form, and two literal strings, as pictured above: AuthrEnc and AuthrMAC. The session keys’ seeds include a pair of 8-bytes values as provided by the device after the authorization completes.

The encryption used is either a stream cipher or a 64-bit block cipher. I know that because I have multiple captures from the same device in which the challenge started with the same 8 bytes (probably because it lacked enough entropy to be properly random at initialization time), and they encrypted to exactly the same output bytes. Since the cleartext adds a random component, if it was a 128-bit block cipher you would expect different ciphertext in the output. Which kind of defeats the purpose of those 8 random bytes, I guess?
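The block size argument can be shown with a toy cipher. The sketch below is emphatically not the cipher Abbott uses (I still don’t know which one that is): it is a throwaway Feistel construction built on SHA-256, used in ECB mode, just to show how a 64-bit block lets the challenge encrypt to the same bytes no matter what random data follows it.

```python
import hashlib


def _round_value(key, half, round_number):
    """Round function of the toy cipher, truncated to the half-block size."""
    digest = hashlib.sha256(key + bytes([round_number]) + half).digest()
    return digest[: len(half)]


def toy_encrypt_block(key, block):
    """Encrypt one block with a 4-round Feistel network (toy, not secure)."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for round_number in range(4):
        mask = _round_value(key, right, round_number)
        left, right = right, bytes(a ^ b for a, b in zip(left, mask))
    return left + right


def toy_ecb(key, data, block_size):
    """Encrypt each block independently, as ECB mode would."""
    if len(data) % block_size:
        raise ValueError("data must be a multiple of the block size")
    return b"".join(
        toy_encrypt_block(key, data[offset : offset + block_size])
        for offset in range(0, len(data), block_size)
    )


if __name__ == "__main__":
    key = b"not the real key"
    challenge = bytes.fromhex("0011223344556677")  # made-up repeated challenge
    first = challenge + b"\x01" * 8  # stand-ins for the random half
    second = challenge + b"\x02" * 8
    for block_size in (8, 16):
        a = toy_ecb(key, first, block_size)
        b = toy_ecb(key, second, block_size)
        print(block_size, "byte blocks, same first 8 bytes:", a[:8] == b[:8])
```

With 8-byte blocks the first half of the ciphertext only depends on the challenge, while with 16-byte blocks the random half gets mixed into the whole output.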

The encrypted challenge response is then embedded in the response message, which includes four constant bytes (they define the message type, the length, and the subcommand, plus an extra constant byte thrown in), and then processed by the MAC algorithm (with the Authorization MAC key) to produce a 64-bit MAC, which is tacked onto the end of the message. Then the whole thing is sent to the device, which will finally start answering.
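Putting the pieces so far together, the authorization exchange could be sketched as follows. The cipher and MAC are passed in as callables because I have not identified either of them yet, os.urandom stands in for CryptGenRandom, and the function names and header value are placeholders of mine, not names or bytes from the software.

```python
import os

CHALLENGE_LEN = 8
NONCE_LEN = 7
MAC_LEN = 8


def split_challenge(payload):
    """Split the 15 bytes sent by the reader into challenge and nonce."""
    if len(payload) != CHALLENGE_LEN + NONCE_LEN:
        raise ValueError("expected 15 bytes from the reader")
    return payload[:CHALLENGE_LEN], payload[CHALLENGE_LEN:]


def build_response(payload, header, encrypt, mac, random_bytes=os.urandom):
    """Assemble the response: header, encrypted challenge response, MAC.

    encrypt(cleartext, nonce) and mac(message) are unknown algorithms,
    supplied by the caller. header is the four constant bytes.
    """
    challenge, nonce = split_challenge(payload)
    cleartext = challenge + random_bytes(CHALLENGE_LEN)
    body = header + encrypt(cleartext, nonce)
    tag = mac(body)
    if len(tag) != MAC_LEN:
        raise ValueError("the MAC is expected to be 64 bits long")
    return body + tag


if __name__ == "__main__":
    # Dummy algorithms, only there to show the shape of the message.
    message = build_response(
        bytes(range(15)),
        header=b"\x00\x00\x00\x00",  # placeholder, not the real constants
        encrypt=lambda cleartext, nonce: cleartext[::-1],
        mac=lambda body: b"\xaa" * MAC_LEN,
    )
    print(len(message), message.hex())
```

The result is 28 bytes long: four header bytes, sixteen bytes of encrypted challenge response, and the eight MAC bytes at the end.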

As far as I can tell, the encryption algorithm is the same for Authorization and Session — with the exception of the different seed to the key generation. It also includes a different way to pass a nonce — the session encryption includes a sequence number, on both the device and the software, which is sent in clear text and fed into the encryption (shifted left by 18 bits, don’t ask me!) In addition to the sequence number, the encrypted packets have an unencrypted MAC. This is 4 bytes, but it’s actually done with the same algorithm as the authorization. The remaining 4 bytes are just dropped on the floor.
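The two session-specific details are simple enough to write down. This is only a sketch: the function names are mine, and I am assuming that the four bytes that survive are the first four of the 64-bit MAC, which I have not confirmed.

```python
FULL_MAC_LEN = 8
SESSION_MAC_LEN = 4


def nonce_input(sequence_number):
    """Value fed into the encryption for a given clear-text sequence number."""
    return sequence_number << 18


def truncate_mac(full_mac):
    """Keep four of the eight MAC bytes (assumed here to be the first four)."""
    if len(full_mac) != FULL_MAC_LEN:
        raise ValueError("expected a 64-bit MAC")
    return full_mac[:SESSION_MAC_LEN]


if __name__ == "__main__":
    print(hex(nonce_input(1)), truncate_mac(bytes(range(8))).hex())
```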

There’s a lot more that I need to figure out in the code, because I don’t know anything about cryptography (and am also not that good with Ghidra). I know that the key generation and the encryption/decryption functions are parameterized with an algorithm value, which likely corresponds to an enum from the library they used. And that the parameterized functions dispatch via 21 objects (but likely not C++ objects, as they don’t seem to use vtables!), which can either point at a common function that returns an error (pretty much “not implemented”) or to an actual, implemented function. The functions check something: the enum in the case of key creation (which is, by the way, always 9), or some attribute of the object passed in for encryption and decryption.

These are clearly coming from a library linked in statically. I can tell because the code style for these is totally different from any other part of Abbott’s code, and makes otherwise no sense. It is also possibly meant to be obfuscated, or at least made difficult to follow: it’s not the same object out of the 21 that answers the encrypt and decrypt functions for a given object, which makes it difficult to find which code is actually being executed.

I think at this point, the thing that is protecting their Libre 2 protocol the most is just the sheer amount of different styles of code in the binary: Qt, C++ with STL, C++ with C-style arrays, Windows APIs, this strange library, …

By the way, one thing that most likely would help with figuring this out would be if we could feed selected command streams into the software. While devices such as the Facedancer can help, given that most of this work is done in virtual machines, I would rather have my old idea implemented. I might look for time to work on this if I can’t find anyone interested, but if you find that this is a useful idea, I would much prefer being involved but not leading its implementation. Honestly, if I had more resources available, I would probably just pay someone to implement it, rather than buy a hardware Facedancer.

FreeStyle Libre 2: Notes From The Deep Dive

As I wrote last week, I’ve started playing with Ghidra to dive into the FreeStyle Libre 2 software, to try and figure out how to speak the encrypted protocol, which is in the way to access the Libre 2 device as we already access the Libre 1.

I’m not an expert when it comes to binary reverse engineering — most of the work I’ve done around reverse engineering has been on protocols that are not otherwise encrypted. But as I said in the previous post, the binary still included a lot of debug logs. In particular, the logs included the name of the class, and the name of the method, which made it fairly easy to track down quite a bit of information on how the software works, as well as the way the protocols work.

I also got lucky to find a second implementation of their software protocol. At least a partial one. You see, there’s two software that can communicate with the old Libre system: the desktop software that appears to be available in Germany, Australia, and a few other countries, and the “driver” for LibreView, a service that allows GPs, consultants, and hospitals to remotely access the blood sugar readings of their patients. (I should write about it later.) While the main app is a single, mostly statically linked Qt graphical app, the “driver” is composed of a number of DLL modules, which makes it much easier to read.

Unfortunately it does not appear to support the Libre 2 and its encryption, but it does help to figure out other details around the rest of the transport protocol, since it’s much better logged, and provides clearer view of the function structure — it seems like the two packages actually come from the same codebase, as a number of classes share the same name between the two implementations.

The interesting part is trying to figure out what the various codenames mean. I found the names Orpheus and Apollo in the desktop app, and I assumed the former was the Libre and the latter the Libre 2, because the encryption is implemented only on the Apollo branch of the hierarchy, in particular in a class called ApolloCryptoLib. But then again, in the “driver” I found the codenames Apollo and Athena — and since the software says it supports the “Libre Pro” (which as far as I know is the US-only version that was released a few years ago), I’m wholly confused on what’s what now.

But as I said, the software does have parallel C++ class hierarchies, implementing lower-level and higher-level access controls for the two codenames. And because the logs include the class name it looks like most functions are instantiated twice (which is why I found it easier to figure out the flow for the non-crypto part from the “driver” module.) A lot of the work I’m doing appears to be manual vtable decoding, since there’s a lot of virtual methods all around.

What also became very apparent is that my hunch was right: the Libre 2 system uses basically the same higher level protocol as the Libre 1. Indeed, I can confirm not only that the text commands sent are the same (and the expected responses are the same, as well), but also that the binary protocol is parsed in the same way. So the only obstacle between glucometerutils and the Libre 2 is the encryption. Indeed, it seems like all three devices use the same protocol, which is either called Shazam, AAP or ATP — it’s not quite clear given the different set of naming conventions in the code, but it’s still pretty obvious that they share the same protocol, not just the HID transport, but also for defining higher level commands.

Now about the encryption, what I found from looking at the software is that there are two sets of keys that are used. The first is used in the “authentication” phase, which is effectively a challenge-response between the device and the software, based on the serial number of the device, and the other is used in the encrypted communication. This was fairly easy to spot, because one of the classes in the code is named ApolloCryptoLib, and it included functions with names like Encrypt, Decrypt, and GenerateKeys.

Also, one important note: the patch (sensor) serial number is not used for the encryption of the reader’s access. This is something that comes up time and time again. Indeed at least a few people have been telling me on Twitter that the Libre 2 sensors (or patches, as Abbott calls them) are also encrypted and that clearly they use the same protocol as the reader. But that’s not the case at all. Indeed, the same encryption happens when no patch was ever initialized, and the information on the patches is fetched from the reader as the last part of the initialization.

Another important piece of information that I found in the code is that the encryption uses separate keys for encryption and MAC. This means that there’s an actual encryption transport layer, similar to TLS, but not similar enough to worry me so much regarding the key material present.

With the code at hand, I also managed to confirm my original basic assumptions about the initialization using sub-commands, where the same message type is sent with follow-up bytes including information on the command. The confirmation came from a log message calling the first byte in the command… subcmd. The following diagram is my current best understanding of the initialization flow:

Initialization sequence for the FreeStyle Libre 2 encryption protocol.

Unfortunately, most of the functions that I have found related to the encryption (and to the binary protocol, at least in the standalone app) ended up being quite complicated to read. At first I thought this was a side effect of some obfuscation system, but I’m no longer sure. It might be an effect of the compile/decompile cycle, but at least on Ghidra these appear as huge switch blocks, with what is effectively a state machine jumping around, even for the most simple of the methods.

I took a function that is (hopefully) the least likely to get Abbott upset for me reposting it. It’s a simple function: it takes an integer and returns an integer. I called it int titfortat(int) because it took me a while to figure out what it was meant to do. It turns out to normalize the input to either 0, 1 or -1 — the latter being an error condition. It has an invocation of INT3 (a debugger trap), and it has the whole state machine construct I’ve seen in most of the other functions. What I found about this function is that it’s used to set a variable based on whether the generated keys are used for authentication or session.

The main blocker for me right now to figure out how the encryption is working, is that it looks like there’s an array of 21 different objects, each of which comes with what looks like a vtable, and only partially implemented. It does not conform to the way Visual C++ is building objects, so maybe it’s a static encryption library linked inside, or something different altogether. The functions I can reach from those objects are clearly cryptography-related: they include tables for SHA1 and SHA2 at least.

The way the objects are used is also a bit confusing: an initialization function appears to assign to each pointer in the array the value returned by a different function, but each of those functions appears to only return the value of a (different) global. Whenever the vtable-like structure is not fully implemented, it appears to point at code that simply returns an error constant. And when the code is calling those objects, if an error is returned it skips the object and goes on to the next.
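In Python terms, my current understanding of that dispatch looks roughly like the following. All the names here are mine, the error sentinel is a stand-in for whatever constant the real code returns, and which of the 21 slots implements what is made up for the example.

```python
SLOT_COUNT = 21
NOT_IMPLEMENTED = object()  # stand-in for the error constant


def not_implemented(*_args):
    """Shared stub that every unimplemented entry points at."""
    return NOT_IMPLEMENTED


def make_slot(encrypt=not_implemented, decrypt=not_implemented):
    """Build one vtable-like object, with stubs for anything not provided."""
    return {"encrypt": encrypt, "decrypt": decrypt}


def dispatch(slots, operation, *args):
    """Try each object in turn, skipping the ones that report an error."""
    for index, slot in enumerate(slots):
        result = slot[operation](*args)
        if result is NOT_IMPLEMENTED:
            continue
        return index, result
    raise LookupError("no object implements " + operation)


if __name__ == "__main__":
    slots = [make_slot() for _ in range(SLOT_COUNT)]
    # Different objects can end up answering different operations.
    slots[13] = make_slot(encrypt=lambda data: data[::-1])
    slots[4] = make_slot(decrypt=lambda data: data[::-1])
    print(dispatch(slots, "encrypt", b"abc"))
    print(dispatch(slots, "decrypt", b"cba"))
```

The annoying part when reading the decompiled code is exactly what the sketch shows: you cannot tell which entry does the work without following every one of them.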

On the other hand, this exercise is giving me a lot of insight into the workings of the overall HID transport as well as the protocol inside of it. For example, I finally found the answer to which checksum the binary messages include! It’s a modified CRC32, except that it’s calculated over 4 bits at a time instead of the usual 8, and thus requires a shortened lookup table (16 entries instead of 256). If you think that this is completely pointless, I tend to agree with you. I also found that some of the sub-commands for the ATP protocol include an extra byte before the actual sub-command identifier. I’m not sure how those are interpreted yet, and it does not seem to be a checksum, as they are identical for different payloads.
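For reference, this is what a nibble-at-a-time CRC32 with a 16-entry table looks like. Note that this sketch implements the standard CRC-32 (the same one zlib uses), not Abbott’s modified variant: it only illustrates the 4-bit table technique, and the differences in the device’s version are not covered here.

```python
import zlib

POLYNOMIAL = 0xEDB88320  # standard reflected CRC-32 polynomial


def make_nibble_table():
    """Build the 16-entry table, one entry per possible 4-bit value."""
    table = []
    for value in range(16):
        crc = value
        for _ in range(4):
            crc = (crc >> 1) ^ POLYNOMIAL if crc & 1 else crc >> 1
        table.append(crc)
    return table


NIBBLE_TABLE = make_nibble_table()


def crc32_by_nibble(data):
    """Compute the standard CRC-32 by consuming four bits per table lookup."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        crc = (crc >> 4) ^ NIBBLE_TABLE[crc & 0x0F]  # low nibble
        crc = (crc >> 4) ^ NIBBLE_TABLE[crc & 0x0F]  # high nibble
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = b"123456789"
    print(hex(crc32_by_nibble(sample)), hex(zlib.crc32(sample)))
```

Both values printed are the same, which is the point: going through the data four bits at a time saves table space, at the cost of two lookups per byte, but does not change the result.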

Anyway, this is clearly not enough information yet to proceed with implementing a driver, but it might be just enough information to start improving the support for the binary protocol (ATP) if the Libre 2 turns out not to understand the normal text commands. Which I find very unlikely, but we’ll have to see.

Leveling up my reverse engineering: time for Ghidra

In my quest to figure out how to download data from the Abbott FreeStyle Libre 2, I decided that figuring it out just by looking at the captures was a dead end. While my first few captures had me convinced the reader would keep sending the same challenge, and so I could at least replay that, it turned out to be wrong, and that excluded most of the simplest/silliest of encryption schemes.

As Pierre suggested to me on Twitter, the easiest way to find the answer would be to analyze the binary itself. This sounded like a hugely daunting task for me, as I don’t speak fluent Intel assembly, and particularly I don’t speak fluent Windows interfaces. The closest I got to this in the past has been the reverse engineering of the Verio, in which I ran the software on top of WinDbg and discovered, to my relief, that they not just kept some of the logging on, but also a whole lot of debug logs that made it actually fairly easy to reverse the whole protocol.

But when the options are learning enough about cryptography and cryptanalysis to break the encoding, or spend time learning how to reverse engineer a Windows binary — yeah I thought the latter would suit me better. It’s not like I have not dabbled in reversing my own binaries, or built my own (terrible) 8086 simulator in high school, because I wanted to be able to see how the registers would be affected, and didn’t want to wait for my turn to use the clunky EEPROM system we used in the lab (this whole thing is probably a story for another day).

Also, since the last time I considered reversing a binary (when I was looking at my laptop’s keyboard), there’s a huge development: the NSA released Ghidra. For those who have not heard, Ghidra is a tool to reverse engineer binaries that includes a decompiler, and a full blown UI. And it’s open source. Given that the previous best option for this was IDA Pro, with thousands of dollars of licenses expected, this opened a huge amount of doors.

So I spent a weekend in which I had some spare time trying my best at reversing the actual code from the official app for the Libre 2. I’ll provide more detail on that once I understand it better, and I know which parts are critical to share, and which ones would probably get me in trouble. In general, I did manage to find out quite a bit more about the software, the devices, and the protocol, if nothing else because Abbott left a bunch of debug logging disabled, but built in. And no, this time the WinDbg trick didn’t work, because they seem to have added an explicit anti-debugger exception (although I guess I could learn to defeat that, while I’m at it).

Because I was at first a bit skeptical about my ability to do anything at all with this, I have also been running it in an Ubuntu VM, but honestly I’m considering switching back to my normal desktop because something in the Ubuntu default shell appears to mess with Java, and I can’t even run the VM at the right screen size. I have also considered running this in a Hyper-V virtual machine on my Gamestation, but the problem with that appears to be graphics acceleration: installing openSUSE onto it was very fast, but trying to use it was terribly sluggish. I guess the VM option is a bit nicer in the sense that I can just save it to power off the computer, as I did to add the second SSD to the NUC.

After spending the weekend on it, and making some kind of progress, and printing out some of the code to read it on paper in front of the TV with pen and marker, well… I think I’m liking the idea of this but it’ll take me quite a while, alone, to come up with enough of a description that it can be implemented cleanroom. I’ll share more details on that later. For the most part, I felt like I was for the first time cooking something that I’ve only seen made in the Great British Bake Off — because I kept reading the reports that other (much more experienced) people wrote and published, particularly reversing router firmwares.

I also, for once, found a good reason to use YouTube for tutorials. This video by MalwareTech (yes the guy who got arrested after shutting WannaCry down by chance) was a huge help to figure out features I didn’t even know I wanted, including the “Assume Register” option. Having someone who knows what he’s doing explore a tool I don’t know was very helpful, and indeed it felt like Adam Savage describing his workshop tools — a great way to learn about stuff you didn’t know you needed.

My hope, in adding this tool to my toolbox – like Adam Savage indeed says in his Every Tool’s A Hammer (hugely recommended reading, by the way) – is that I’ll be able to use it not just to solve the mystery of the Libre 2’s encryption, but also that of the nebulous Libre 1 binary protocol, which I never figured out (there are a few breadcrumbs I already found during my Ghidra weekend). And maybe even to figure out the protocol of one of the gaming mice I have at home, which I never completed either.

Of course all of this assumes I have a lot more free time than I have had for the past few years. But, you know, it’s something that I might have ideas about.

Also as a side note: while doing the work to figure out which address belongs to what, and particularly figure out the jumps through vtables and arrays of global objects (yeah that seems to be something they are doing), I found myself needing to do a lot of hexadecimal calculations. And while I can do conversions from decimal to binary in my head fairly easily, hex is a bit too much for me. I have been using the Python interactive interpreter for that, but that’s just too cumbersome. Instead, I decided to get myself a good old physical calculator — not least because the Android calculator is not able to do hex, and it seems like there’s a lack of “mid range” calculators: you get TI-80 emulators fairly easily, but most of the simplest calculators don’t have hex. Or they do, but they are terrible at it.
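For the record, this is the kind of calculation I kept typing into the Python interpreter. The addresses below are made up, and the 4-byte pointer size assumes a 32-bit binary; the helper name is just for this example.

```python
def slot_address(table_base, slot_index, pointer_size=4):
    """Address of the n-th entry of a vtable or pointer array."""
    return table_base + slot_index * pointer_size


if __name__ == "__main__":
    base = 0x0051A2C0  # made-up address of a vtable
    print(hex(slot_address(base, 3)))  # address of the fourth entry
    print(hex(0x0051A2CC - base))  # distance between two addresses
    print(int("1f4", 16))  # hex string to decimal
    print(f"{500:#06x}")  # decimal to zero-padded hex
```

It works, but typing 0x and hex() around everything gets old quickly when you are doing dozens of these in a row.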

I looked up on Amazon for the cheapest scientific calculator that I could see the letters A-F on, and ordered a Casio fx-83GT X — that was a mistake. When it arrived, I realized that I didn’t pay attention to finding one with the hex key on it. The fx-83GT does indeed have the A-F inputs — but they are used for defining variables only, and the calculator does not appear to have any way to convert to hexadecimal nor to do hexadecimal-based operations. Oops.

Instead I ordered a Sharp WriteView EL-W531, which supports hex just fine. It has slightly smaller, but more satisfying, keys, but it’s yet another black gadget on my table (the Casio is light blue). I’ll probably end up looking out for a cute sticker to put on it to see it when I close it for storage.

And I decided to keep the Casio as well — not just because it’s handy to have a calculator at home when doing paperwork, even with all the computers around, but also because it might be interesting to see how much of the firmware is downloadable, and whether someone has tried flashing a different model’s firmware onto it, to expand its capabilities: I can’t believe the A-F keys are there just for the sake of variables, my guess is that they are there because the same board/case is used by a higher model that does support hex, and I’d expect that the only thing that makes it behave one way or the other is the firmware — or even just flags in it!

At any rate, expect more information about the Libre 2 later on this month or next. And if I decide to spend more time on the Casio as well, you’ll see the notes here on the blog. But for now I think I want to get at least some of my projects closer to completion.

USB capturing in 2020

The vast majority of the glucometer devices I reverse the protocol of use USB to connect to a computer. You could say that all of those that I successfully reversed up to now are USB based. Over the years, the way I capture USB packets to figure out a protocol changed significantly, starting from proprietary Windows-based sniffers, and more recently involving my own opensource trace tools. The process evolution was not always intentional — in the case of USBlyzer, it was pretty much dead a few years after I started using it, plus the author refused to document the file format, and by then even my sacrificial laptop was not powerful enough to keep running all the tools I needed.

I feel I’m close to another step on the evolution of my process, and once again it’s not because of me looking to improve the process as much as is the process not working on modern tools. Let me start by explaining what the situation is, because there are two nearly separate issues at play here.

The first issue is that either openSUSE or the kernel changed the way debugfs is handled. For those who have not looked at this before, debugfs is what lives in /sys/kernel/debug, and provides the more modern interface for usbmon access; the old method via /dev/usbmonX is deprecated, and Wireshark will not even offer the ability to capture USB packets without debugfs. Previously, I was able to manually change the ownership of the usbmon debugfs paths to my user, and start Wireshark as a normal user to do the capturing, but as of January 2020, it does not seem to be possible to do that anymore: the debugfs mount is only accessible to root.

Using Wireshark as root is generally considered a really bad idea, because it has a huge attack surface, in particular when doing network captures, where the input would literally be at the discretion of external actors. It’s a teensy bit safer when capturing USB because even when the device is fairly unknown, the traffic is not as controllable, so I would have flinched, but not terribly, at using Wireshark as root. Except that I can’t sudo wireshark and have it paint on X. So the remaining alternative is to use tshark, which is a terminal utility that implements the same basics as Wireshark.

Unfortunately here’s the second problem: the last time I ran a lot of captures was when I was working on the Beurer glucometer (which I still haven’t gotten back to, because Linux 5.5 is still unreleased at the time of writing, and that’s the first version that’s not going to go into a reset loop with the device), and I was doing that work from my laptop, and that’s relevant. While the laptop’s keyboard and touchpad are USB, the ports are connected to a different bus internally. Since usbmon interfaces are set by bus, that made it very handy: I only needed to capture on the “ports” bus, and no matter how much and what I typed, it wouldn’t interfere in my captures at all.

You can probably see where this is going: I’m now using a NUC on my desk, with an external keyboard and the Elecom trackball (because I did manage to hurt my wrist while working on the laptop, but that’s a story for another post). And now all the USB 2.0 ports are connected to the same bus. Capturing the bus means getting all the events for keypresses, mouse movements, and so on.

If you have some experience with tcpdump or tshark, you’d think that this is an easy problem to solve: it’s not uncommon to have to capture network packets over an SSH connection, which you want to exclude from the capture itself. And the solution for that is to apply a capture filter, such as port not 22.

Unfortunately, it looks like libpcap (which means Wireshark and tshark) does not support capture filters on usbmon. The reasoning provided is that since the capture filters for network are implemented in BPF, there’s no fallback for usbmon that does not have any BPF capabilities in the kernel. I’m not sure about the decision, but there you go. You could also argue that adding BPF to usbmon would be interesting to avoid copying too much data from the kernel, but that’s not something I have particular interest in exploring right now.

So how do you handle this? The suggested option is to capture everything, then use Wireshark to select a subset of packets and save the capture again. This should allow you to have a limited capture that you can share without risking having shared a keylogger off your system. But it also made me think a bit more.

The pcapng format, which Wireshark stores usbmon captures in, is a fairly complicated one, because it can include a lot of different protocol information, and it has multiple typed blocks to store things like hardware interface descriptions. But for USB captures, there’s not much use in the format: not only are the Linux and Windows captures (the latter via usbpcap) different formats altogether, but the whole interface definition is, as far as I can tell, completely ignored. Instead, if you need a device descriptor, you need to scan the capture for a corresponding request (which usbmon-tools now does).

I’m now considering just providing a simpler format to store captured data with usbmon-tools, either a simple 1:1 conversion from pcapng, with each packet just size-prefixed, and a tool to filter down the capture on the command line (because honestly, having to load Wireshark to cut down a capture is a pain), or a more complicated format that can store the descriptors separately, and maybe bundle/unbundle them across captures so that you can combine multiple fragments later. If I was in my bubble, I would be using protocol buffers, but that’s not particularly friendly to integrate in a Python module, as far as I can tell. Particularly if you want to be able to use the tools straight out of the git clone.
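To make the first option concrete, here is a minimal sketch of a size-prefixed packet stream. The 32-bit little-endian prefix and the function names are arbitrary choices for illustration, not an existing usbmon-tools format.

```python
import struct
from typing import BinaryIO, Iterable, Iterator

# Hypothetical framing: each packet is preceded by its size as a
# 32-bit little-endian unsigned integer.
_LENGTH = struct.Struct("<I")


def write_packets(stream: BinaryIO, packets: Iterable[bytes]) -> None:
    """Write each packet to the stream, prefixed by its size."""
    for packet in packets:
        stream.write(_LENGTH.pack(len(packet)))
        stream.write(packet)


def read_packets(stream: BinaryIO) -> Iterator[bytes]:
    """Read back packets written by write_packets()."""
    while True:
        prefix = stream.read(_LENGTH.size)
        if not prefix:
            return  # clean end of stream
        if len(prefix) < _LENGTH.size:
            raise ValueError("truncated size prefix")
        (size,) = _LENGTH.unpack(prefix)
        packet = stream.read(size)
        if len(packet) < size:
            raise ValueError("truncated packet")
        yield packet
```

A command-line filter then becomes a loop over `read_packets()` that writes the packets it wants to keep back out with `write_packets()`.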

I guess that since I’m already using construct, I could instead design my own simplistic format. Or maybe I could just bite the bullet, use base64-encoded bytearrays, and write the whole capture session out in JSON.
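For comparison, the JSON route needs very little code. This is only a sketch: the `"packets"` key and the function names are made up for the example, and a real session format would likely carry more metadata.

```python
import base64
import json
from typing import List


def dump_session(packets: List[bytes]) -> str:
    """Serialize a capture session as JSON with base64-encoded packets."""
    encoded = [base64.b64encode(packet).decode("ascii") for packet in packets]
    return json.dumps({"packets": encoded})


def load_session(text: str) -> List[bytes]:
    """Inverse of dump_session()."""
    return [base64.b64decode(entry) for entry in json.loads(text)["packets"]]
```

The cost is size: base64 inflates every packet by about a third, before counting the JSON punctuation.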

As I said above, pcapng supports Windows and Linux captures differently: on Linux, the capture format is effectively the wire format of usbmon, while on Windows, it’s the format used by usbpcap. While I have not (yet, at the time of writing) added support to usbmon-tools to load the usbpcap captures, I don’t see why it shouldn’t work out that way. If I do manage to load usbpcap files, though, I would need a custom format to copy these to.

If anyone has a suggestion I’m open to them. One thing that I may try is to use Protocol Buffers but submit the generated source files to parse and serialize the object.

FreeStyle Libre 2: encrypted protocol notes

When I had to write (in a hurry) my comments on Abbott’s DMCA notice to GitHub, I noted that I had not had a chance to get my hands on the Libre 2 system yet, because it’s not sold in the UK. After that, Benjamin from MillionFriends reached out to ask whether I would be interested in trying to figure out how the reader itself speaks with the software. I gladly obliged, and spent most of the time I had during a sick week to get an idea of where we are with this.

While I would be lying if I said that we’re now much closer to being able to download data from a Libre 2 reader, I can at least say that we have some ideas of what’s going on right now. So let me try to introduce the problem for a second, because there’s a lot of information out there, and while there’s some of it that is quite authoritative (particularly as a few people have been reverse engineering the actual binaries), a lot of it is guesswork and supposition.

The Libre and Libre 2 systems are very similar: they both have sensors, and they both have readers. Technically, you could say that the mobile app (for Android or iOS) also makes up part of the system. The DMCA notice from Abbott that stirred up so much trouble was related to modifications to the mobile application, but luckily for me, my projects and my interests lie quite far away from that. The sensors can be “read” by their respective reader devices, or with a phone with the correct app on it. When reading the sensors with the reader, the reader itself stores the historical data and a bunch more information. You can download that data from the reader onto a computer with Windows or macOS with the official software from Abbott (assuming you can download a version of it that works for you; I couldn’t find a good download page for this on the UK website, but I was provided an EU version of the software as well).

For the Libre system, I have attempted reversing the protocol, but ultimately it was someone else contributing the details of a usable protocol that I implemented in glucometerutils. For the Libre 2, the discussion already suggested it wouldn’t be the same protocol, and that encryption was used by the Libre 2 reader. As it turns out, Abbott really does not appear to appreciate customers having easy access to their data, or third parties building tools around their devices, and they started encrypting the communication between the sensors and the reader or app in the new system.

So what’s the state with the Libre 2? Well, one piece of good news is that the software that I was given works with both the Libre and Libre 2 systems, so I could compare the USB captures on both systems, and that will (as I’ll show in a moment) help significantly. It also showed that the basics of the Abbott HID protocol were maintained: most of the handshake, the message types and the maximum message size. Unfortunately it was clear right away that indeed most of the messages exchanged were encrypted, because the message length made no sense (anything over 62 as length was clearly wrong).
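The length check can be written down as a small sketch. The framing is an assumption here: each message travels in a 64-byte HID report, with the message type in the first byte, the payload length in the second, and up to 62 bytes of payload after that. `Message` and `parse_report` are hypothetical names.

```python
from typing import NamedTuple, Optional

REPORT_SIZE = 64
MAX_PAYLOAD = REPORT_SIZE - 2  # one byte of message type, one of length


class Message(NamedTuple):
    message_type: int
    payload: Optional[bytes]  # None when the length byte makes no sense


def parse_report(report: bytes) -> Message:
    """Split a HID report into message type and payload (assumed framing)."""
    message_type, length = report[0], report[1]
    if length > MAX_PAYLOAD:
        # An impossible length suggests that what follows the message type
        # is ciphertext rather than a length-prefixed payload.
        return Message(message_type, None)
    return Message(message_type, report[2 : 2 + length])
```

This is only a heuristic: an encrypted message whose first byte happens to be 62 or less would pass the check and be misread as a short plaintext message, so it can flag encryption but not rule it out.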

Now, the good news is that I have built up enough interfaces in usbmon-tools that I could use to build a much more reusable extractor for the FreeStyle protocol, and introduce more of the knowledge of the encryption to it, so that it can be used for others. Being able to release these extraction tools I write, instead of just using them myself and letting them rot, was my primary motivation behind building usbmon-tools, so you could say that that’s an achieved target.

So earlier I said that I was lucky the software works with both the Libre and Libre 2 readers. That was lucky because it shows that the sequence of operations between the two is nearly the same, and that some of the messages are not actually encrypted on the Libre 2 either (namely, the keepalive messages). Here’s the handshake from my Libre 1 reader:

[ 04] H>>D 00000000: 

[ 34] H<<D 00000000: 16                                                . 

[ 0d] H>>D 00000000: 00 00 00 02                                       .... 

[ 05] H>>D 00000000: 

[ 15] H>>D 00000000: 

[ 06] H<<D 00000000: 4A 43 4D 56 30 32 31 2D  54 30 38 35 35 00        JCMV021-T0855. 

[ 35] H<<D 00000000: 32 2E 31 2E 32 00                                 2.1.2. 

[ 01] H>>D 00000000: 

[ 71] H<<D 00000000: 01                                                . 

[ 21] H>>D 00000000: 24 64 62 72 6E 75 6D 3F                           $dbrnum? 

[ 60] H<<D 00000000: 44 42 20 52 65 63 6F 72  64 20 4E 75 6D 62 65 72  DB Record Number
[ 60] H<<D 00000010: 20 3D 20 33 37 32 39 38  32 0D 0A 43 4B 53 4D 3A   = 372982..CKSM:
[ 60] H<<D 00000020: 30 30 30 30 30 37 36 31  0D 0A 43 4D 44 20 4F 4B  00000761..CMD OK
[ 60] H<<D 00000030: 0D 0A                                             .. 

[ 0a] H>>D 00000000: 00 00 37 C6 32 00 34                              ..7.2.4 

[ 0c] H<<D 00000000: 01 00 18 00                                       .... 

Funnily enough, while this matches the sequence that Xavier described for the Insulinx, and that I always reused for the other devices too, I found that most of this exchange is for the original software to figure out which device you connected. And since my tools require you to know which device you’re using, I actually cleaned up the FreeStyle support code in glucometerutils to shorten the initialization sequence.

To describe the sequence in prose, the software is requesting the serial number and software version of the reader (commands 0x05 and 0x15), then initializing (0x01) and immediately using the text command $dbrnum? to know how much data is stored on the device. Then it starts using the “binary mode” protocol that I started on years ago, but never understood.
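The same sequence can be kept as a small lookup table, which is handy when annotating a dump. The labels below are descriptive guesses based on the text and on what the replies in the dump contain (for example, 0x06 carries the serial number and 0x35 the version string); they are not official names, and types that are not understood are deliberately left out.

```python
from typing import Dict

# Message types from the handshake dump; labels are guesses, not official names.
MESSAGE_TYPES: Dict[int, str] = {
    0x01: "init",
    0x71: "init reply",
    0x05: "serial number request",
    0x06: "serial number reply",
    0x15: "software version request",
    0x35: "software version reply",
    0x21: "text command",
    0x60: "text reply",
}


def describe(message_type: int) -> str:
    """Return a human-readable label for a message type."""
    return MESSAGE_TYPES.get(message_type, f"unknown (0x{message_type:02x})")
```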

For both the systems, I captured the connection establishment, from when the device is connected to the Windows virtual machine to the moment when you can choose what to do. The software is only requesting a minimal amount of data, but it’s still quite useful for comparison. Indeed, you can see that after using the binary protocol to fetch… something, the software sends a few more text commands to confirm the details of the device:

[ 0b] H<<D 00000000: 12 3C AD 93 0A 18 00 00  00 00 00 9C 2D EA 00 00  .<..........-...
[ 0b] H<<D 00000010: 00 00 00 0E 00 00 00 00  00 00 00 7C 2A EA 00 00  ...........|*...
[ 0b] H<<D 00000020: 00 00 00 16 00 00 00 00  00 00 00 84 2D EA 00 21  ............-..!
[ 0b] H<<D 00000030: C2 25 43                                          .%C 

[ 21] H>>D 00000000: 24 70 61 74 63 68 3F                              $patch? 

[ 0d] H>>D 00000000: 3D 12 00 00                                       =... 

[ 60] H<<D 00000000: 4C 6F 67 20 45 6D 70 74  79 0D 0A 43 4B 53 4D 3A  Log Empty..CKSM:
[ 60] H<<D 00000010: 30 30 30 30 30 33 36 38  0D 0A 43 4D 44 20 4F 4B  00000368..CMD OK
[ 60] H<<D 00000020: 0D 0A                                             .. 

[ 21] H>>D 00000000: 24 73 6E 3F                                       $sn? 

[ 60] H<<D 00000000: 4A 43 4D 56 30 32 31 2D  54 30 38 35 35 0D 0A 43  JCMV021-T0855..C
[ 60] H<<D 00000010: 4B 53 4D 3A 30 30 30 30  30 33 32 44 0D 0A 43 4D  KSM:0000032D..CM
[ 60] H<<D 00000020: 44 20 4F 4B 0D 0A                                 D OK.. 

[ 21] H>>D 00000000: 24 73 77 76 65 72 3F                              $swver? 

[ 60] H<<D 00000000: 32 2E 31 2E 32 0D 0A 43  4B 53 4D 3A 30 30 30 30  2.1.2..CKSM:0000
[ 60] H<<D 00000010: 30 31 30 38 0D 0A 43 4D  44 20 4F 4B 0D 0A        0108..CMD OK.. 

My best guess as to why it’s asking again for the serial number and software version is that the data returned during the handshake is only used to select which “driver” implementation to use, while this is used to actually fill in the descriptor to show to the user.
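As an aside on the text replies dumped above: the CKSM field appears to be the plain sum of the bytes of the message, including the trailing CR LF, written as eight hex digits. That is an observation that holds for the replies shown here, not documented behaviour, and `verify_text_reply` is a hypothetical helper.

```python
import re

# A text reply looks like: <message>CKSM:<8 hex digits>\r\nCMD OK\r\n
_REPLY = re.compile(
    rb"^(?P<message>.*)CKSM:(?P<checksum>[0-9A-F]{8})\r\nCMD OK\r\n$", re.DOTALL
)


def verify_text_reply(reply: bytes) -> bytes:
    """Check the CKSM field of a text reply and return the message part."""
    match = _REPLY.match(reply)
    if match is None:
        raise ValueError("not a well-formed text reply")
    message = match.group("message")
    expected = int(match.group("checksum").decode("ascii"), 16)
    # Assumed algorithm: the checksum is the sum of all message bytes.
    if sum(message) != expected:
        raise ValueError("checksum mismatch")
    return message


# The $swver? reply from the Libre 1 capture above.
print(verify_text_reply(b"2.1.2\r\nCKSM:00000108\r\nCMD OK\r\n"))
```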

If I look at the capture of the same actions with a Libre 2 system, the initialization is not quite the same:

[ 05] H>>D 00000000: 

[ 06] H<<D 00000000: 4D 41 47 5A 31 39 32 2D  4A 34 35 35 38 00        MAGZ192-J4558. 

[ 14] H>>D 00000000: 11                                                . 

[ 33] H<<D 00000000: 16 B1 79 F0 A1 D8 9C 6D  69 71 D9 1A C0 1A BC 7E  ..y....miq.....~ 

[ 14] H>>D 00000000: 17 6C C8 40 58 5B 3E 08  A5 40 7A C0 FE 35 91 66  .l.@X[>..@z..5.f
[ 14] H>>D 00000010: 2E 01 37 88 37 F5 94 71  79 BB                    ..7.7..qy. 

[ 33] H<<D 00000000: 18 C5 F6 DF 51 18 AB 93  9C 39 89 AC 01 DF 32 F0  ....Q....9....2.
[ 33] H<<D 00000010: 63 A8 80 99 54 4A 52 E8  96 3B 1B 44 E4 2A 6C 61  c...TJR..;.D.*la
[ 33] H<<D 00000020: 00 20                                             .  

[ 04] H>>D 00000000: 

[ 0d] H>>D 00000000: 00 00 00 02                                       .... 

[ 34] H<<D 00000000: 16                                                . 

[ 05] H>>D 00000000: 

[ 15] H>>D 00000000: 

[ 06] H<<D 00000000: 4D 41 47 5A 31 39 32 2D  4A 34 35 35 38 00        MAGZ192-J4558. 

[ 35] H<<D 00000000: 31 2E 30 2E 31 32 00                              1.0.12. 

[ 01] H>>D 00000000: 

[ 71] H<<D 00000000: 01                                                . 

[x21] H>>D 00000000: 66 C2 59 40 42 A5 09 07  28 45 34 F2 FB 2E EC B2  f.Y@B...(E4.....
[x21] H>>D 00000010: A0 BB 61 8D E9 EE 41 3E  FC 24 AD 61 FB F6 63 34  ..a...A>.$.a..c4
[x21] H>>D 00000020: 7B 7C 15 DB 93 EA 68 9F  9A A4 1E 2E 0E DE 8E A1  {|....h.........
[x21] H>>D 00000030: D6 A2 EA 53 45 2F A8 00  00 00 00 17 CF 84 64     ...SE/........d 

[x60] H<<D 00000000: 7D C1 67 28 0E 31 48 08  2C 99 88 04 DD E1 75 77  }.g(.1H.,.....uw
[x60] H<<D 00000010: 34 5A 88 CA 1F 6D 98 FD  79 42 D3 F2 4A FB C4 E8  4Z...m..yB..J...
[x60] H<<D 00000020: 75 C0 92 D5 92 CF BF 1D  F1 25 6A 78 7A F7 CE 70  u........%jxz..p
[x60] H<<D 00000030: C2 0F B9 A2 86 68 AA 00  00 00 00 F9 DE 0A AA     .....h......... 

[x0a] H>>D 00000000: 9B CA 7A AF 42 22 C6 F2  8F CA 0E 58 3F 43 9C AB  ..z.B".....X?C..
[x0a] H>>D 00000010: C7 4D 86 DF ED 07 ED F4  0B 99 D8 87 18 B5 8F 76  .M.............v
[x0a] H>>D 00000020: 69 50 4F 6C CE 86 CF E1  6D 9C A1 55 78 E0 AF DE  iPOl....m..Ux...
[x0a] H>>D 00000030: 80 C6 A0 51 38 32 8D 01  00 00 00 62 F3 67 2E     ...Q82.....b.g. 

[ 0c] H<<D 00000000: 01 00 18 00                                       .... 

[x0b] H<<D 00000000: 80 37 B7 71 7F 38 55 56  93 AC 89 65 11 F6 7F E6  .7.q.8UV...e....
[x0b] H<<D 00000010: 31 03 3E 15 48 7A 31 CC  24 AD 02 7A 09 62 FF 9C  1.>.Hz1.$..z.b..
[x0b] H<<D 00000020: D4 94 02 C9 5F FF F2 7B  3B AC F0 F7 99 1A 31 5A  ...._..{;.....1Z
[x0b] H<<D 00000030: 00 B8 7B B7 CD 4D D4 01  00 00 00 E2 D4 F1 13     ..{..M......... 

The 0x14/0x33 command/response are new — and they clearly are used to set up the encryption. Indeed, trying to send out a text command without having issued these commands has the reader respond with a 0x33 reply that I interpret as a “missing encryption” error.

But you can also see that there’s a very similar structure to the commands: after the initialization, there’s an (encrypted) text command (0x21) and response (0x60), then there’s an encrypted binary command, and more encrypted binary responses. Funnily enough, that 0x0c response is not encrypted, and the sequence of responses of the same type is very similar between the Libre 1 and Libre 2 captures as well.

The similarities don’t stop here. Let’s look at the end of the capture:

[x0b] H<<D 00000000: A3 F6 2E 9D 4E 13 68 EB  7E 37 72 97 6C F9 7B D6  ....N.h.~7r.l.{.
[x0b] H<<D 00000010: 1F 7B FB 6A 15 A8 F9 5F  BD EC 87 BC CF 5E 16 96  .{.j..._.....^..
[x0b] H<<D 00000020: EB E7 D8 EC EF B5 00 D0  18 69 D5 48 B1 D0 06 A6  .........i.H....
[x0b] H<<D 00000030: 30 1E BB 9B 04 AC 93 DE  00 00 00 B6 A2 4D 23     0............M# 

[x21] H>>D 00000000: CB A5 D7 4A 6C 3A 44 AC  D7 14 47 16 15 40 15 12  ...Jl:D...G..@..
[x21] H>>D 00000010: 8B 7C AF 15 F1 28 D1 BE  5F 38 5A 4E ED 86 7D 20  .|...(.._8ZN..} 
[x21] H>>D 00000020: 1C BA 14 6F C9 05 BD 56  63 FB 3B 2C EC 9E 3B 03  ...o...Vc.;,..;.
[x21] H>>D 00000030: 50 B1 B4 D0 F6 02 92 14  00 00 00 CF FA C2 74     P.............t 

[ 0d] H>>D 00000000: DE 13 00 00                                       .... 

[x60] H<<D 00000000: CE 96 6D CD 86 27 B4 AC  D9 46 88 90 C0 E7 DB 4A  ..m..'...F.....J
[x60] H<<D 00000010: 8D CC 8E AA 5F 1B B6 11  4E A0 2B 08 C0 01 D5 D3  ...._...N.+.....
[x60] H<<D 00000020: 7A E9 8B C2 46 4C 42 B8  0C D7 52 FA E0 8F 58 32  z...FLB...R...X2
[x60] H<<D 00000030: DE 6C 71 3F BE 4E 9A DF  00 00 00 7E 38 C6 DB     .lq?.N.....~8.. 

[x60] H<<D 00000000: 11 06 1C D2 5A AC 1D 7E  E3 4C 68 B2 83 73 DF 47  ....Z..~.Lh..s.G
[x60] H<<D 00000010: 86 05 4E 81 99 EC 29 EA  D8 79 BA 26 1B 13 98 D8  ..N...)..y.&....
[x60] H<<D 00000020: 2D FA 49 4A DF DD F9 5E  2D 47 29 AB AE 0D 52 77  -.IJ...^-G)...Rw
[x60] H<<D 00000030: 2E EB 42 EC 7E CF BB E0  00 00 00 FE D4 DC 7E     ..B.~.........~ 

… Yeah many more encrypted messages …

[x60] H<<D 00000000: 53 FE E5 56 01 BB C2 A7  67 3E A6 AB DB 8E B7 13  S..V....g>......
[x60] H<<D 00000010: 6D F7 80 5C 06 23 09 3E  49 B4 A7 8B D3 61 92 C9  m..\.#.>I....a..
[x60] H<<D 00000020: 72 1D 5A 04 AE E3 3E 05  2E 1B C7 7C 42 2D F8 42  r.Z...>....|B-.B
[x60] H<<D 00000030: 37 88 7E 16 D9 34 8B E9  00 00 00 11 EE 42 05     7.~..4.......B. 

[x21] H>>D 00000000: 01 84 3F 02 36 1E A6 82  E2 C5 BF C2 40 78 B9 CD  ..?.6.......@x..
[x21] H>>D 00000010: E9 55 17 BE E9 16 8A 52  D2 D9 85 69 E4 D5 96 7A  .U.....R...i...z
[x21] H>>D 00000020: 55 6D DF 2E AF 96 36 53  64 C5 C7 D1 B6 6F 1A 1A  Um....6Sd....o..
[x21] H>>D 00000030: 4F 2F 25 FF 58 F4 EE 15  00 00 00 F6 9A 52 64     O/%.X........Rd 

[x60] H<<D 00000000: 19 F4 D4 F0 66 11 E3 CE  47 DE 82 87 22 48 3C 8D  ....f...G..."H<.
[x60] H<<D 00000010: BA 2D C0 37 12 25 CD AB  3A 58 C2 C4 01 88 60 21  .-.7.%..:X....`!
[x60] H<<D 00000020: 15 1E D1 EE F2 90 36 CA  B0 93 92 34 60 F5 89 E0  ......6....4`...
[x60] H<<D 00000030: 64 3C 20 39 BF 4C 98 EA  00 00 00 A1 CE C5 61     d< 9.L........a 

[x21] H>>D 00000000: D5 89 18 22 97 34 CB 6E  76 C5 5A 23 48 F4 5E C6  ...".4.nv.Z#H.^.
[x21] H>>D 00000010: 0E 11 0E C9 51 BD 40 D7  81 4A DF 8A 0B EF 28 82  ....Q.@..J....(.
[x21] H>>D 00000020: 1F 14 47 BC B8 B8 FA 44  59 7A 86 14 14 4B D7 0F  ..G....DYz...K..
[x21] H>>D 00000030: 37 48 CC 1F C5 A2 9E 16  00 00 00 00 A3 EE 69     7H............i 

[x60] H<<D 00000000: 62 33 4B 90 3B 68 3A D1  01 B1 15 4C 48 A1 6E 20  b3K.;h:....LH.n 
[x60] H<<D 00000010: 12 6F BC D5 50 33 9E C3  CC 35 4E C8 46 81 3E 6B  .o..P3...5N.F.>k
[x60] H<<D 00000020: 96 17 DF D5 8C 22 5C 3A  B7 52 C2 D9 37 71 B7 E2  ....."\:.R..7q..
[x60] H<<D 00000030: 5F C4 88 81 2A 91 65 EB  00 00 00 69 E2 A8 DE     _...*.e....i... 

These are once again text commands. In particular one of them gets a response that is long enough to span multiple encrypted text responses (0x60). Given that the $patch? command on the Libre 1 suggested it’s a multirecord command, it might be that the Libre 2 actually has a long list of patches.

So my best guess is that, aside from the encryption, the Libre 2 and Libre 1 systems are actually pretty much the same. I’m expecting that the only thing between us and being able to download the data out of a Libre 2 system is to figure out the encryption scheme and whether we need to extract keys to be able to do so. In the latter case that is something we should proceed carefully with, because it’s probably going to be the way Abbott is going to enforce their DMCA requests.

What do we know about the encryption setup, then? Well, I had a theory, but then it got completely trashed. I still have some guesses that for now appear solid.

First of all, the new 0x14/0x33 command/reply: this is called multiple times by the software, and the reader uses the 0x33 response to tell you that the encryption is either missing or wrong, so I’m assuming these commands are encryption related. But since there’s more than one meaning for these commands, it looks like the first byte for each of these selects a “sub-command”.

The 0x14,0x11 command appears to be the starting point for the encryption; the device responds with what appears to be 15 random bytes. Not 16! The first byte again appears to be a “typing” specification and is always 0x16. You could say that the 0x14,0x11 command gets a 0x33,0x16 response. In the first three captures I got from the software, the device actually sent exactly the same bytes. Then it started giving a different response for each time I called it. So I guess it might be some type of random value, that needs some entropy to be re-generated. Maybe it’s a nonce for the encryption?
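Under that reading, splitting a 0x14 or 0x33 payload is a one-liner. The sketch below uses the 0x33 reply from the Libre 2 capture shown earlier; `split_subcommand` is a hypothetical helper, and the sub-command interpretation itself is still a guess.

```python
from typing import Tuple


def split_subcommand(payload: bytes) -> Tuple[int, bytes]:
    """Split a 0x14/0x33 payload into its first byte and the remaining body."""
    if not payload:
        raise ValueError("empty payload")
    return payload[0], payload[1:]


# The 0x33 reply to the 0x14,0x11 command in the Libre 2 capture above.
reply = bytes.fromhex("16 B1 79 F0 A1 D8 9C 6D 69 71 D9 1A C0 1A BC 7E")
subcommand, body = split_subcommand(reply)
print(f"0x33,0x{subcommand:02x} with {len(body)} bytes")
# prints: 0x33,0x16 with 15 bytes
```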

The software then sends a 0x14,0x17 command, which at first seemed to have a number of constant bytes in fixed positions, but now I’m not so sure. I guess I need to compare more captures to confirm whether that is the case. But because of the length, there’s at most 25 bytes that are sent to the device.

The 0x33,0x18 response comes back, and it includes 31 bytes, but the last two appear to be constant.

Also if I compare the three captures, two that received the same 0x33,0x16 response, and one that didn’t, there are many identical bytes between the two with the same response (but not all of them!), and very few with the third one. So it sounds like either this is a challenge-response that uses the provided nonces, or it actually uses that value to do the key derivation.
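This kind of comparison is easy to script once the messages are extracted. A minimal sketch follows, with a hypothetical `matching_positions` helper and made-up bytes in the usage example rather than the real captures.

```python
from typing import List


def matching_positions(first: bytes, second: bytes) -> List[int]:
    """Return the offsets at which two same-length messages are identical."""
    if len(first) != len(second):
        raise ValueError("messages must have the same length")
    return [
        offset
        for offset, (a, b) in enumerate(zip(first, second))
        if a == b
    ]


# Made-up example data, not taken from the captures.
print(matching_positions(b"\x18\x01\x02\x00\x20", b"\x18\x01\xff\x00\x20"))
# prints: [0, 1, 3, 4]
```

Running it pairwise over the same message from each capture shows at a glance which offsets follow the 0x33,0x16 value and which stay fixed regardless.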

If you’re interested in trying to figure out the possible encryption behind this, the three captures are available on GitHub. And if you find anything else that you want to share with the rest of the people looking at this, please let us know.

Abbott, the Libre 2, and the takedown

A few people today messaged and mentioned me on twitter regarding the news that Abbott has requested the takedown of something related to their Libre 2. I gave a quick hot take on this on Twitter, but I guess it’s worth having something in long form to be referenced, since I’m sure this will be talked about a lot more, not least because of the ominous permalink chosen by Boing Boing (“they-literally-own-you”) and the fact that, game of telephone style, the news went from the original takedown, to Reddit phrasing it as “Abbott asserts copyright on your data”, which is both silly and untrue.

So let’s start with a bit of background, that most of the re-posters of this story probably don’t know much about. The Libre 2 is an upgrade on the FreeStyle Libre system that I wrote a lot about and that I use daily. It comes with both a reader device and with support in the LibreLink app for both Android and (on more recent iPhones) iOS. The main difference with the Libre system is that the sensors provide both NFC and BLE capabilities, with the ability to proactively notify of high- or low-blood sugar conditions, that the old NFC-only sensors cannot provide, which is more similar to CGM solutions like Dexcom’s.

In both the Libre and Libre 2 systems, the sensors don’t report blood sugar values, like in most classic glucometers. Instead they report a number of “raw” values, including from a number of temperature sensors. There’s a great explanation of these from Pierre Vandevenne, here and here. To get a real blood sugar measurement, you need to apply some algorithm, that Abbott still refines. The algorithm is what I usually refer to as “secret sauce”, and is implemented in both the reader’s firmware and the LibreLink app itself.

Above I used the word “something” to refer to what was taken down. The reason why I say that is that Boing Boing in the title straight up calls this a “tool” — but when you read the linked post from the affected person, it is described as “details of how to patch the LibreLink app”. Since I have not seen what the repository was before it was taken down, I have no idea which one to believe exactly. In either case, it looks like Abbott does not like someone to effectively leverage their “secret sauce” to use in a different application, but in particular, it does not look like we’re talking about something like glucometerutils, that implemented the protocol “clean”, without derivation off the original software.

Indeed, Boing Boing seems to make a case that this is equivalent to implementing a file format: «[…] just because Apple’s Pages can read Word docs, it doesn’t mean that Pages is a derivative of MS Office.» Except that it’s not as clear cut. If you implemented support for one format by copying the implementation code into your software, that actually would make it a derivative work, quite obviously. In this case, if I am to believe the original report instead, the taken-down content was a set of instructions to modify Abbott’s app, and not a redistribution of it. Since I’m not a lawyer, I have no idea where that stands, but it’s clearly not as black-and-white as Boing Boing appears to make it.

As I said on Twitter, this does not affect either of my projects, since neither relies on the original software; they are rather descriptions of the protocols. They also don’t include any information or support for the Libre 2, since the protocol appears to have changed. There’s an open issue with discussion, but it also appears that this time Abbott is using some encryption on the protocol. And that might be an interesting problem, as someone might have to get up close and personal with the code to figure that part out, but if that’s the case, we’re back at needing a clean-room design for implementing it.

I also want to quote Pierre explicitly from the posts I linked above:

[…] in the Libre FRAM, what we are seeing is a real “raw” signal. While the measure of the glucose signal itself is fairly reliable, it is heavily post-processed by the Libre firmware. Specifically – and in no particular order – temperature compensation, delay compensation, de-noising… all play a role. That understanding and, to some extent, my MD training, led me to extreme caution and prevented me from releasing my “solution”, which I knew to be both incomplete and unable to handle some error conditions.

The main driver behind my decision was the well known “first do no harm” (primum non nocere) motto, an essential part of the Hippocratic Oath which I symbolically took. I still stick by it today. […]

[…]

Today, there are a lot of add-on devices that aim to transform the Libre into a full CGM. To be honest, in general, I do not like either the results they provide or their (in)convenience. None of those I have tried delivered results that would lead to an approval by a regulatory agency, none of them were stable for long periods of time. But, apparently, patients still feel they are helpful and there is now a thriving community that aims at improving them.

Pierre Vandevenne

While I have not sworn a Hippocratic Oath myself, I have similar concerns to Pierre, and I have explicitly avoided documenting the sensors’ protocol, and I won’t be merging code that tries to read them directly, even if provided.

And when it comes to copyright issues, I do weigh them fairly heavily: they are the fundamental way that Free Software even works, by respecting licenses. So I will prefer someone to provide me with the description of Abbott’s encryption protocol, rather than an implementation of it where I may be afraid of a “poisonous tree.”

Glucometer notes: GlucoRx Nexus

This is a bit of a strange post, because it would be a glucometer review, except that I bought this glucometer a year and a half ago, teased a review, but don’t actually remember if I ever wrote any notes for it. While I may be able to get a new feel for the device to write a review, I don’t even know if the meter is still being distributed, and a few of the things I’m going to write here suggest to me that it might not be the case, but who knows.

I found the Nexus as an over-the-counter boxed meter at my local pharmacy, in London. To me it appears like the device was explicitly designed to be used by the elderly, not just because of the large screen and numbers, but also because it comes with a fairly big lever to drop out the test strip, something I had previously only seen in the Sannuo meter.

This is also the first meter I see with an always-on display — although it seems that the backlight turns on only when the device is woken up, and otherwise is pretty much unreadable. I guess they can afford this type of display given that the meter is powered by 2 AAA batteries, rather than CR2032 like others.

As you may have guessed by now from the top link about the teased review, this is the device that uses a Silicon Labs CP2110 HID-to-UART adapter, for which I ended up writing a pyserial driver earlier this year. The software to download the data seems to be available from the GlucoRx website for Windows and Mac, although confusingly the website you actually download the file from is not GlucoRx’s but TaiDoc’s: TaiDoc Technology Corporation is named on the label under the device, together with MedNet GmbH. A quick look around suggests TaiDoc is a Taiwanese company, and now I’m wondering if I’m missing a cultural significance around the test strips, or blood, and the push-out lever.

I want to spend a couple of notes on the Windows software, which is the main reason why I don’t know if the device is still being distributed. The download I was provided today was for version 5.04.20181206 – which suggests the software was still being developed as of December last year – but it does not quite seem to have been tested on Windows 10.

The first problem is that the Windows Defender malware detection tool actually considers the installer itself as malware. I’m not sure why, and honestly I don’t care: I’m only using this on a Windows 10 virtual machine that expires after 90 days and barely has access to the network. The other problem is that when you try to run the setup script (yes, it’s a script, it even opens a command prompt), it tries to install the redistributables for .NET 3.5 and Crystal Reports, fails, and errors out. If you try to run the setup for the software itself explicitly, you’re told you need to install .NET 3.5, which is fair, but then it opens a link to Microsoft’s website that now returns a 404. Oops.

Setting aside these two annoying, but not insurmountable problems, what remains is to figure out the protocol behind the scenes. I wrote a tool that reads a pcapng file and outputs the “chatter”, and you can find it in the usbmon-tools repository. It’s far from perfect and among other things it still does not dissect the actual CP2110 protocol — only the obvious packets that I know include data traffic to the device itself.

This is enough to figure out that the serial protocol is one of the “simplest” that I have seen. Not in the sense of being easy to reverse, but rather in terms of complexity of the messages: it’s a ping-pong protocol with fixed-length 8-byte messages, of which the last byte is a simple checksum (sum modulo 8-bit), with a fixed start byte of 0x51, and a fixed end with a bit for host-to-device and device-to-host selection. Add to that the first nibble of the message always having the same value (2), and the amount of data passed in each message comes down to 34 bits. Which is a pretty low amount of information even when looking at something as simple as glucose readings.
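A checksum like that takes only a few lines to validate. In this sketch the start byte and the position of the checksum come from the description above, while the assumption that the sum covers all seven preceding bytes (start byte included) is a guess; the helper names are hypothetical.

```python
START_BYTE = 0x51
MESSAGE_LENGTH = 8


def checksum(body: bytes) -> int:
    """Sum of the bytes, truncated to 8 bits."""
    return sum(body) & 0xFF


def is_valid(message: bytes) -> bool:
    """Check length, start byte and trailing checksum of a message."""
    return (
        len(message) == MESSAGE_LENGTH
        and message[0] == START_BYTE
        # Assumption: the checksum covers the seven bytes before it.
        and checksum(message[:-1]) == message[-1]
    )
```

Building an outgoing message is the reverse: assemble the first seven bytes, then append `checksum()` of them.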

At any rate, I think I already have a bit of the protocol figured out. I’ll probably finish it over the next few days and the weekend, and then I’ll post the protocol in the usual repository. Hopefully if there are other users of this device they can be well served by someone writing a tool to download the data that is not as painful to set up as the original software.

Introducing usbmon-tools

A couple of weeks ago I wrote some notes about my work in progress to implement usbmon captures handling code, and pre-announced I was going to publish more of my extraction/inspection scripts.

The good news is that the project is now released, and you can find it on GitHub as usbmon-tools with an Apache 2.0 license, and open to contributions (with a CLA, sorry about that part). This is the first open source project I release using my employer’s releasing process (for other projects, I used the IARC process instead), and I have to say I’m fairly pleased with the results.

This blog post is meant mostly as a way to explain what’s going on my head regarding this project, with the hope that contributors can help it become reality. Or that they can contribute other ideas to it, even when they are not part of my particular plans.

I want to start with a consideration on the choice of language. usbmon-tools is written in Python 3. And in particular it is restricted to Python 3.7, because I wanted to have access to type annotations, which I found extremely addictive at work. I even set up Travis CI to run mypy as part of the integration tests for the repository.

For other projects I tend to be more conservative, and wait for Debian stable to have a certain version before requiring that as a minimum, but as this is a toolset primarily for developers, I’m going to expect its audience to be able to deal with Python 3.7 as the requirement. This version was released nearly a year ago, and that should be plenty of time for people to have one at hand.

As for what the project should achieve, in my view it is an easy way for developers to dissect a USB snooping trace. I started by building a simplistic tool that recreates a text format trace from the pcapng file, based on the official documentation of usbmon in the kernel (I have some patches to improve on that, too, but that will probably become a post by itself next week). It’s missing isochronous support, and it’s not totally tested, but it at least gave me a few important insights on the format itself, including the big caveat that the “id” (or tag) of the URBs is not unique.

Indeed, I think that alone is one of the most important pieces of the puzzle in the library: in addition to parsing the pcapng file itself, the library can re-tag the events so that they get a real unique identifier (UUID), making it significantly easier to analyze the traces.
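As a rough illustration of the idea (and explicitly not the library’s actual API), re-tagging can be sketched as follows. The `Event` class and `retag_events` function are hypothetical names. I’m assuming that a submission (`S`) event opens a URB and that the next callback (`C`) or error (`E`) event with the same tag closes it.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass
class Event:
    tag: int                 # the kernel-provided URB id, which gets reused
    event_type: str          # 'S' (submission), 'C' (callback) or 'E' (error)
    retag: Optional[uuid.UUID] = None


def retag_events(events: Iterable[Event]) -> Iterator[Event]:
    """Assign a unique identifier to each submission/callback pair."""
    pending: Dict[int, uuid.UUID] = {}
    for event in events:
        if event.event_type == "S":
            # A new submission always gets a fresh identifier, even if
            # the kernel tag was already seen earlier in the capture.
            event.retag = uuid.uuid4()
            pending[event.tag] = event.retag
        else:
            # Reuse the identifier of the matching submission. If the
            # capture started mid-transfer there is none, so mint one.
            event.retag = pending.pop(event.tag, None) or uuid.uuid4()
        yield event
```

With this, two transfers that happen to share the same kernel tag end up with different `retag` values, while the submission and callback of each individual transfer share one.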

My next steps on the project are to write a more generic tool to convert a USB capture into what I call my “chatter format” (similar to the one I used to discuss serial protocols), and a more specific one that converts HID traces (because HID is a more defined protocol, and we can go a level deeper in exposing this into a human-readable source). I’m also considering if it would be within reach to provide the tool a HID descriptor blob, parse it and have it used to parse the HID traffic based on it. It would make some debugging particularly easier, for instance the stuff I did when I was fixing the ELECOM DEFT trackball.

I would also love to be able to play with a trace in a more interactive manner, for instance by loading it into a Jupyter notebook, so that I could try parsing the blobs interactively, but unless someone with more experience with those contributes the code, I don’t expect I’ll have much time for it.

Pull requests are more than welcome!

Working with usbmon captures

Two years ago I posted some notes on how I do USB sniffing. I have not really changed much since then, although admittedly I have not spent much time reversing glucometers in that time. But I’m finally biting the bullet and building myself a better setup.

The reasons why I’m looking for a new setup are multiple: first of all, I now have a laptop that is fast enough to run a Windows 10 VM (with Microsoft’s 90-day evaluation version). Second, the proprietary software I used for USB sniffing has not been updated since 2016, and its authors still have not published any information about their CBCF format. Their stated reason is:

Unfortunately, there is no such documentation and I’m almost sure will
never be. The reason is straightforward – every documented thing
should stay the same indefinitely. That is very restrictive.

At this point, keeping my old Dell Vostro 3750 as a sacrificial machine just for reverse engineering is not worth it anymore. Particularly when you consider that it started being obsoleted by both software (Windows 10 appears to have lost the ability to map network shares easily, and thus provide local-network backups), and hardware (the Western Digital SSD that I installed on it can’t be updated — their update package only works for UEFI boot systems, and while technically that machine is UEFI, it only supports the CSM boot).

When looking at a new option for my setup, I also want to be able to publish more of my scripts and tooling, if nothing else because I would feel more accomplished by knowing that even the side effects of working on these projects can be reused. So this time around I want to focus on all open source tooling, and build as much of the tools as possible so that I can release them as part of my employer’s open source program, which basically means not including any device-specific information within the tooling.

I started looking at Wireshark and its support for protocol dissectors. Unfortunately it looks like USB payloads are a bit more complicated, and dissector support is not great. So once again I’ll be writing a bunch of Python scripts to convert the captured data into some “chatter” files that are suitable for human consumption, at least. So I started to take a closer look at the usbmon documentation (the last time I looked at this was over ten years ago), and see if I can process that data directly.

To be fair, Wireshark does make it much nicer to get the captures out, since the usbmon text format is not particularly easy to parse back into something you can code with, and it is “lossy” when compared with the binary structures. With that, the first thing to focus on is supporting the capture format Wireshark generates, which is pcapng, with one particular (out of many) USB capture packet structures. I decided to start my work from that.

What I have right now, is an (incomplete) library that can parse a pcapng capture into objects that are easier to play with in Python. Right now it loads the whole content into memory, which might or might not be a bad limitation, but for now it will do. I guess it would also be nice if I can find a way to integrate this with Colaboratory, which is a tool I only have vague acquaintance with, but would probably be great for this kind of reverse engineering, as it looks a lot like the kind of stuff I’ve been doing by hand. That will probably be left for the future.

The primary target right now is for me to be able to reconstruct the text format of usbmon given the pcapng capture. This would at least tell me that my objects are not losing details in the construction. Unfortunately this is proving harder than expected, because the documentation of usbmon is not particularly clear, starting from the definition of the structure, that mixes sized (u32) and unsized (unsigned int) types. I hope I’ll be able to figure this out and hopefully even send changes to improve the documentation.
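For reference, this is roughly what unpacking the binary structure can look like with Python’s `struct` module. It is only a sketch: it assumes the 64-byte `usbmon_packet` layout as I read it from the kernel documentation, a little-endian capture host, and `int` being 32 bits wide. The `UsbmonHeader` and `parse_header` names are made up for this example.

```python
import struct
from typing import NamedTuple

# id, type, xfer_type, epnum, devnum, busnum, flag_setup, flag_data,
# ts_sec, ts_usec, status, length, len_cap, setup[8], interval,
# start_frame, xfer_flags, ndesc: 64 bytes in total.
HEADER = struct.Struct("<QBBBBHccqiiII8siiII")


class UsbmonHeader(NamedTuple):
    urb_id: int
    event_type: str   # 'S', 'C' or 'E', same as the text format
    xfer_type: int    # 0 = ISO, 1 = interrupt, 2 = control, 3 = bulk
    endpoint: int
    is_input: bool
    devnum: int
    busnum: int
    status: int
    length: int
    len_cap: int


def parse_header(data: bytes) -> UsbmonHeader:
    """Unpack the fixed header found at the start of each captured packet."""
    (urb_id, event_type, xfer_type, epnum, devnum, busnum,
     _flag_setup, _flag_data, _ts_sec, _ts_usec, status, length, len_cap,
     _setup, _interval, _start_frame, _xfer_flags, _ndesc,
     ) = HEADER.unpack_from(data)
    return UsbmonHeader(
        urb_id=urb_id,
        event_type=chr(event_type),
        xfer_type=xfer_type,
        endpoint=epnum & 0x7F,        # low bits: endpoint number
        is_input=bool(epnum & 0x80),  # high bit: direction (IN when set)
        devnum=devnum,
        busnum=busnum,
        status=status,
        length=length,
        len_cap=len_cap,
    )
```

The payload, when present, follows the header and is `len_cap` bytes long. The fields prefixed with an underscore are unpacked but ignored here to keep the example short.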

As you might have noticed from my Twitter rants, I maintain that the documentation needs an overhaul. The issues range from the mentions of “easy” things, to the fact that the current suggested format (the binary structures) is defined in terms of the text format fields, except the text format is deprecated, and the kernel actually appears to produce the text format based on the binary structures. There are also quite a few things that are not obviously documented in the kernel docs, so you need to read the source code to figure out what they mean. I’ll try rewriting sections of the documentation.

Keep reading the blog to find updates if you are interested in this.