Working with usbmon captures

Two years ago I posted some notes on how I do USB sniffing. I have not really changed much since then, although admittedly I have not spent much time reversing glucometers in that time. But I’m finally biting the bullet and building myself a better setup.

The reasons why I’m looking for a new setup are multiple: first of all, I now have a laptop that is fast enough to run a Windows 10 VM (with Microsoft’s 90 days evaluation version). Second, the proprietary software I used for USB sniffing has not been updated since 2016 — and they still have not published any information about their CBCF format, despite their reason being stated as:

Unfortunately, there is no such documentation and I’m almost sure will
never be. The reason is straightforward – every documented thing
should stay the same indefinitely. That is very restrictive.

At this point, keeping my old Dell Vostro 3750 as a sacrificial machine just for reverse engineering is not worth it anymore. Particularly when you consider that it started being obsoleted by both software (Windows 10 appears to have lost the ability to map network shares easily, and thus provide local-network backups), and hardware (the Western Digital SSD that I installed on it can’t be updated — their update package only works for UEFI boot systems, and while technically that machine is UEFI, it only supports the CSM boot).

When looking at a new option for my setup, I also want to be able to publish more of my scripts and tooling, if nothing else because I would feel more accomplished by knowing that even the side effects of working on these projects can be reused. So this time around I want to focus on all open source tooling, and build as much of the tools to be suitable for me to release as part of my employer’s open source program, which basically means not include any device-specific information within the tooling.

I started looking at Wireshark and its support for protocol dissectors. Unfortunately it looks like USB payloads are a bit more complicated, and dissector support is not great. So once again I’ll be writing a bunch of Python scripts to convert the captured data into some “chatter” files that are suitable for human consumption, at least. So I started to take a closer look at the usbmon documentation (the last time I looked at this was over ten years ago), and see if I can process that data directly.

To be fair, Wireshark does make it much nicer to get the captures out, since the text format usbmon is not particularly easy to parse back into something you can code with — and it is “lossy” when compared with the binary structures. With that, the first thing to focus on is to support the capture format Wireshark generates, which is pcapng, with one particular (out of many) USB capture packet structures. I decided to start my work from that.

What I have right now, is an (incomplete) library that can parse a pcapng capture into objects that are easier to play with in Python. Right now it loads the whole content into memory, which might or might not be a bad limitation, but for now it will do. I guess it would also be nice if I can find a way to integrate this with Colaboratory, which is a tool I only have vague acquaintance with, but would probably be great for this kind of reverse engineering, as it looks a lot like the kind of stuff I’ve been doing by hand. That will probably be left for the future.

The primary target right now is for me to be able to reconstruct the text format of usbmon given the pcapng capture. This would at least tell me that my objects are not losing details in the construction. Unfortunately this is proving harder than expected, because the documentation of usbmon is not particularly clear, starting from the definition of the structure, that mixes sized (u32) and unsized (unsigned int) types. I hope I’ll be able to figure this out and hopefully even send changes to improve the documentation.

As you might have noticed from my Twitter rants, I maintain that the documentation needs an overhaul. From mention of “easy” things, to the fact that the current suggested format (the binary structures) is defined in terms of the text format fields — except the text format is deprecated, and the kernel actually appears to produce the text format based on the binary structures. There are also quite a few things that are not obviously documented in the kernel docs, so you need to read the source code to figure out what they mean. I’ll try rewriting sections of the documentation.

Keep reading the blog to find updates if you have interests in this.

A misc weekend

So today is the first week without Gentoo work for me. I admit I feel a bit disappointed, I used to do a lot of stuff and now that I can’t do them anymore (for my choice too), I feel like I’m useless. But I’m resisting the temptation to feel my overlay with a lot of ebuilds for the bumps that I would have done myself previously.

First of all, I wish to thank the anonymous “bla” who commented on my previous post about the lack of analysers for the usbmon output, suggesting me Wireshark. I really didn’t know they were working on supporting USB sniffing, and yes that is nice and stops me from deciding to write an analyser myself. Unfortunately, the current weekly snapshot of libpcap and the current subversion trunk of Wireshark does not get along well together: they simply don’t allow me to sniff at all, I get a series of errors instead.

Talking about the Chinese adapter, I found something on their site but I’m not really sure if it’s a specification or simply a description of the driver interface. Nonetheless, I’ve been able to identify a few patterns in the logs I was able to get with usbmon (analysed by hand), and I now know two major type of commands that are sent to the device: a pair of set and get commands used to set and get a 16 bit word out of at least three «registers»… I’m not sure if they are registers at all, they only are three 16-bit words which have another 16-bit word assigned to them, and which is set and get with some vendor-type requests. Unfortunately, I haven’t been able to get a hold of their actual meaning yet.

I’ve also found a sort of “commit” request, or a “get internal status” command, that tells me I haven’t initialised the device correctly yet: when handled by the Windows drivers, this request always return 0x9494, while here it returns values like 0xf4f4 or similar, never the actual 0x9494 word I’m expecting.

I didn’t work on Rust in the past few days but I have been thinking a lot about it, as David is being the first user of it, and submitted me a few reports of features currently missing, which I didn’t think of before, namely support for multiple constructors, and proper support for sub-classes. I haven’t worked on this yet, but I’ll do soon. He also provided me a patch that I merged, but I’ll have to write a series of testcase for that too, so that I can actually be sure that Rust continue behaving as needed.

Today I also decided to look a bit more on xine, as lately I’ve been neglecting it a lot. I’ve tried to cleanup my Wavpack demuxer a bit more, and I just finished a Valgrind (massif) run while playing two wavpack files one after the other. The results are somewhat interesting: xine does not leak memory, as the memory consumption remains constant between the two playbacks, but it also seems to allocate about ten megabytes of memory in _x_video_decode that are not used at all, as I’ve disabled the effect plugins, and so there should be no video at all. About seven more megabytes are wasted in the overlay handling, which also shows that the video output is not correctly disabled while playing audio tracks, with xine-ui at least.

I’ll be running a callgrind now also to test where time is actually lost, it’s good now that I have KCacheGrind displaying properly the callgraph, thanks to Albert Astals Cid (TSDgeos from KDE). One thing that I’m afraid of is that xine does not really have good performances on lossless files because it uses too small buffers, at least by default; I have this doubt because while mp3s play mostly fine – with mad decoder, not FFmpeg’s – it skips a lot for both Flac and WavPack files. This is especially seen when you run it through Valgrind, as the code is obviously slowed down and shows the need for improvements.

If only there was support in ruby-prof for Valgrind-style output to analyse the output with KCacheGrind, I could be improving Rust too. But that will have to wait I’m afraid.

Oh well.

Sheer luck, or lack of?

Okay, so today I spent the whole day following a bug for what I’m handling for my job, and when evening came, I was tired and in need of something that relaxed me.

Missing still the information to actually start taking seriously the idea of building model ships, and missing a jigsaw (I looked for some the other day at the supermarket, but I can’t find a subject that I like, most of them are paintings of the Renaisance, but I’m not really into that kind of art, I never liked studying it when I was at school, but I admit that graphic arts are not really something I’m into in general, but if it had to be a painting, I would have preferred finding something from Magritte.. I still remember seeing one of his works at the Guggenheim Musem here in Venice some time ago and being quite impressed; surrealism can be a kind of graphic art that I make myself like, it’s actually interesting), my best choice to get a relaxing night was to start looking for the W.ch USB serial converter driver.

I installed the only Windows I still have a license of (Windows 98) on a virtual machine, and set up usbmon in the kernel so that I could use it (it would be nice if the usbmon module depended on the debugfs support, as the latter is needed to use the former), and then started fighting to get the device up. After some usual Win98 driver setup (install driver, insert peripheral, remove driver, remove peripheral, re-insert peripheral, install driver, reboot, curse, remove peripheral, remove driver and so on), I’ve started having a working COM3 port with the adapter. Too bad that anything I tried to type was sent two or more times over the wires, but I knew VMware was far from perfect with USB.

Anyway, I started looking at the output of the usbmon tracing, and I was able to isolate some of the control messages used to set the parameters up (that are basically the only thing I need, as the actual I/O seems to happen just like PL2303 code (or usbserial) handles it); I’ve got the commit sequence and the okay string that I should expect from the driver, but I still have to make sense of all of them.

Unfortunately the USBmon (note the case) utility is written in Java. I say unfortunately simply because I still can’t get Sun’s JRE to work with XCB-enabled libX11, so I can’t run the utility on my box.

After some tries, which finally resulted in an hard lockup of Enterprise, which is now shut down waiting for me resuming it tomorrow, I finally decided that the next days I’ll start by writing some tools that might help me during the development of the driver, and maybe someone else in the future.

The first tool will be a serial signal generator: basically it would be a simple app that repeated the same string over and over and over on a serial port, using the configured baud rate, bits, parity and stop bits. This is useful because I can connect the adapter in look with my real serial port, leave the generator open on the serial port, and try sending different commands to the adapter till I get the signal as I need it, then I can move to try another parameter, for instance changing the stop bits, and try the commands again until I get the same parameters as configured in the generator. Right now I’ve used for this a simple C source file that sent “Flameeyes” string with the hardcoded parameters, I’d like to be able to set the parameters at runtime so you don’t have to stop and start it every time, but I’m not yet sure of which interface I could use for that.

The second tool is the one that most likely will be helpful if I can get it as good as I hope, and it would be a Ruby-based ncurses frontend for reading the data out of usbmon. First it would show the data in a more structured fashion, with the list of read packets that can be actually inspected without having to copy them around, with a ringbuffer for the read packets, and a different buffer for particular packets you want to inspect while continuing recording, and if I can with some kind of filters so that I can pinpoint the request for the single device, and then start discarding stuff like the I/O operations I don’t care about while trying to set the control stuff.

It’s far from being an easy task (especially since I never used ncurses), and I know I probably won’t be able to complete it as I want to, but at least I can try to provide a start if someone else will need to read that usbmon data in a more comprehensible shape. Right now SnoopyPro on Windows has a simpler view of the data, although it can really improve, especially to get a decent output to print and study.

But now, to actually make some sense out of the title I used for this blog, I have to tell you what happened while I was debugging. When I was ready to start messing with the parameters of the device, I wanted to be able to get no serial connection, and just start with the pristine adapter status, so I took one of the three remaining adapters (I ordered four from the same place) and connected it to Enterprise, launched picocom (that I started using as a test – you can find it in my overlay if you want it) on the ttyUSB0 device and.. found it not being there. An lsusb run later I seen that the adapter I just took was actually a PL2303-based device, and after looking at the other two adapters, I discovered that out of the four adapters I bought, only one has actually the W.ch chip on itself, and that is the one I chosen when I wanted to try them out, luck or lack of it?

But anyway I won’t stop working on the driver, I’ll probably take it more easy though, as three serial ports are enough for now, I ordered one more “just in case”, and that proven a good idea. I have my job to complete, so I’ll dedicate to the driver only my free time, compatibly with the other projects I’m working on. Not only I can’t really try to get the adapter replaced (I ordered it from an Hong Kong reseller on e-bay, the price to sending it back would probably be higher than I paid for it, besides I can’t really complain, the adapters were cheap and they never told me which chip they used), but I have no intention to: writing a driver for this adapter is an interesting challenge, it can be useful to other people, might be useful to me again in the future (I might need more serial adapters, and I’ll probably just order them as I ordered these ones, so I might find more W.ch-based devices), and it’s a nice way to start kernel hacking, as it’s a simple enough device.