Don’t Ignore Windows 10 as a Development Platform for FLOSS

Important Preface: This blog post was originally written on 2020-05-12 and scheduled for later publication, inspired by this short Twitter thread. As such, it well predates Microsoft’s announcement of expanding WSL2 support to graphical apps. I considered trashing the post, or seriously re-editing it in light of the announcement, but I honestly lack the energy to do that right now. It does leave a bad taste in my mouth to know that it will likely get drowned out in the noise of the new WSL2 features announcement.

Given the topic of this post, I guess I need to add a preface to point out my “FLOSS creds” — because I have already seen too many attacks on people who even use Windows at all. I have been an opensource developer for over fifteen years now, and part of the reason why I left my last bubble was that it made it difficult for me to contribute to various opensource projects. I say this because I’m clearly a supporter of Free Software and Open Source, wherever possible. I also think that different people have different needs, and that ignoring that is a failure of the FLOSS movement as a whole.

The “Year of Linux on the Desktop” is now a meme that has run its course to the point of being annoying. Despite what FLOSS advocates keep saying, “Linux on the Desktop” is not really moving, and while I do have some strong opinions on this, that’s for another day. Most users, and in particular newcomers to FLOSS (both as users and developers), are probably using a more “user friendly” platform — and if you leave a comment with the joke about UNIX being selective with its friends, you’ll end up in a plonkfile, be warned.

About ten years ago, the trend seemed to be for FLOSS developers to use MacBooks as their daily laptops. I did that for a while myself — a UNIX-based platform with all the tools of the trade (SSH, Emacs, GCC, Ruby, and so on), which allowed quite a bit of work to get done without access to a Linux platform. And at the same time, you had the stability of Mac OS X, great battery life, and hardware that worked out of the box. More recently, though, Apple’s move towards “walled gardens” has been taking away from this feasibility.

But back to the main topic. For many years now, I’ve been using a “mixed setup” — a Linux laptop (or more recently desktop) for development, and a Windows (7, then 10) desktop for playing games, editing photos, designing PCBs, and for logic analysis. The latter is because Saleae Logic takes a significant amount of RAM when analysing high-frequency signals, and since I have been giving my gamestation as much RAM as I can just for Lightroom, it makes sense to run it on the machine with 128GB of RAM.

But more recently I have been exploring the ability of using Windows 10 as a development platform. In part because my wife has been learning Python, and since also learning a new operating system and paradigm at the same time would have been a bloody mess, she’s doing so on Windows 10 using Visual Studio Code and Python 3 as distributed through the Microsoft Store. While helping her, I had exposure to Windows as a Python development platform, so I gave it a try when working on my hack to rename PDF files, which turned out to be quite okay for a relatively simple workflow. And the work on the Python extension keeps making it more and more interesting — I’m not afraid to say that Visual Studio Code is better integrated with Python than Emacs, and I’m a long-time user of Emacs!

In the last week I have stepped up further how much development I’m doing on Windows 10 itself. I have been using Hyper-V virtual machines for Ghidra, to make use of the bigger screen (although admittedly I’m just using RDP to connect to the VM, so it doesn’t really matter much where it’s running), and in my last dive into the Libre 2 code I felt the need for a fast and responsive editor to go through parts of the disassembled code and figure out what it’s trying to do — so once again, Visual Studio Code to the rescue.

Indeed, Windows 10 now comes with an SSH client, and Visual Studio Code integrates very well with it, which meant I could just edit the files saved in the virtual machine, and have the IDE build them with GCC and execute them to get myself an answer.

Then while I was trying to use packetdiag to prepare some diagrams (for a future post on the Libre 2 again), I found myself wondering how to share files between computers (to use the bigger screen for drawing)… until I realised I could just install the Python module on Windows, and do all the work there. Except for needing sed to remove an incorrect field generated in the SVG. At which point I just opened my Debian shell running in WSL, and edited the files without having to share them with anything. Uh, score?
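
For what it’s worth, even that last step didn’t strictly need a UNIX tool: here is a minimal sketch of the same cleanup in Python, with a made-up attribute name standing in for the actual offending field:

```python
# Strip a bogus attribute from a generated SVG, sed-style but in
# Python. "bogus-attribute" is a placeholder for whatever field the
# packetdiag output actually needs removed.
import re
from pathlib import Path

svg = Path("diagram.svg")
svg.write_text(
    re.sub(r'\s+bogus-attribute="[^"]*"', "", svg.read_text(encoding="utf-8")),
    encoding="utf-8",
)
```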

So I have been wondering: what’s really stopping me from giving up my Linux workstation most of the time? Well, there’s hardware access — glucometerutils wouldn’t really work on WSL unless Microsoft plans to integrate a significant number of compatibility interfaces. Similarly for hardware SSH tokens — despite PC/SC being a Windows technology to begin with. Screen sessions and tabbed shells are definitely easier to run on Linux right now, but I’ve seen tweets about a modern terminal being developed by Microsoft, and even released as FLOSS!

Ironically, I think it’s editing this blog that is the most miserable experience for me on Windows. And not just because of the different keyboard (as I share the gamestation with my wife, the keyboard is physically a UK keyboard — even though I type US International), but also because I miss my compose key. You may have noticed already that this post is full of em-dashes and en-dashes. Yes, I have been told about WinCompose, but the last time I tried using it, it didn’t work and even screwed up my keyboard altogether. I’m now trying it again, at least on one of my computers, and if it doesn’t explode in my face this time, I may extend the experiment later.

And of course it’s probably still not as easy to set up a build environment for things like unpaper (although at that point, you can definitely run it in WSL!), or to have a development environment for actual Windows applications. But this is all a matter of a different set of compromises.

Honestly speaking, it’s very possible that I could survive with a Windows 10 laptop for my on-the-go opensource work, rather than the Linux one I’ve been using — with the added benefit of being able to play Settlers 3 without having to jump through all the hoops from the last time I tried. Which is why I decided that the pandemic lockdown is the perfect time to try this out: I barely use my Linux laptop anyway, since I have a working Linux workstation available all the time. I have indeed reinstalled my Dell XPS 9360 with Windows 10 Pro, and installed both a whole set of development tools (Visual Studio Code, Mu Editor, Git, …) and a bunch of “simple” games (Settlers, Caesar 3, Pharaoh, Age of Empires II HD); Discord ended up in the middle of both, since it’s actually what I use to interact with the Adafruit folks.

This doesn’t mean I’ll give up on Linux as an operating system — but I’m a strong supporter of “software biodiversity”, so the same way I try to keep my software working on FreeBSD, I don’t see why it shouldn’t work on Windows. And in particular, I have always found that providing FLOSS software on Windows is a great way to introduce new users to the concept of FLOSS — and focusing on providing FLOSS development tools gives an even bigger chance for people to build more FLOSS tools.

So is everything ready and working fine? Far from it. There are a lot of rough edges that I found myself, which is why I’m experimenting with developing more on Windows 10 — to see what can be improved. For instance, I know that the reuse tool has some rough edges with the encoding of input arguments, since PowerShell appears to still not default to UTF-8. And I failed to use pre-commit for one of my projects — although I have not yet taken notes of what exactly failed, so that I can start fixing it.
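
As a quick way to see what a tool actually ends up dealing with, this snippet prints the encodings Python picks up on the current console; on a stock Windows console it will typically show a legacy codepage rather than UTF-8:

```python
# Print the encodings Python sees for stdout and for the locale; on
# Windows these tend to be cp1252 or similar rather than UTF-8, which
# is exactly the kind of mismatch that trips up command-line arguments.
import locale
import sys

print(sys.stdout.encoding)
print(locale.getpreferredencoding())
```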

Another rough edge is in documentation. Too much of it assumes a UNIX environment only, and when it covers Windows at all, it assumes “old school” batch files are in use rather than the more modern PowerShell — for instance, Python virtualenv documentation tends to show the activate.bat script rather than its Activate.ps1 equivalent. This is not new — a lot of modern documentation is only valid on bash, and if you were to use an older operating system such as Solaris you would find yourself lost with the tcsh differences. You can probably see similar concerns back in the days when bash was not standard, and maybe we’ll have to go back to that kind of deal. Or maybe we’ll end up with some “standardization” of documentation that can be translated between different shells. Who knows.

But to wrap this up, I want to give a heads-up to all my fellow FLOSS developers: Windows 10 shouldn’t be underestimated as a development platform. And if they intend to be widely open to contributions, they should probably give a thought to how their code works on Windows. I know I’ll have to keep this in mind for my own future projects.

Code Validation and Reviews

During my “staycation” week I decided to spend some time looking at my various Python projects, trying to make them easier, nicer and cleaner, both for contributors and for myself. I also decided to spend some time improving some of the modules I use, based on various lessons learned in my own tools — and that got me to learn more about those tools.

First of all, thanks to Ben, glucometerutils has already been fully re-formatted with black and isort. And he set it up with pre-commit to make sure that new changes don’t break the formatting. This is awesome.
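
For reference, this kind of setup boils down to a short .pre-commit-config.yaml along these lines (the pinned revisions below are placeholders, not necessarily what the repository actually uses):

```yaml
# Illustrative pre-commit configuration running black and isort on
# every commit; pin whatever revisions are current when you set it up.
repos:
  - repo: https://github.com/psf/black
    rev: 19.10b0
    hooks:
      - id: black
  - repo: https://github.com/timothycrosley/isort
    rev: 4.3.21
    hooks:
      - id: isort
```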

As I was recently discussing with Samuele on Twitter, it’s not that I always agree with black’s formatting choices; it’s that it takes away the subjective matter of agreeing on formatting. Black does not always make the code look the way I would make it look, and I could argue that I can do a better job than black. But it’s also true that it makes everybody’s code look the same, which is actually a great way to settle the argument.

This is something I definitely picked up in my dayjob — I have been part of the Python Readability program (you can find more about it in the Building Secure and Reliable Systems book, downloadable for free), and through that I have reviewed literally hundreds of changes, coming from all different parts of Alphabet. While my early engagement involved lots of comments on code formatting, the moment Python formatting tools became more reliable and widely used, the “personal load” of doing those reviews went down significantly, as I could finally focus on pointing out overly complex functions, mismatched code and documentation, and so on. For everything else, my answer was unchanging: “let the formatter do its job, maybe just add a comma there as a suggestion to it.”

This is why I’m happy about black — not because I think its formatting is 100% what I would do, but because it gets close enough, and removes the whole argument and uncertainty around it. The same applies to isort.
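
To make that concrete, here is roughly the kind of normalisation black applies; the function is made up for illustration:

```python
# Before black: legal Python, but with inconsistent spacing and quoting.
def describe(reading,unit = 'mg/dL'):
    return '%s %s'%( reading ,unit)


# After black: consistent spacing and double quotes, no matter how the
# author originally typed it.
def describe(reading, unit="mg/dL"):
    return "%s %s" % (reading, unit)
```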

While applying pretty much the same set of presubmits and formatting to python-pcapng, I also found out about flake8. This is another useful tool to reduce the work needed for reviews, and it can also be configured to run as part of the pre-commit hooks, making sure that violations are identified sooner rather than later. While the tool is designed to deal with style guide violations, it also turned out to identify a few outright mistakes in glucometerutils. I’m now going to apply it throughout, whenever I can.
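
The kind of outright mistake it catches goes beyond style; in a made-up example like the following, flake8 flags both the unused import and the undefined name, and the latter is a real bug:

```python
import os  # flagged by flake8 as F401, "imported but unused"


def average(values):
    # flake8 flags "totl" as F821, an undefined name: a genuine bug
    # that a pure formatter would happily let through.
    return totl / len(values)
```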

There are more checks I would actually want to integrate — today I was going through all the source files in glucometerutils to update the type annotations, since I dropped Python 3.6 support. As I went to do that, I realised that one of the files created in a pull request I approved recently was actually missing licensing information. I have now added both license and copyright annotations as suggested by Matija (who is an actual lawyer, unlike me) — but I would love a pre-commit check that ensured all the files have a license and a copyright notice, and that the license is the expected one, for instance.
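
A minimal sketch of the kind of hook I have in mind, wired up as a local pre-commit hook; the header markers here are hypothetical and would need to match the project’s actual conventions:

```python
# Fail if any of the given files is missing a license or copyright
# header in its first couple of kilobytes.
import sys

EXPECTED = ("SPDX-License-Identifier:", "Copyright")


def main(paths):
    bad = []
    for path in paths:
        with open(path, encoding="utf-8") as source:
            head = source.read(2048)
        if not all(marker in head for marker in EXPECTED):
            bad.append(path)
    for path in bad:
        print(f"missing license/copyright header: {path}")
    return 1 if bad else 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```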

There are a few more trivial checks available in pre-commit that I may enable throughout: checks for trailing whitespace, and for missing newlines at the end of files (the trailing-whitespace and end-of-file-fixer hooks from the standard pre-commit-hooks collection). All of those are easily fixed, and the corresponding fixers do exactly that, which is also a great way to make the checks easier on newcomers and contributors: you don’t just get told “it’s wrong”, but also “let me fix that for you already”.

It’s not quite as encompassing as the bubble I’m used to, but it seems to be the closest I’m getting to it right now. Maybe I should just start building all the hooks that I feel I need, and see if someone else will adopt them afterwards. It seems to be a common thing to do after all.

Anyone who has written Gentoo ebuilds, by the way, has most likely recognized the similarities with repoman, the tool used to validate ebuilds before submitting them to the tree. I think that’s possibly why I’m so interested in this: I do agree that tools like repoman are the way to go, and I have myself insisted that repoman be extended to cover more and more cases over time, as it would stop divergence.

I honestly hope to get to a point where there’s no argument over whether a code change complies with style or not — but rather leaving the enforcement (and the fixing) to computers, whenever possible. And that also means helping the computers make it possible, by being less picky about things that can be overlooked.

Planets, Clouds, Python

Half a year ago, I wrote some thoughts about writing a cloud-native feed aggregator. I have actually sketched out some ideas of how I would design it myself since then, and I even went through the (limited) trouble of having it approved for release. But I have not actually released any code — or, to be honest, written any code either. The repository has been sitting idle.

Now, with the Python 2 demise coming soon, and me not being interested in keeping a server around nearly exclusively to run Planet Multimedia, I started looking into this again. The first thing I realized is that I want both to reuse as much existing code as I can, and to integrate with “modern” professional technologies such as OpenTelemetry, which I have come to appreciate at work, even if it sounds like overkill.

But that’s where things get complicated: while going full “left-pad”, with a module for literally everything, is not something you’ll find me happy about, a quick look at feedparser, probably the most common module to read feeds in Python, shows just how much code is spent trying to cover for old Python versions (before 2.7, even), or to implement minimal viable interfaces to avoid mandatory dependencies altogether.

Thankfully, as Samuel from NewsBlur pointed out, it’s relatively trivial to just fetch the feed with requests, and then pass it down to feedparser. And since there are integration points for OpenTelemetry and requests, having an instrumented feed fetcher shouldn’t be too hard. That’s probably going to be my first focus when writing Tanuga, next weekend.
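
The fetch-then-parse split looks roughly like this; note that the OpenTelemetry package and class names for the requests integration have been shifting between releases, so treat those imports as an assumption to verify:

```python
import feedparser
import requests
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# Every requests call from here on carries tracing spans.
RequestsInstrumentor().instrument()


def fetch_feed(url: str) -> feedparser.FeedParserDict:
    # requests handles the HTTP side (redirects, timeouts, TLS), so
    # feedparser only has to do what it is actually good at: parsing.
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return feedparser.parse(response.content)
```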

Speaking of NewsBlur, the chat with Samuel also made me realize how much of it is still tied to Python 2. Since I’ve gathered quite a bit of experience porting code to Python 3 at work, I’m trying to find some personal time to contribute smaller fixes towards running it on Python 3. The biggest hurdle I’m having right now is setting it up on a VM so that I can start it up in Python 2 to begin with.

Why am I back looking at this pseudo-actively? Well, the main reason is that rawdog is still using Python 2, and that is going to be a major security pain next year. But it’s also the last non-static website that I run on my own infrastructure, and I really would love to get rid of it entirely. Once I do that, I can at least stop running my own (dedicated or virtual) servers. And that’s going to save me time (and money, but time is the more important of the two here).

My hope is that once I find a good solution to migrate Planet Multimedia to a Cloud solution, I can move the remaining static websites to other solutions, likely Netlify like I did for my photography page. And after that, I can stop the last remaining server, and be done with sysadmin work outside of my flat. Because honestly, it’s not worth my time to run all of these.

I can already hear a few folks complaining with the usual remarks of “it’s someone else’s computer!” — but the answer is that yes, it’s someone else’s computer, but a computer of someone who’s paid to do a good job with it. This is possibly the only way for me to manage to cut away some time to work on more Open Source software.

CP2110 Update for 2019

The last time I wrote about the CP2110 adapter was nearly a year ago, and because I have had a lot to keep me busy since, I have not been making much progress. But today I had some spare cycles and decided to take a deeper look starting from scratch again.

What I should have done since then was to procure myself a new serial dongle, as I was not (and still am not) entirely convinced about the quality of the CH341 adapter I’m using. I think I used that serial adapter successfully before, but maybe I didn’t, and I’ve been fighting with ghosts ever since. This counts double because, silly me, I didn’t re-read my own post when I resumed working on this, and I have been scratching my head at nearly exactly the same problems as last time.

I have some updates first. The first of which is that I have some rough-edged code out there on this GitHub branch. It does not really have all the features it should, but it at least let me test the basic implementation. It also does not actually let you select which device to open — it looks for the device with the same USB IDs as mine, and that might not work at all for you. I’ll be happy to accept pull requests to fix more of the details, if anyone happens to need something like this too — once it’s actually in a state where it can be merged, I’ll do a squash commit and send a pull request upstream with the final working code.

The second is that while fighting with this, and venting on Twitter, Saleae themselves put me on the right path: when I said that Logic failed to decode the CP2110→CH341 conversation at 5V but worked when they were set at 3.3V, they pointed me at the documentation of threshold voltage, which turned out to be a very good lead.

Indeed, when connecting the CP2110 at 5V alone, Logic reports a high of 5.121V, and a low of ~-0.12V. When I tried to connect it with the CH341 through the breadboard full of connections, Logic reports a low of nearly 3V! And as far as I can tell, the ground is correctly wired together between the two serial adapters — they are even connected to the same USB HUB. I also don’t think the problem is with the wiring of the breadboard, because the behaviour is identical when just wiring the two adapters together.

So my next step has been setting up the BeagleBone Black I bought a couple of years ago and shelved into a box. I should have done that last year, and I would probably have been very close to having this working in the first place. After setting it up (which is much easier than it sounds), and figuring out the pinout of its debug serial port from the BeagleBoard Wiki (with a bit of guesswork on the voltage), I could confirm the data was being sent to the CP2110 correctly — but it got all mangled on print.

The answer was that HID buffered reads are… complicated. So instead of deriving most of the structure from the POSIX serial implementation, I lifted it from the RFC2217 driver, which uses a background thread to loop the reads. This finally allowed me to use the pySerial miniterm tool to log in to the BBB over the CP2110 adapter, and even run dmesg(!), which I consider a win.
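
A heavily simplified sketch of that background-read approach, assuming the Python hidapi bindings; the real code has to deal with report framing and partial reads, which this glosses over:

```python
import queue
import threading

import hid  # the Python hidapi bindings


class HidReader:
    def __init__(self, vendor_id, product_id):
        self._device = hid.device()
        self._device.open(vendor_id, product_id)
        self._buffer = queue.Queue()
        thread = threading.Thread(target=self._read_loop, daemon=True)
        thread.start()

    def _read_loop(self):
        while True:
            # Each HID read returns a whole report; queue the bytes so
            # read() can hand them out at whatever granularity the
            # caller asks for.
            for byte in self._device.read(64):
                self._buffer.put(byte)

    def read(self, size=1):
        return bytes(self._buffer.get() for _ in range(size))
```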

Tomorrow I’ll try polishing the implementation to the point where I can send a pull request. And then I can actually set up to look back into the glucometer using it. Because I had an actual target when I started working on this, and was not just trying to get this to work for the sake of it.

Glucometerutils News: Continuous Integration, Dependencies, and Better Code

You may remember glucometerutils, my open source Python tool to download glucometer data from meters that do not provide Linux support (as in, any of them).

While the tool started a few years ago out of my personal need, this year there has been a bigger push than before, with more contributors trying the tool out, finding problems, and fixing bugs. For my part, I managed a few fits of productivity on the tool, particularly this past week at 34C3, when I decided it was due time to start making the package shine a bit more.

So let’s see what the more recent developments for the tool are.

First of all, I decided to bump the Python version requirement to Python 3.4 (previously, it was Python 3.2). The reason for this is that it gives access to the mock module for testing, and the enum module to write actual semantically-defined constants. While both of these could be provided as dependencies to support the older versions, I can’t think of any good reason not to upgrade from 3.2 to 3.4, and thus no need to support those versions. I mean, even Debian Stable has Python 3.5.

And while talking about Python versions, Hector pointed me at construct, which right away looked like an awesome library for dealing with binary data structures. It turned out to be a bit rougher around the edges than I expected from the docs, particularly because the docs do not contain enough information to actually use it with proper dynamic objects, but it does make a huge difference compared to dealing with bytestrings manually. I had already started using it in Leipzig to parse the basic frames of the FreeStyle protocol, and then proceeded to rewrite the other binary-based protocols, between the airports and home.

This may sound like a minor detail, but I actually found it made a huge difference, as the library already provides proper support for validating expected constants, as well as dealing with checksums — although in some cases it’s a bit heavier-handed than I expected. The library also supports defining bit structures, which simplified the OneTouch Ultra Easy driver considerably, as it had been building its own poor developer’s version of the same idea. After rewriting the two binary LifeScan drivers I have (for the OneTouch Verio 2015 and Select Plus, and the Ultra Easy/Ultra Mini), the similarities between the two protocols are much easier to spot. Indeed, after porting the second driver, I also refactored the first a bit, to make the two look more alike.
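
To give a flavour of what this looks like, here is a hypothetical frame layout in construct, not the actual FreeStyle or LifeScan structure:

```python
import construct

# A made-up frame: start byte, length, variable payload, checksum.
# construct.this.length is what makes the payload size dynamic.
Frame = construct.Struct(
    "stx" / construct.Const(0x02, construct.Byte),
    "length" / construct.Byte,
    "payload" / construct.Bytes(construct.this.length),
    "checksum" / construct.Byte,
)

packet = Frame.parse(b"\x02\x03abc\x28")
assert packet.payload == b"abc"
```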

This is going to be useful again soon, because two people have asked for support for the OneTouch Verio IQ (which, despite the name, shares nothing with the normal Verio — this one uses an on-board cp210x USB-to-serial adapter), and I somehow expect that, while not compatible, the protocol is likely to be similar to the other two. I found one for cheap on Amazon Germany, and ended up ordering it — it would be the easiest one from my backlog to reverse engineer, because it uses a driver I already know is easy to sniff (unlike other serial adapters that use strange framing — I’m looking at you, FT232RL!), and the protocol is likely not to stray too far from the other LifeScan protocols, even though it’s not directly compatible.

I have also spent some time on the tests that are currently present. Unfortunately they don’t currently cover much of anything besides some common internal libraries. I have, though, decided to improve the situation, if a bit slowly. First of all, I picked up a few of the recommendations I give my peers at work during Python reviews, and started using the parameterized module that comes with Abseil, which was recently released as open source by Google. This reduces the tedious repetition of building similar tests to exercise different paths in the code. Then, I’m very thankful to Muhammad for setting up Travis for me, as that now allows the tests to show breakage, where there is any coverage at all. I’ll try to write more tests this month, to make sure more drivers are exercised.
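
The shape of such a test looks like this; the function under test is made up for illustration:

```python
from absl.testing import absltest, parameterized


def parse_glucose(raw):
    # Hypothetical function under test: mg/dL string to mmol/L.
    return int(raw) / 18.0


class ParseGlucoseTest(parameterized.TestCase):
    # One decorator, one test body, as many cases as you like.
    @parameterized.parameters(
        ("90", 5.0),
        ("180", 10.0),
    )
    def test_parse(self, raw, expected):
        self.assertEqual(parse_glucose(raw), expected)


if __name__ == "__main__":
    absltest.main()
```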

I’ve also managed to make the project’s setup.py more useful. Indeed, it now correctly lists the dependencies for most of the drivers as extras, and I may even be ready to make a first release on PyPI, now that I have tested most of the devices I have at home and they all work. Unfortunately this is currently partly blocked on Python SCSI not having a release on PyPI itself. I’ll get back to that, possibly next month at this point. For now you can install it from GitHub and it should all work fine.
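
The extras end up looking roughly like this in setup.py; the extra names and dependency lists here are illustrative, not the project’s actual ones:

```python
from setuptools import setup

setup(
    name="glucometerutils",
    version="0.0.0",  # placeholder
    # Per-driver dependencies, declared as optional extras so that the
    # base install stays dependency-free.
    extras_require={
        "serial": ["pyserial"],
        "hid": ["hidapi"],
    },
)
```

With that in place, a user can install just the dependencies their meter needs, for instance with pip install "glucometerutils[serial]".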

As for the future, there are two in-progress pull requests/branches from contributors to add support for graphing the results, one using rrdtool and one using gnuplot. These are particularly interesting for users of the FreeStyle Libre (which is the only CGM I have a driver for), and someone expressed interest in adding a Prometheus exporter, because why not — this is not as silly as it may sound, as the output graphs of the Libre’s own software look more like my work monitoring graphs than the usual glucometer graphs. Myself, I am now toying with the idea of mimicking the HTML output that the Accu-Chek Mobile generates on its own. That would be the easiest thing to just send by email to a doctor, and can probably be used as a basis to extend things further, integrating the other graph outputs.

So onwards and upwards, the tooling will continue being built upon. And I’ll do my best to make sure that Linux users who need to download their meters’ readings have at least some tooling that they can use — and one that does not require setting up unsafe MongoDB instances on cloud providers.

glucometerutils news: many more meters, easier usage and Windows support

You probably have noticed by now that I write about glucometers quite a bit, not only reviewing them as a user, but also reverse engineering them to figure out their protocols. This all started four years ago, when I needed to send my glucometer readings to my doctor and ended up having to write my own tool.

That tool started almost as a joke, particularly given I wrote it in Python, which at the time I was not at all an expert in (I have since learnt a lot more about it, and at work I got to be more of an expert than I’d ever expected to be). But I always knew that it would be, for the most part, just a proof of concept: not only is exporting CSV mostly useless, but the most important part of diabetes management software is the analysis, and I don’t have any clue how to do analysis.

At first I thought I could reuse some of the implementation to expand Xavier’s OpenGlucose, but it turned out that it’s not really easy for those meters that use serial adapters or other USB devices besides the HID ones he already implemented. Of course this does mean it would probably work fine for things like the FreeStyle Libre, for which I appear to have written the only Linux download software, but even in that case, things are more complicated.

Indeed, as I have noted here and there previously, we need a better format to export glucometer data, and in particular the data from continuous or mixed meters like the Libre. My current output format only includes the raw glucose readings from the meter that are not marked as errors; it does provide an unstructured text comment that tells you whether the reading comes from the background sensor, an explicit scan, or a blood sample, but it does not provide the full level of detail of the original readings. And it does not expose ketone readings at all, despite the fact that most of the FreeStyle-line devices support them, and I have even documented how to get them. But this is a topic for a different post, I think.

On the other hand, over the past four years the number of supported meters has increased significantly, and I even have a few more that I have only partially reversed and not published yet. Currently there are 9 drivers, covering over a dozen meters (some meters share the same driver, either because they are just rebranded versions or simply because they share the same protocol). One is for the InsuLinx, which also loses a bunch of details, and is based on Xavier’s own reverse engineering — I did that mostly because all the modern FreeStyle devices appear to share the same basic protocol, so writing new drivers for them is actually fairly trivial.

This would make the project an interesting base if someone feels like writing a proper UI for it. If I were to look into that, I might end up just starting an HTTP server and providing everything over HTML for the browser to render. After all, that’s actually how OpenGlucose does things, except there is no server and the browser is embedded. Alternatively, one could just write an HTML report file out, the same way the Accu-Chek Mobile does using data URLs and JavaScript bundles.

One of the most important usability changes I have added recently, though, is allowing the user not to specify the device path. When I started writing the tool, I began with serial adapter based devices, which usually come with their own cable, and you just access it. The next driver was for the LBA-over-SCSI protocol used in the OneTouch Verio, which I could have auto-detected but didn’t, and for the following ones, mostly based on HID, I just expected to be given an hidraw path.

But all of this is difficult, and indeed I had more than a few people asking me which device they were meant to use, so over the past few months I adapted the drivers to try auto-detecting the devices. For the serial port based meters, the auto-detection targets the original manufacturer’s cable, so if you have a custom one, you should still pass them a path. For HID based devices, you also need the Python hidapi library, because I couldn’t bother writing my own HID bus parsing on Linux…
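
Roughly how the HID side of the auto-detection can work: enumerate the bus and match on the IDs the driver knows about (the IDs below are placeholders, not a real meter’s):

```python
import hid

VENDOR_ID = 0x1234  # placeholder
PRODUCT_ID = 0x5678  # placeholder

# hid.enumerate() returns one dict per matching device on the bus,
# including the path that can then be passed to hid.device().open_path().
matches = [
    device["path"] for device in hid.enumerate(VENDOR_ID, PRODUCT_ID)
]
if not matches:
    raise Exception("meter not found; pass the device path explicitly")
```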

… and the library also brings another important feature: it works on non-Linux operating systems. Indeed I now have not one but two confirmed users who managed to use the tool on Windows, for two separate FreeStyle devices (at the time of writing, the only ones implementing HID-based protocols, although I have another one in the pipeline).

Supposedly, all of this should work fine on macOS (I almost called it OS X), though the one person who contacted me trying to get it working there has been having trouble with it — I think the problem has something to do with the Python version available (I’m targeting Python 3, because I had a very hard time doing the right processing with Python 2.7). So if you want to give it a try, feel free.

And yes, I’m very happy to receive pull requests, so if you want to implement that HTML output I talked about above, you’re awesome and I’m looking forward to your pull request. I’m afraid I won’t be much help with the visualisation, though.

Planets, feeds and blogs

You have probably noticed that last month I replaced Harvester with rawdog for Planet Multimedia. The reason is easy to explain: Harvester requires libraries that only work with Ruby 1.8 — and while moving to Ruby 1.9 or 2 would mean being able to use the feedfetcher (the same one used by IFTTT), my attempts at updating the code to work with a more modern version of Ruby have all been failures.

Since I did not intend to be swamped with one more custom tool to maintain, I turned to another standard tool to implement the aggregator, rawdog — holding my nose at the use of darcs for source control, and at the name (please don’t google for it at work without safe search on). The nice part about using this tool is that it’s already packaged in Gentoo, so it’s handled straight by portage with binary packages. Unfortunately, the default templates are terrible, and the settings non-obvious, but Luca was able to make the best out of it.

But more and more problems became obvious with time. The first is that the tool does not respect return codes at exit — it always returns zero (success) even if the processing was incomplete. It took me two weeks to figure out that the script was failing when run from cron because the environment lacked the locale settings: the cron logs said that everything was alright, and since I use fcron, it also did not send me any email, as I set it to mail me only on errors.

A couple of days ago, I got complaints again that the Planet was not updating; again, no error in the cron logs, no error in my email. I ran the command manually, and was told that Luca’s feed, on blogs.gentoo.org, was unreachable. Okay, sure. But then it did not resolve itself when the site came back up. Today I looked into it again, and J-B’s and Rémi’s blog feeds were unreachable. Once again, no non-zero exit status, thus no mail, no error in the logs. This is not the way it should behave.

But that’s not all. The other problem with rawdog is that it does not, by default, support generating a feed of the aggregated content, like Harvester (and Planet/Venus) do. I found that Jonathan Riddell actually built a plugin for Planet KDE to generate the feed, but I haven’t tested it yet, because I have not found the authoritative source of it, just multiple copies of it on different websites. It also produces RSS feeds, rather than Atom feeds. And I’m sorry to say, but Atom is much preferred, for me.

So where does that leave us? I’m not going to try fixing rawdog, I’m afraid, mostly because I don’t intend to spend time with darcs. My options are either to go back to Harvester, and fix it to not use DBI and to support Ruby 1.9, or to try adapting parts of NewsBlur – which already deal with aggregating feeds and producing new feeds – into an alternative to rawdog. If I am to do something like that, though, I’m most likely going to take my dear time and make it a web-configurable tool, rather than something that needs to be configured on the command line or with configuration files.

The reason for that is, very simply, that I’m growing fond of doing most of my work in a browser when I can, and this looks like a perfect fit for that approach. Even more so if you can give someone else access to look into it — and if you can avoid storing passwords.

So, any takers to help me with this project?

Book Review: Learning Cython Programming

Thanks to PacktPub I got a chance to read an interesting book this week: Learning Cython Programming by Philip Herron (Amazon, Packt). I was curious because, as you probably noticed, after starting at my current place of work last April, I ended up having to learn Python, which led to me writing my glucometer utilities in that language, contrary to most of my other work, which has been done in Ruby. But that’s a topic for a different post.

First of all, I was curious about Cython: I had heard the name before but never looked much into it, and when I saw the book’s title and quickly checked what it was about, my interest was definitely piqued. If you haven’t looked into it either, a quick summary is that it’s a code-generating bridge between Python and good plain old C, wrapping the latter so that you can either make it run Python callbacks, or generate a shared object module that Python can load, offloading the computation-intensive code to a more performant language. It looks a lot like a well-designed and well-implemented version of what I hoped to get in Ruby with Rust — no connection with Mozilla’s language of the same name.
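
The build side is pleasantly small, too; a minimal setup.py that compiles a Cython module, assuming a fib.pyx sitting next to it, is just this:

```python
# Build in place with: python setup.py build_ext --inplace
# The resulting shared object is importable from Python as "fib".
from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("fib.pyx"))
```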

The book is a quick starter, short and to the point, which is an approach I like. Together with the downloadable source code, it makes for a very good way to learn Cython, and I recommend it if you’re interested. Not only does it cover the language itself, it covers a wide range of use cases that show how to make good use of the options provided by Cython. It even goes on to show how to integrate it into a build system (although I have some reservations about the Autotools code in there, which I think I’ll send Philip a correction for).

I seriously wish I had had Cython and this book when I was working on Hypnos, an Ultima OnLine «server emulator» to which I wanted to add Python-based scripting — other emulators at the time used either a very simple, almost BASIC-like scripting language, Perl, or C#. This was before I tried to use Python for real, which made me hate its whitespace-based indentation. I did write some support for it, but it was a long and tedious process, so I never finished it. Not only would Cython have made that work much less tedious, but the book shows exactly how to add Python scripting capabilities to a bigger C program, using tmux as the example.

The book does not ignore the shortcomings of Cython, of course, including the (quite clumsy) handling of exceptions when crossing the foreign language barrier. While there are still a bunch of issues to be straightened out, I think the book is fairly good at setting proper expectations for Cython. If you’re interested in the language, the book looks like the perfect fit.

Diabetes control and its tech, take 4: glucometer utilities

This is one of the posts I lost due to the blog problems with draft autosaving. Please bear with the possibly missing pieces that I might be forgetting.

In the previous post on the subject I pointed out that, thanks to a post on a forum, I was able to find out how to talk with the OneTouch Ultra 2 glucometers I have (the two of them) — the documentation assumes you’re using HyperTerminal on Windows, and thus does not work when using either picocom or PySerial.

Since I had the documentation from LifeScan for the protocol, starting to write a utility to access the device was the obvious next step. I’ve published what I have right now in a GitHub repository, and I’m going to write a bit more on it today, after a month of procrastination and other tasks.

While writing the tool, I found another issue with the documentation: every single line returned by the glucometer ends with a four-digit (hex) checksum, but the documentation does not describe how the checksum is calculated. By comparing some strings with known checksums, I originally guessed it might be what I found called “CRC16-Syck” — unfortunately that also meant that the only library implementing it was a GPL-3 one, which clashed with my idea of a loose copyleft license for the tools.

But after introducing the checksum verification, I found out that the checksums did not really match. So, after more looking around with Google and in forums, I got told that the checksum is a 16-bit variation of Fletcher’s checksum, calculated in 32-bit but dropping the higher half… and indeed it would then match; but when I then looked at the code, I found out that “32-bit Fletcher reduced to 16-bit” is actually “a modulo-16-bit sum of all the bytes”. It’s the most stupid and simple checksum.
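
Just to show how simple it really is, the whole thing fits in one line of Python:

```python
def checksum(line):
    # A plain sum of all the bytes in the line, truncated to 16 bits.
    return sum(line) & 0xFFFF
```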

Interestingly enough, the newer glucometers from LifeScan use a completely different protocol: it’s binary-based and uses a standard CRC16 implementation.

I’ve been doing my best to design the utility so that there is a workable library as well as a command-line utility (so that a graphical interface can be built on top of it), and at the same time I tried making it possible to have multiple “drivers” that implement access to the glucometer commands. The idea is that this way, if somebody knows the protocol for other devices, they can implement support without rewriting, or worse duplicating, the tool. So if you own a glucometer and want to add support for it to my tool, feel free to fork the repository on GitHub and submit a merge request with the driver; the interface would be along the lines of the sketch below.
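
Something like this hypothetical shape — the method names are made up, not necessarily what the tool ends up using:

```python
# One subclass per supported glucometer protocol, all exposing the same
# operations to the shared command-line utility.
class GlucometerDriver:
    def connect(self):
        raise NotImplementedError

    def get_serial_number(self):
        raise NotImplementedError

    def get_readings(self):
        """Yield the readings stored on the device."""
        raise NotImplementedError
```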

A final note I want to leave about possible Android support. I have been keeping in mind the option of writing an Android app to dump the readings on the go. Hopefully it’s still possible to build Android apps for the market in Python, but I’m not sure about it. At the same time, there is a more important problem: even though I could connect my phone (Nexus 4) to the glucometer with a USB OTG cable and the one LifeScan sent me, the LifeScan cable has a PL2303 in it, and I doubt that most Android devices would support it anyway.

The other alternative I can think of is to find a userland implementation of PL2303 that lets me access it as a serial port without the need for a kernel driver. If somebody knows of any software already written to solve this problem, I’ll be happy to hear about it.

I think I’ll keep away from Python still

Last night I ended up in Bizarro World, hacking at Jürgen’s gmaillabelpurge (which he actually wrote at my request — thanks once more, Jürgen!). Why? Well, the first reason was that I found out that it hadn’t been running for the past two and a half months because, for whatever reason, the default Python interpreter on the system where it was running was changed from 2.7 to 3.2.

So I first tried to get it to work with Python 3 while keeping it working with Python 2 at the same time; some of the syntax changed ever so slightly and was easy to fix, but the 2to3 script that comes with it is completely bogus. Among other things, it adds parentheses on all the print calls… which would be correct if it checked that said parentheses weren’t there already. In a script like the aforementioned one, the noise in the output is so high that there is really no signal worth reading.

You might be asking how come I didn’t notice this before. The answer is that I’m an idiot! I found out only yesterday that my firewall configuration was such that postfix was not reachable from the containers within Excelsior, which meant I never got the fcron notifications that the job was failing.

While I wasn’t able to fix the Python 3 compatibility, I was at least able to understand the code a little by reading it, and after remembering something from the IMAP4 specs I read a long time ago, I was able to optimize its execution quite a bit, more than halving the runtime on big folders (like most of the ones I have here) by using batch operations, and by peeking at, instead of “seeing”, the headers. In the end, I spent some three hours on the script, give or take.
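
The gist of the optimization, in imaplib terms, is one FETCH for a whole range of messages, using BODY.PEEK so the fetch doesn’t set the \Seen flag as a side effect (server, credentials and range below are obviously illustrative):

```python
import imaplib

conn = imaplib.IMAP4_SSL("imap.gmail.com")
conn.login("user@example.com", "password")
conn.select("INBOX")

# One round trip for a hundred messages, instead of one FETCH per
# message; PEEK avoids marking them as read along the way.
typ, data = conn.fetch("1:100", "(BODY.PEEK[HEADER])")
```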

But at the same time, I ended up having to work around limitations in Python’s imaplib (which is still nice to have by default), such as it reporting fetched data as an array where each odd entry is a pair of strings (tag and unparsed headers) and each even entry is a string with a closed parenthesis (coming from the tag). Since I wasn’t able to sleep, at 3.30am I started rewriting the script in Perl (which at this point I know much better than I’ll ever know Python, even if I’m a newbie at it); by 5am I had all the features of the original, and I was supporting non-English locales for GMail — remember my old complaint about natural language interfaces? Well, it turns out that the solution is to use the Special-Use Extension for IMAP folders; I don’t remember this explanation page being around when we first worked on that script.

But this entry is about Python, not the script per se (you can find the Perl version on my fork if you want). I have said before that I dislike Python, and my feeling is still unchanged at this point. It is true that the script in Python required no extra dependencies, as the standard library already covered all the bases… but at the same time, that’s about it: it has the basics; for something more complex you still need new modules. Perl modules are generally easier to find, easier to install, and less error-prone — don’t try to argue this; I’ve got a tinderbox that reports Python test errors more often than even Ruby’s (which are lots), and most of the time for the same reasons, such as the damn unicode errors “because LC_ALL=C is not supported”.

I also still hate the fact that Python forces me to indent code to have blocks. Yes, I agree that indented code is much better than non-indented code, but why on earth should the indentation mandate the blocks, rather than the other way around? What I usually do in Emacs when I’m moving stuff in and out of loops (which is what I had to do a lot on the script, as I was replacing per-message operations with bulk operations) is basically to add the curly brackets in a different place, select the region, and C-M-\ it — which means that it’s re-indented following my brackets’ placement. If I see an indent I don’t expect, it means I made a mistake with the blocks and I’m quick to fix it.

With Python, I end up having to manage the whitespace to have it behave as I want, which is quite a bit more bothersome, even with the C-c < and C-c > shortcuts in Emacs. I find the whole thing obnoxious. The other problem is that, while Python does provide basic access to a lot more functionality than Perl, its documentation is… spotty at best. In the case of imaplib, for instance, the only real way to know what it’s going to give you is to print the returned value and check against the RFC — and it does not seem to have a half-decent way to return the UIDs without having to parse them. This is simply… wrong.

The obvious question from people who know me would be “why did you not write it in Ruby?” — well… recently I’ve started second-guessing my choice of Ruby, at least for simple one-off scripts. For instance, the deptree2dot tool that I wrote for OpenRC – available here – was originally written as a Ruby script… then I converted it to a Perl script half the size and twice the speed. Part of it, I’m sure, is just a matter of age (Perl has been optimized over a long time, much more than Ruby), and part of it is their being different tools for different targets: Ruby is nowadays mostly a language for long-running software (due to webapps and so on), and much more object oriented, while Perl is streamlined, top-down execution style…

I do expect to find the time to convert even my scan2pdf script to Perl (funnily enough, gscan2pdf, which inspired it, is written in Perl), although I have no idea yet when… in the meantime, though, I doubt I’ll write many more Ruby scripts for this kind of processing.