Don’t Ignore Windows 10 as a Development Platform for FLOSS

Important Preface: This blog post was originally written on 2020-05-12, and scheduled for later publication, inspired by this short Twitter thread. As such it well predates Microsoft’s announcement of expanding WSL2 support to graphical apps. I considered trashing the post, or seriously re-editing it in light of the announcement, but I honestly lack the energy to do that right now. It does leave a bad taste in my mouth to know that it will likely get drowned out by the noise of the new WSL2 features announcement.

Given the topic of this post I guess I need to add a preface to point out my “FLOSS creds” — because I have already seen too many attacks on people who even use Windows at all. I have been an opensource developer for over fifteen years now, and part of the reason why I left my last bubble was because it made it difficult for me to contribute to various opensource projects. I say this because I’m clearly a supporter of Free Software and Open Source, wherever possible. I also think that different people have different needs, and that ignoring that is a failure of the FLOSS movement as a whole.

The “Year of Linux on the Desktop” is now a meme that has run its course to the point of being annoying. Despite what FLOSS advocates keep saying, “Linux on the Desktop” is not really moving, and while I do have some strong opinions on this, that’s for another day. Most users, and in particular newcomers to FLOSS (both as users and as developers), are probably using a more “user friendly” platform — and if you leave a comment with the joke about UNIX being selective with its friends, you’ll end up in a plonkfile, be warned.

About ten years ago, it seemed like the trend was for FLOSS developers to use MacBooks as their daily laptops. I did that for a while myself — a UNIX-based platform with all the tools of the trade, which allowed quite a bit of work to be done without having access to a Linux platform. SSH, Emacs, GCC, Ruby, and so on. And at the same time, you had the stability of Mac OS X, the battery life, and hardware that worked great out of the box. But more recently, Apple’s move towards “walled gardens” has been taking away from this feasibility.

But back to the main topic. For many years now, I’ve been using a “mixed setup” — a Linux laptop (or more recently desktop) for development, and a Windows (7, then 10) desktop for playing games, editing photos, designing PCBs, and logic analysis. The latter is because Saleae Logic takes a significant amount of RAM when analysing high-frequency signals, and since I have been giving my gamestations as much RAM as I can just for Lightroom, it makes sense to run it on the machine with 128GB of RAM.

But more recently I have been exploring the ability of using Windows 10 as a development platform. In part because my wife has been learning Python, and since also learning a new operating system and paradigm at the same time would have been a bloody mess, she’s doing so on Windows 10 using Visual Studio Code and Python 3 as distributed through the Microsoft Store. While helping her, I had exposure to Windows as a Python development platform, so I gave it a try when working on my hack to rename PDF files, which turned out to be quite okay for a relatively simple workflow. And the work on the Python extension keeps making it more and more interesting — I’m not afraid to say that Visual Studio Code is better integrated with Python than Emacs, and I’m a long-time user of Emacs!

In the last week I have stepped up even further how much development I’m doing on Windows 10 itself. I have been using Hyper-V virtual machines for Ghidra, to make use of the bigger screen (although admittedly I’m just using RDP to connect to the VM, so it doesn’t really matter that much where it’s running), and in my last dive into the Libre 2 code I felt the need for a fast and responsive editor to go through and execute parts of the disassembled code to figure out what it’s trying to do — so once again, Visual Studio Code to the rescue.

Indeed, Windows 10 now comes with an SSH client, and Visual Studio Code integrates very well with it, which meant I could just edit the files saved in the virtual machine, and have the IDE also build them with GCC and execute them to get myself an answer.

Then while I was trying to use packetdiag to prepare some diagrams (for a future post on the Libre 2 again), I found myself wondering how to share files between computers (to use the bigger screen for drawing)… until I realised I could just install the Python module on Windows, and do all the work there. Except for needing sed to remove an incorrect field generated in the SVG. At which point I just opened my Debian shell running in WSL, and edited the files without having to share them with anything. Uh, score?
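For completeness, that sed step could itself have been a couple of lines of Python, skipping the WSL detour entirely. A minimal sketch; the regex here is a placeholder, since I’m not reproducing the exact field that packetdiag got wrong:

```python
# strip_svg_field.py — sketch of replacing the sed invocation with Python.
# The pattern below is a placeholder, not the actual offending field.
import re
import sys
from pathlib import Path

path = Path(sys.argv[1])
svg = path.read_text(encoding="utf-8")
# Drop the bogus attribute wherever it appears in the generated SVG.
svg = re.sub(r'\s+bogus-attribute="[^"]*"', "", svg)
path.write_text(svg, encoding="utf-8")
```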

So I have been wondering: what’s really stopping me from giving up my Linux workstation for most of the time? Well, there’s hardware access — glucometerutils wouldn’t really work on WSL unless Microsoft plans a significant amount of compatibility interfaces to be integrated. The same goes for hardware SSH tokens — despite PC/SC being a Windows technology to begin with. Screen and tabbed terminals are definitely easier to come by on Linux right now, but I’ve seen tweets about a modern terminal being developed by Microsoft, and even released as FLOSS!

Ironically, I think it’s editing this blog that is the most miserable experience for me on Windows. And not just because of the different keyboard (as I share the gamestation with my wife, the keyboard is physically a UK keyboard — even though I type US International), but also because I miss my compose key. You may have noticed already that this post is full of em-dashes and en-dashes. Yes, I have been told about WinCompose, but the last time I tried using it, it didn’t work and even screwed up my keyboard altogether. I’m now trying it again, at least on one of my computers, and if it doesn’t explode in my face this time, I may roll it out to the rest as well.

And of course it’s probably still not as easy to set up a build environment for things like unpaper (although at that point, you can definitely run it in WSL!), or to have a development environment for actual Windows applications. But this is all a matter of a different set of compromises.

Honestly speaking, it’s very possible that I could survive with a Windows 10 laptop for my on-the-go opensource work, rather than the Linux one I’ve been using. With the added benefit of being able to play Settlers 3 without having to jump through all the hoops from the last time I tried. Which is why I decided that the pandemic lockdown is the perfect time to try this out: I barely use my Linux laptop anyway, since I have a working Linux workstation available all the time. I have indeed reinstalled my Dell XPS 9360 with Windows 10 Pro, and installed both a whole set of development tools (Visual Studio Code, Mu Editor, Git, …) and a bunch of “simple” games (Settlers, Caesar 3, Pharaoh, Age of Empires II HD); Discord ended up somewhere in between, since it’s actually what I use to interact with the Adafruit folks.

This doesn’t mean I’ll give up on Linux as an operating system — but I’m a strong supporter of “software biodiversity”, so the same way I try to keep my software working on FreeBSD, I don’t see why it shouldn’t work on Windows. And in particular, I have always found providing FLOSS software on Windows a great way to introduce new users to the concept of FLOSS — and focusing on providing FLOSS development tools means giving people an even bigger chance to build more FLOSS tools.

So is everything ready and working fine? Far from it. There are a lot of rough edges, some of which I found myself, which is why I’m experimenting with developing more on Windows 10: to see what can be improved. For instance, I know that the reuse-tool has trouble with the encoding of input arguments, since PowerShell appears to still not default to UTF-8. And I failed to use pre-commit for one of my projects — although I have not yet taken note of what exactly failed, to start fixing it.
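For anyone hitting similar encoding issues, it helps to first check what Python itself sees, to narrow down whether the mangling happens in PowerShell or in the tool itself. A minimal probe, standard library only:

```python
# encoding_probe.py — print the encodings Python actually sees on Windows;
# handy when a tool mangles non-ASCII command-line arguments.
import locale
import sys

print("stdout encoding:    ", sys.stdout.encoding)
print("filesystem encoding:", sys.getfilesystemencoding())
print("preferred encoding: ", locale.getpreferredencoding())
# Python 3.7+ can be forced into UTF-8 mode (PEP 540) by running with
# `python -X utf8` or by setting the PYTHONUTF8=1 environment variable.
print("utf8 mode:          ", sys.flags.utf8_mode)
```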

Another rough edge is in documentation. Too much of it assumes a UNIX environment only, and what little covers Windows at all tends to assume “old school” batch files are in use (for instance for activating a Python virtualenv), rather than the more modern PowerShell. This is not new — a lot of modern documentation is only valid for bash, and if you were to use an older operating system such as Solaris you would find yourself lost with the tcsh differences. You could probably see similar concerns back in the days when bash was not yet the standard, and maybe we’ll have to go back to that kind of deal. Or maybe we’ll end up with some “standardization” of documentation that can be translated between different shells. Who knows.
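To make the gap concrete, here is a toy sketch of the kind of “translation” I mean: the same virtualenv activation step spelled out for each shell, using the standard venv script layout:

```python
# venv_hint.py — toy sketch: the "activate your virtualenv" step, which
# documentation usually shows for only one shell, spelled out for several.
import platform

def activation_commands(venv: str = "venv") -> dict:
    if platform.system() == "Windows":
        return {
            "cmd.exe": rf"{venv}\Scripts\activate.bat",
            "PowerShell": rf"{venv}\Scripts\Activate.ps1",
        }
    return {
        "bash/zsh": f"source {venv}/bin/activate",
        "fish": f"source {venv}/bin/activate.fish",
        "csh/tcsh": f"source {venv}/bin/activate.csh",
    }

for shell, command in activation_commands().items():
    print(f"{shell:>10}: {command}")
```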

But to wrap this up, I want to give a heads-up to all my fellow FLOSS developers: Windows 10 shouldn’t be underestimated as a development platform. And if they intend to be widely open to contributions, they should probably give a thought to how their code works on Windows. I know I’ll have to keep this in mind for my own future projects.

Windows Backup Solutions: trying out Acronis True Image Backup 2020

One of my computers is my Gamestation which, to be honest, has not run a game in many months now. It runs Windows out of necessity, but also because honestly sometimes I just need something that works out of the box. The main usage of that computer nowadays is Lightroom and Photoshop for my photography hobby.

Because of the photography usage, backups are a huge concern to me (particularly after movers stole my previous gamestation), and so I have been using a Windows tool called FastGlacier to store a copy of most of the important stuff on the Amazon Glacier service, in addition to letting Windows 10 do its FileHistory magic on an external hard drive. Not a cheap option, but (I thought) a safe and stable one. Unfortunately the software appears not to be developed anymore, and with one of the more recent Windows 10 updates it stopped working (and since I had set it up as a scheduled operation, it failed silently, which is the worst thing that can happen!)

My original plan for last week (at the time of writing) was to work on pictures, as I have shots from a trip over three years ago that I have still not gone through, rather than working on reverse engineering. But when I noticed the lacking backups, I decided to put that on hold until the backup problem was solved. The first problem was finding a backup solution that would actually work, and that wouldn’t cost an arm and a leg. The second problem was that, of course, most of the people I know are tinkerers who like Rube Goldberg solutions such as running rclone on Windows with the task scheduler (no thanks, that’s how I failed the Glacier backups).

I didn’t have particularly high requirements: I wanted a backup solution that would do both local and cloud backups — because Microsoft has been reducing the featureset of their FileHistory solution, and relying on it alone feels a bit flaky — and the ability to store more than a couple of terabytes on the cloud side (I have over 1TB of RAW shots!), even at a premium. I was not too picky on price, as I know features and storage are expensive. And I wanted something that would just work out of the box. A few review reads later, I found myself trying Acronis True Image Backup. A week later, I regret it.

I guess the best lesson I learnt from this is that Daniel is right, and it’s not just about VPNs: most review sites seem to give higher scores to the software they get more money from via affiliate links (you’ll notice that in this blog post there won’t be any!). So while a number of sites had great words for Acronis’s software, I found it sufficiently lacking that I’m ranting about it here.

So what’s going on with the Acronis software? First of all, while it does support both “full image” and “selected folders” modes, you definitely need to be aware that the backup is not usable as-is: you need the software to recover the data. Which is why it comes with bootable media, “survival kits”, and similar amenities. This is not a huge deal to me, but it’s still a bit annoying, when FileHistory used to allow direct access to the files. It also locks you into accessing the backup with their software, although Acronis keeps the restore option available even after you let your subscription expire, which is at least honest.

Then the next thing that became clear to me was that the speed of the cloud backup is not Acronis’s strongest suit. The original estimate for the 2.2TB of data I expected to back up was nearly six days, and it was on the mark. To be fair to Acronis, the process went extremely smoothly: it never got caught up, looped, crashed, or slowed down. The estimate was very accurate, and indeed, running this for about 144 hours was enough to have the full data backed up. Their backup status also shows the average speed of the process, which matched the 50Mbps I had estimated while the backup was running.

That speed is the first focus of my regret. 50Mbps is not terribly slow, and for most people it might be enough to saturate their Internet uplink. But not for me. At home, my line is provided by Hyperoptic, with a 1Gbps line that can sustain at least 900Mbps of upload. So seeing the backup bottlenecked like this was more than a bit annoying. And as far as I can tell, there’s no documentation of this limit on the Acronis website at the time of writing.

When I complained on Twitter about this, it was mostly in frustration at having to wait; I still considered the 50Mbps speed at least reasonable (although I would have considered paying a premium for faster uploads!). The replies I got from support have gotten me more upset than before. Their Twitter support people insisted that the problem was with my ISP, and sent me to their knowledgebase article on using the “Acronis Cloud Connection Verification Tool” — except that following the instructions showed I was supposed to be using their “EU4” datacenter, for which there is no tool. I was then advised to file a ticket about it. Since then, I appear to have moved back to “EU3” — maybe EU4 was not ready yet.

The reply to the ticket was even more of an absurdist mess. Besides a lot of words to explain that “speed is not our fault, your ISP may be limiting your upload” (fair, but I had already noted to them that I knew that was not the case), one of the steps they request you to follow is to go to one of their speedtest apps — which returns a 504 error from nginx, oops! Oh yeah, and you need to upload the logs via FTP. In 2020. Maybe I should call up Foone to help. (Windows 10, as it happens, still supports FTP write-access via File Explorer, but it’s not very discoverable.)

Both support people also kept reminding me that the backup is incremental, so after the first cloud backup, everything else should be a relatively small amount of data to copy. Except I’m not sold on that either: 128GB of data (which is the amount of pictures I came back from Budapest with) would still take nearly six hours to back up.
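The arithmetic is easy to check; a back-of-the-envelope sketch, ignoring protocol overhead and whatever deduplication the client manages:

```python
# backup_eta.py — back-of-the-envelope transfer time for a backup run.
def transfer_hours(size_gb: float, speed_mbps: float) -> float:
    bits = size_gb * 1e9 * 8              # decimal gigabytes to bits
    return bits / (speed_mbps * 1e6) / 3600

# 128GB of RAW files at the ~50Mbps Acronis sustains:
print(f"{transfer_hours(128, 50):.1f} hours")  # ~5.7 hours
```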

When I finally managed to get a reply that was not directly from a support script, they told me to run the speedtest against a different datacenter, EU2. As it turns out, this is their “Germany” datacenter. This was made very clear by tracerouting the IP addresses of the two hosts: EU3 is connected directly to LINX (the London Internet Exchange), while EU2 goes back to AMS (Amsterdam), then FRA (Frankfurt). The speedtest came out fairly reasonable (around 250Mbps download, 220Mbps upload), so I shared the data they requested in the ticket… and then wondered.

Since you can’t change the datacenter you back up to once you’ve started a backup, I tried something different: I used their “Archive” feature, and tried archiving a multi-gigabyte file to their Germany datacenter, rather than the United Kingdom one (against their recommendation to «select the country that is nearest to your current location»). Instead of a 50Mbps peak, I got a 90Mbps peak, with a sustained 67Mbps. This is still not particularly impressive, but it would have cut the six days down to three, and the six hours to around three. And clearly it sounds like their EU3 datacenter is… not good.

Anyway, let’s move on and look at local backups, which Acronis is supposed to take care of by itself. For this one I at first wanted to use the full-image backup, rather than selecting folders like I did for the cloud copy, since it would be much cheaper, and I have a 9TB external hard drive anyway… and when you do that, Acronis also suggests creating what they call the “Acronis Survival Kit” — which basically means making the external hard drive bootable, so that you can start up and restore the image straight from it.

The first time I tried setting it up that way, it formatted the drive, but it didn’t even manage to get Windows to recognise the new filesystem. I got an error message linking me to a knowledgebase article that… did not exist. This is more than a bit annoying, but I decided to run a full SMART check on the drive to be safe (no errors to be found), and to try again after a reboot. This time it finally seemed to work, but here’s where things got even more hairy.

You see, I’ve been wanting to use my 9TB external drive for the backup. A full image of my system was estimated at 2.6TB. But after the Acronis Survival Kit got created, the amount of space available for the backup on that disk was… 2TB. Why? It turned out that the Kit’s creation caused the disk to be repartitioned as MBR, rather than the more modern GPT. And with MBR you can’t have a (boot) partition bigger than 2TB. Which means that the creation of the Survival Kit silently decreased my available space to nearly a fifth!
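The 2TB ceiling is not arbitrary: it is baked into the MBR format, which stores partition sizes as 32-bit sector counts; with the usual 512-byte logical sectors that caps a partition at 2TiB. A quick check:

```python
# mbr_limit.py — why an MBR partition tops out at 2TiB: sector counts are
# 32-bit values, and external drives expose 512-byte logical sectors.
max_sectors = 2**32
sector_size = 512
limit_bytes = max_sectors * sector_size
print(limit_bytes)                 # 2199023255552
print(limit_bytes / 2**40, "TiB")  # 2.0 TiB
print(limit_bytes / 1e12, "TB")    # ~2.2 TB decimal
```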

The reply from Acronis on Twitter? According to them, my Windows 10 was started in “BIOS mode”. Except it wasn’t: it’s set up with UEFI and Secure Boot. And unfortunately it doesn’t seem like there’s an easy way to figure out why the Acronis software thinks otherwise. Worse than that, the knowledgebase article says that I should have gotten a warning about this — which I never did.

So what is it going to be at the end of the day? I tested the restore from Acronis Cloud, and it works fine. Acronis has been in business for many years, so I don’t expect them to disappear next year. The likelihood of me losing access to these backups is therefore fairly low. I think I may just stick with them for the time being, and hope that someone in the Acronis engineering or product management teams reads this feedback, thinks about that speed issue, and maybe starts considering asking support people to refrain from engaging with other engineers on Twitter armed with fairly ridiculous scripts.

But to paraphrase a recent video by Techmoan: these are the kind of imperfections (particularly the mis-detected “BIOS booting” and the phantom warning) that I could excuse in a £50 software package, but that are much harder to excuse in a £150/yr subscription!

Any suggestions for good alternatives would be welcome, particularly before next year, when I’ll reconsider whether this was good enough for me, or whether a new service is needed. Suggestions that involve scripts, NAS, rclone, task scheduling, or self-hosted software will be marked as spam.

MSI X299 SLI PLUS problems and solutions

Last year, I posted about an issue with BitLocker and PIN authentication going missing on my replacement Gamestation build. While it does not look like it was a particularly popular post, I did confirm that at least a couple of people managed to get good use out of it.

As usual, my Twitter feed contains spoilers for this blog post, as I have ranted, complained, and asked questions (mostly to Jo) trying to figure out my Windows problems. The reason I’m writing this down is, as usual, as a reference for myself, so I don’t repeat the same mistakes over and over again, and as a reference for others, since searching for one of the error codes I’m going to talk about turns up almost exclusively scammy “PC fixing” websites. And yes, I know that I keep using the word BIOS even though this is clearly a UEFI board, but MSI calls it that, and to be honest, for most non-technical folks the difference between the two terms doesn’t exist.

All long help threads should have a sticky globally-editable post at the top saying ‘DEAR PEOPLE FROM THE FUTURE: Here’s what we’ve figured out so far …’

First of all, as noted in the previous post, it looks like nearly all of the settings in the BIOS are lost on any upgrade of the firmware. This is particularly annoying given that a lot of the updates appear to be early-boot microcode updates to cover the increasing complexity of mitigating Spectre-style vulnerabilities, which reasonably shouldn’t need to change the semantics or format of settings such as Secure Boot, TPM settings, or smart fan configuration.

So make sure to take good screenshots of all your settings before updating your firmware, as otherwise you’ll fight for hours trying to reconfigure them as you had them before.

Your computer is not resuming from sleep when you press the power button. This appears to be common: I’ve found a bunch of forum posts by people complaining about this behaviour on a number of MSI motherboards. Most of them are in the form of DenverCoder9, although with a little more detail: people claiming they solved the issue by either downgrading or upgrading the motherboard’s BIOS. Not wanting to downgrade my BIOS, and having just upgraded it, I wanted to find a better answer, and it turns out I probably did find it. Here’s the solution: disable the GO2BIOS feature.

Some more details, which can be useful for others in the future if they encounter similar issues and the solution I’m providing does not help them. The GO2BIOS feature by MSI is a shortcut to enter the BIOS configuration screen without using the keyboard, and it’s particularly handy once you enable all the fast-boot options, as the keyboard might not respond at all that early. To force entering the BIOS configuration, you just need to keep the power button pressed for four seconds when you turn on the computer. That’s what clued me in to the connection between the setting and the failure to resume: they both relate to the power button.

The reason why downgrading or upgrading the BIOS appeared to solve the issue is the one I noted above: all firmware updates on these boards appear to completely reset the settings to defaults, and the GO2BIOS feature is not enabled by default (and probably few people would think of re-enabling it in the hurry).

Windows 10 bluescreens with WHEA_UNCORRECTABLE_ERROR. This one is trickier, mostly because all of the search hits for this particular code appear to point at very dodgy websites, and the only hit I could find on the Microsoft website was a forum post suggesting that the particular code I was seeing was related to AMD CPUs. Since my machine is an i7, that made no sense whatsoever.

The WHEA in the name stands for Windows Hardware Error Architecture, which suggested that the bluescreen was caused by something like a Machine-Check Exception. This was particularly scary because it started happening right after I installed a new NVMe SSD, which appeared to get very warm, leading me to first install two more fans, and then replace the original fans with PWM ones.
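As an aside: WHEA does log hardware errors in the System event log, which is a much better starting point than the dodgy websites. A minimal sketch to pull recent entries, assuming the standard Microsoft-Windows-WHEA-Logger provider name (best run from an elevated prompt):

```python
# whea_events.py — dump the most recent WHEA hardware-error events using
# Windows' built-in wevtutil; the provider name is the standard one WHEA
# uses, but treat it as an assumption if your setup differs.
import subprocess

query = "*[System[Provider[@Name='Microsoft-Windows-WHEA-Logger']]]"
subprocess.run(
    ["wevtutil", "qe", "System", f"/q:{query}",
     "/c:10", "/rd:true", "/f:text"],
    check=True,
)
```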

During this “ordeal” I had also been installing and updating quite a few pieces of software related to the CPU, the motherboard, the Kraken cooler, and so on. And since I had just updated the BIOS, I had also been tweaking a lot of parameters, including re-enabling the auto-overclock feature that, as I discussed previously, appears to be implemented mostly in firmware.

Eventually, I found that uninstalling MSI’s Control Center software solved the problem. I had already disabled the OC assistant previously, but even with that I kept receiving random bluescreens when browsing websites, or just opening Lightroom. Since I uninstalled the Control Center software I have not experienced a single one in a few days. And that includes a “torture test” with Prime95 that brought the CPU to 100°C and into thermal throttling.

I’m not sure what the root cause is. I can only imagine there’s some strange interaction between the firmware and the software that was not quite well tested. Or maybe a recent Windows 10 update caused Control Center to fight for resources. But whatever the reason, it seems the right thing to do was to remove MSI’s software, which does not really do anything you can’t do in the BIOS configuration screen anyway.

I hope this post can find its way to those looking for answers for these (or similar enough) issues. And if you find that there are other possible causes for this, feel free to leave a comment on the post.

Windows 10: what to do if BitLocker and PIN stop working after update

I don’t really like the idea of having to write about proprietary software here, but I only found terrible alternative suggestions on the web, so I thought I would at least try to write this down in the hope of keeping people from falling for very bad advice.

The problem: after updating my BIOS, BitLocker asks for the recovery key at boot, and PIN login for Windows 10 (Microsoft Account) fails, requiring a login with the full Microsoft account password. Trying to re-enable the PIN fails with the error message “Sorry, there was a problem signing you in”.

The amount of misleading information I found on the Internet is astonishing, including a phrase from what appeared to be a Microsoft support person, stating «The operating system should always go with the BIOS. If the BIOS is freshly updated, then the OS has to be fresh as well.» Facepalms all over the place.

The solution (for me): go back into the BIOS and re-enable the TPM (“Security Module”).
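To check that the fix took, the manage-bde tool that ships with Windows will show whether BitLocker sees its TPM protectors again. A quick sketch, run from an elevated prompt (the drive letter is an assumption):

```python
# bitlocker_check.py — sanity check after re-enabling the TPM: list the
# BitLocker status and key protectors; "TPM And PIN" should show up again.
import subprocess

subprocess.run(["manage-bde", "-status"], check=True)
subprocess.run(["manage-bde", "-protectors", "-get", "C:"], check=True)
```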

Some background is needed. The 2017 Gamestation I’m using nowadays is built around an MSI X299 SLI PLUS with a plug-in TPM, which is a requirement to use BitLocker (and if you think that makes it very safe, think again).

I had just updated the firmware of the motherboard (which at this point we all still call “BIOS” despite it being clearly UEFI-based), and it turns out that MSI just silently drops most of the customizations to the BIOS settings after an update. In particular this disabled a few required settings, including the TPM itself (and Secure Boot — I wonder if Matthew Garrett would have some words about the implementation of it on this particular board at this point).

I see reports on this for MSI and Gigabyte boards alike, so I can assume that Gigabyte does the same, and requires re-enabling the TPM in the settings when updating the BIOS version.

I would probably say that MSI’s firmware engineering does not quite fully convince me. It’s not just the dropping of all the settings on update (which I still find is a bad thing to do to users), but also the fact that one of the settings is explicitly marked as “crypto miner mode” — I’m sure looking forward to the time after the bubble bursts, so that we don’t have to pay a premium for decent hardware just because someone thinks they can make money from nothing. Oh well.