MSI X299 SLI PLUS problems and solutions

Last year, I posted about an issue with missing BitLocker and PIN authentication with my replacement Gamestation build. While it does not look like this is a particularly popular post, I did confirm that at least a couple of people managed to get good use out of that blog post.

As usual, my Twitter feed contains spoilers for this blog post, as I have ranted, complained, and asked questions (mostly to Jo) while trying to figure out my Windows problems. I’m writing this down, as usual, as a reference for myself, so I don’t repeat the same mistakes over and over again, and as a reference for others, since searching for one of the error codes I’m going to talk about turns up almost exclusively scammy “PC fixing” websites. And yes, I know that I keep using the word BIOS below while this is clearly a UEFI board, but that is what MSI calls it, and to be honest for most non-technical folks the difference between the two terms doesn’t exist.

All long help threads should have a sticky globally-editable post at the top saying ‘DEAR PEOPLE FROM THE FUTURE: Here’s what we’ve figured out so far …’

First of all, as noted in the previous post, it looks like nearly all of the BIOS settings are lost on every firmware upgrade. This is particularly annoying because a lot of the updates appear to be early-boot microcode updates to cover the increasing complexity of mitigating Spectre-style vulnerabilities, and those reasonably shouldn’t need to change the semantics or format of settings such as Secure Boot, TPM settings, or smart fan configuration.

So make sure to take good screenshots of all your settings before updating your firmware, as otherwise you’ll fight for hours trying to reconfigure it as you had it before.
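If you would rather have something you can diff than a folder of screenshots, a minimal sketch of the idea follows. It assumes you transcribe the settings by hand into two dictionaries, one before and one after the update; the setting names and values below are hypothetical placeholders, not a dump of what the MSI firmware actually exposes.

```python
from typing import Dict, Tuple


def changed_settings(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return every setting whose value differs, mapped to (before, after)."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, "<not recorded>")
        new = after.get(name, "<not recorded>")
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical values, transcribed by hand from the BIOS screens.
before_update = {
    "Security Device Support": "Enabled",
    "Secure Boot": "Enabled",
    "GO2BIOS": "Disabled",
    "CPU Fan Mode": "PWM",
}
after_update = {
    "Security Device Support": "Disabled",
    "Secure Boot": "Disabled",
    "GO2BIOS": "Disabled",
    "CPU Fan Mode": "Auto",
}

# Print only the settings that the update changed.
for name, (old, new) in changed_settings(before_update, after_update).items():
    print(f"{name}: {old} -> {new}")
```

Settings that survived the update are left out of the result, so what gets printed is exactly the list of things to go and fix.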

Your computer is not resuming from sleep when you press the power button. This appears to be common: I’ve found a bunch of forum posts by people complaining about this behaviour on a number of MSI motherboards. Most of them appear to be in the style of DenverCoder9, although with a little more detail: people claiming they solved the issue by either downgrading or upgrading the motherboard’s BIOS. Not wanting to downgrade my BIOS, and having just upgraded it, I wanted to find a better answer, and it turns out I probably did. Here’s the solution: disable the GO2BIOS feature.

Some more details, which can be useful for others in the future if they encounter similar issues and the solution I’m providing does not help them. The GO2BIOS feature by MSI is a shortcut to enter the BIOS configuration screen without using the keyboard, and it’s particularly handy once you enable all the fast-boot options, as the keyboard might not respond at all. To force entering the BIOS configuration, you just need to keep the power button pressed for four seconds when you turn on the computer. That’s what clued me in to the connection between the setting and the failure to resume, as they both relate to the power button.

The reason why downgrading or upgrading the BIOS appeared to solve the issue is the one I noted above: all firmware updates on these boards appear to completely reset the settings to defaults, and the GO2BIOS feature is not enabled by default (and probably few people would think of re-enabling it in a hurry).

Windows 10 bluescreens with WHEA_UNCORRECTABLE_ERROR. This is trickier, mostly because all of the search hits for this particular code appear to point at very dodgy websites, and the only hit I could find on the Microsoft website was a forum post suggesting that the particular code I was seeing was related to AMD CPUs. Since my machine is an i7, that made no sense whatsoever.

The WHEA in the name stands for Windows Hardware Error Architecture, which suggests that the bluescreen is caused by something like a Machine-Check Exception. This was particularly scary because it started happening right after I installed a new NVMe SSD, which appeared to get very warm, leading me to first install two more fans, and then replace the original fans with PWM ones.

During this “ordeal” I had also been installing and updating quite a few pieces of software related to the CPU, the motherboard, the Kraken cooler, and so on. And since I had just updated the BIOS, I had also been tweaking a lot of parameters, including trying to re-enable the auto-overclock feature that, as I discussed previously, appears to be implemented mostly in firmware.

Eventually, I found that I solved the problem by uninstalling MSI’s Control Center software. I had already disabled the OC assistant, but even with that I kept getting random blue screens when browsing websites, or just opening Lightroom. Since I uninstalled the Control Center software I have not experienced a single one for a few days. And that includes a “torture test” with Prime95 that brought the CPU to 100 °C and into thermal throttling.

I’m not sure what the root cause for this is. I can only imagine that there’s some strange interaction between the firmware and the software that was not quite well tested. Or maybe there’s a new update on Windows 10 that caused Control Center to fight for resources. But whatever the reason it seems the right thing to do was to remove MSI’s software, which anyway does not really do anything you can’t do in the BIOS configuration screen.

I hope this post can find its way to those looking for answers for these (or similar enough) issues. And if you find that there are other possible causes for this, feel free to leave a comment on the post.

Windows 10: what to do if BitLocker and PIN stop working after update

I don’t really like the idea of having to write about proprietary software here, but I only found terrible alternative suggestions on the web, so I thought I would at least write it down in the hope of keeping people from falling for very bad advice.

The problem: after updating my BIOS, BitLocker asks for the recovery key at boot, and PIN login for Windows 10 (Microsoft Account) fails, requiring me to log in with the full Microsoft account password. Trying to re-enable the PIN fails with the error message “Sorry, there was a problem signing you in”.

The amount of misleading information I found on the Internet is astonishing, including a phrase from what appeared to be a Microsoft support person, stating «The operating system should always go with the BIOS. If the BIOS is freshly updated, then the OS has to be fresh as well.» Facepalms all over the place.

The solution (for me): go back in the BIOS and re-enable the TPM (“Security Module”).

Some background is needed. The 2017 Gamestation I’m using nowadays is built around an MSI X299 SLI PLUS with a plug-in TPM, which is a requirement to use BitLocker (and if you think that makes it very safe, think again).

I had just updated the firmware of the motherboard (which at this point we all still call “BIOS” despite it being clearly UEFI-based), and it turns out that MSI just silently drops most of the customization of the BIOS settings after an update. In particular this disabled a few required settings, including the TPM itself and Secure Boot (I wonder if Matthew Garrett would have some words about its implementation on this particular board at this point).

I see reports on this for MSI and Gigabyte boards alike, so I can assume that Gigabyte does the same, and requires re-enabling the TPM in the settings when updating the BIOS version.
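To confirm from within Windows that the TPM is really back after re-enabling it, something like the sketch below can help. It is only a sketch under assumptions: it shells out to PowerShell’s Get-Tpm cmdlet (which needs an elevated prompt) and relies on the TpmPresent and TpmReady properties that cmdlet reports. The tpm_problems and read_tpm_status helpers are names made up for this example, not part of any MSI or Microsoft tool.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List


def tpm_problems(status: Dict[str, Any]) -> List[str]:
    """Turn the fields reported by Get-Tpm into a list of human-readable problems."""
    problems = []
    if not status.get("TpmPresent"):
        problems.append("no TPM visible to Windows (check the firmware setting)")
    elif not status.get("TpmReady"):
        problems.append("TPM is present but not ready for use")
    return problems


def read_tpm_status() -> Dict[str, Any]:
    """Run Get-Tpm in PowerShell and parse its JSON output (Windows, elevated)."""
    completed = subprocess.run(
        ["powershell", "-NoProfile", "-Command", "Get-Tpm | ConvertTo-Json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# Only attempt the real query on Windows; elsewhere the helpers can still be imported.
if __name__ == "__main__" and sys.platform == "win32":
    for line in tpm_problems(read_tpm_status()) or ["TPM looks usable"]:
        print(line)
```

If the first message is printed right after a firmware update, the “Security Module” setting is the first place to look.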

I would say that the MSI firmware engineering does not fully convince me. It’s not just the dropping of all the settings on update (which I still find is a bad thing to do to users), but also the fact that one of the settings is explicitly marked as “crypto miner mode”. I’m sure looking forward to the time after the bubble bursts, so that we don’t have to pay a premium for decent hardware just because someone thinks they can make money from nothing. Oh well.