Reverse Engineering an LG Aircon Control Panel — Fit It All Together

Since we moved to this flat, one of my wishlist items was to have a way to control the HVAC (heating, ventilation, and air conditioning) without having to get out of bed, or go to a different room. A few months ago, I started on a relatively long journey of reverse engineering the protocol that the panel used so that I could build my own controller, and I’ve been documenting pretty much every step along the way, either on Twitter, this blog, or Twitch. As I type this, while I can’t say that the project is all done and dusted, I’ve at least managed to get to the point where I’m reaping the benefits of this journey.

You may remember that at the end of the previous chapter in this saga I was looking at controlling the actual HVAC with a Python command line tool. I built that on stream, to begin with, and while I didn’t manage to make it work the way I was hoping (with a curses-based UI), I did manage to get something that worked to emulate both the panel and (to a lesser extent) the HVAC itself.

I actually used this “in production” (that is, to control the air conditioning in my home office) for a little over a week. The way I was using it was through a BeagleBone Black which I had lying around for a long while – I regret not signing up for the RISC-V preview, but honestly I wouldn’t have had time – connected through a USB-to-UART adapter (a CH340 based one, because they can actually do 104 bps!), and a breadboarded TLIN1206 on a SOP-8-to-DIP adapter. A haphazard and janky setup if you’d ever seen one, but it was controllable by phone… with JuiceSSH and Tailscale.

While absolutely not ergonomic, this setup still allowed me to gain the experience needed with the protocol, in order to send the PCB design to manufacture. I did that about a week in, and as usual, JLCPCB’s turnaround is phenomenally fast, and by the following Monday I had my boards.

The design is pretty much the same as the one I spoke about in part two, with two bus transceiver blocks, a 3.3V DC-DC converter, and the quad-NAND gate that makes sure the code cannot send data to the bus if the physical switch is turned to “disable”.

While the design has proven to need a little bit more work to be optimal, it’s a great starting point because it just works fine. And while I did originally plan to have this support the original panel with the switch turning on “pass-through mode”, I decided that for the moment, this is not needed, nor desired.

My original intention was to allow the physical switch to just prevent the custom controller from talking to the HVAC engine, and let the original panel take over. Unfortunately this does not work: the panel is not entirely stateful, but there is a bit in the command packet that says “Listen to me, I’m changing configuration” — and if the configuration in the panel and the HVAC don’t match when that bit is not set, an error state is introduced.

This basically means I wouldn’t be able to seamlessly switch between the old panel and my custom controller. Instead, what I’m going to do if I need to is to first flip the switch to disable the custom logic, and then connect the panel to the secondary bus. That way it would initialize directly on the bus. If I do need to re-design this board, I’m going to make the switch more useful, and add a power disconnect so that everything can stay connected without any power on at all.

There was another reason why I originally planned to support the original control panel: it reports a temperature. I thought that I would be able to use that temperature as a “default” sensor, while still allowing me to change the source of the temperature at the time of configuration. But the panel’s temperature sensor is pretty much terrible, and it’s only able to measure 16°C~30°C, plus it’s easily fooled by… a lamp. So not exactly something you can call reliable for this use, and not something I would care to add to my monitoring.

To my own astonishment, my first attempt at soldering the full board was successful — two out of three of the boards work like a charm; the third one is a bit iffy, but it might just be a less-than-perfect component in it. I’ll have to see, but I also don’t want to do a new respin now that even some of the components I need are getting harder (and more expensive) to find. Take the JST ZR connector: in the picture above you can see it’s white rather than cream — that’s because it’s not an official JST part but an AliExpress clone that fits the same footprint, and that I could actually buy.

Custom ESPHome Climate

Once I had the board, I rigged up a quick test bed on my desk with a breadboard and another CH340 adapter (I have a few around, they are fairly cheap and fairly versatile), and started off to complete the Custom Climate component.

Since I started this project, the ESPHome Climate API actually changed a couple of times, particularly as Nabu Casa is now sponsoring the project, and its development moved to a monthly release cadence. Somehow the Climate components were among those that required the most work.

But the end result is that the API was clearer, and actually easier to implement, by the time I had this ready. So I wrote the code to generate the six-byte packets I needed to send, and ran it against the emulator… and it seemed to work fine. I had wired this up to a test component in Home Assistant and I could easily change the mode, the temperature, and everything else, and I could see in my emulator that it was receiving the right data.

At that point I was electrified, and I thought it would just be a matter of putting it to the wall and seeing it work. You can imagine my disappointment when I called in my wife to assist me in my victory… and it didn’t work. And I was ready to detach all of it to spend the next day debugging what was going on, until I realized that I had forgotten to send the “settings changed” flag. I have to say that the protocol turns out to be a bit more complex than I expected it to be, and I should probably write a bit more documentation about it, not just scatter it around the (now multiple) implementations.
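To give an idea of the shape of the thing, here is a minimal Python sketch of the framing. Note that the byte layout, the flag position, and the checksum below are made-up placeholders, not the actual LG protocol fields; the real details live in the implementations I mentioned.

```python
# Illustrative only: field positions, the flag bit, and the checksum formula
# are placeholders, not the real LG protocol layout.
def checksum(payload: bytes) -> int:
    return sum(payload) & 0xFF  # placeholder, not the actual LG checksum


def build_command(mode: int, target_temp_c: int, settings_changed: bool) -> bytes:
    flags = 0x01 if settings_changed else 0x00  # hypothetical flag position
    payload = bytes([flags, mode, target_temp_c, 0x00, 0x00])  # five payload bytes
    return payload + bytes([checksum(payload)])  # sixth byte is the checksum


# Forgetting settings_changed=True reproduces exactly the bug above: the packet
# is well formed, but the HVAC never applies the new configuration.
packet = build_command(mode=0x01, target_temp_c=21, settings_changed=True)
```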

After that, I actually went ahead and replaced all three of the control panels with my custom ones, and connected them to Home Assistant. That turned out to be a much easier prospect than anything else we had tried up to now: we can decide which temperature reading to use to control the room, rather than being stuck with the silly temperature sensor in the panel, and we can use the two-point set temperature the way most modern smart thermostats can (which the panel didn’t support, despite having an “auto mode” that would turn on cooling or heating “as needed”).

The first couple of days led to a few adjustments being necessary — including implementing a feature that my wife requested: when not cooling or heating, the original panel would enter “fan only” mode. Which I enjoy for myself in the office, but bothers my wife. The original panel does not have an option to turn the fan off — but I could implement that in the custom controller. This allows us to keep our thermostat on the heat-cool mode most of the time, and just make sure the range is what we actually care for.

I also made the mistake at first of not accounting for hysteresis. That turned out to be a bit more annoying to implement, not so much in terms of code, but in terms of the logic behind it — but it should now be working: it means that there is more friction to changing the state of the air conditioning, which means the temperature is not as constant, but it should be significantly easier to run. To be honest I was impressed by how stable the temperature was when I left it to short cycle…
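For reference, the logic boils down to something like the following minimal sketch: a two-point thermostat with a dead band around each set point. This is an illustration in Python with arbitrary names, not the actual ESPHome component code.

```python
def next_action(current: str, temp: float, low: float, high: float,
                band: float = 0.5) -> str:
    """Return 'heat', 'cool' or 'idle'; `band` is the hysteresis margin in °C."""
    if current == "heat":
        # Keep heating until we overshoot the low set point by the band.
        return "heat" if temp < low + band else "idle"
    if current == "cool":
        # Keep cooling until we undershoot the high set point by the band.
        return "cool" if temp > high - band else "idle"
    # When idle, only start once the temperature crosses a set point by the band.
    if temp < low - band:
        return "heat"
    if temp > high + band:
        return "cool"
    return "idle"
```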

Home Assistant Integration

This was probably the simplest part of the whole work! Nabu Casa is doing an awesome job at keeping the two projects integrated very well, and with the help of “packages” configuration, replicating the configuration for the three separate boards took basically no time.

The only problem I had was that I couldn’t seem to flash the first ESPHome firmware onto my ESP32 devkits using the WebSerial support. I have used it multiple times in the past, particularly to update my BLE Bridge, which I only need to connect to my main workstation (rather than its standalone power supply) to upgrade, but for the pristine ESP32 devkits it didn’t work out quite as well.

The UI is very similar to the one that Google Home exposes for Nest thermostats where they do support air conditioning. And indeed, with the addition of the Home Assistant Cloud service, the same UI shows up for these thermostats.

And at that point it was just a matter of configuring the expected range of operation, both for the “daytime” and for the “night scene”. Which is one of the reasons why I wanted to have thermostats that we can control with Home Assistant.

You see, some time ago I set things up so that we have a routine phrase (“To the bedroom”) and Flic buttons in both the living room and the bedroom, that prepare us to get to bed: turn off the TV, subwoofer, air fresheners, lights everywhere except bedroom and bathroom, set the volume of the bedroom speaker for the relaxing night sound, and so on.

Recently, thanks to a new Dyson integration, it has also been setting the humidifier to raise the humidity in the bedroom (it gets way too dry!), and turning on the night mode on the other purifiers, which has been a great way not just to make it harder to forget things, but also to save us from leaving the humidifier running 24/7: it’s easier for us to keep it running overnight at a higher humidity, than trying to keep it up during the whole day.

Now, with the climate controls in place, we can also change the temperatures before going to bed, rather than turning everything off, which is the only option we had before. And this is a big deal, because particularly for the living room, we don’t want it to get too scorching hot even if we’re not there: it’s where the food cupboards are, among other things, and during the heatwave we exceeded 30°C for a couple of hours every other day. Being able to select different ranges while we’re still sleeping gives us a bit more safety, without having to keep running the air conditioning overnight.

Features And Remote Control

Similarly, our scheduled morning routine, configured to go off together with the alarm clock, can scale back the range in my home office to something suitable for my work day (it can get warm fast with computers and stuff running, and the door closed for meetings), so I don’t have to rush to cool down the office in the morning: it starts automatically, and it welcomes me with a normal, working temperature for my first meeting.

The final point is that, since we can actually set a very wide range on the thermostats, and rely on much more accurate thermometers that are not restricted to the 16°C to 30°C range, we can leave the thermostat on, with a very wide range, when we’re not actually going to be at home. This is particularly interesting now that we might be able to travel again, at least to see family and friends — we don’t want to leave the air conditioning or heating running all the time, but we also want to have some safeguards against the temperature dropping or rising out of control. This became very clear when the CGG1 in our winter garden hit the 50°C ceiling it could report while we were out playing Pokémon Go and we couldn’t turn on the air conditioning at all — thankfully the worst result of that was that the body of one of the fake candles we have in the winter garden for ambiance melted… and now it looks even more realistic as a candle (it also damaged the moving flame part, but who cares.)

Now, the insulation of this flat is not great. In particular the home office tends to drop down close to freezing temperatures in the winter, because the window does not seal even when closed. This means I can’t easily follow Alec’s suggestion on energy storage. But having a bit more control over automation does make it easier to keep the temperature in the flat more stable. In the winter, I expect we’ll make sure that we keep a minimum temperature overnight to avoid having to force a much higher differential when we wake up, for instance.

There are a few features that I have not yet implemented, and that I definitely should look into implementing soon. To start with, as soon as summer gives way to autumn, we’re likely going to want to use the dehumidifier more — without turning on the heat pump, particularly in the living room as we cook and eat. Since the CGG1 provides humidity readings together with temperature, I can set up an automation that, if nothing else is running, turns the dehumidifier on if the humidity reaches a certain set point, for instance.

There are also two switches that I have not implemented yet, but probably should, soon. One turns on resistive heating – and this will make sense again if you watch Alec’s video on heat pumps – while the other has to do with the Plasma Filter.

What’s a plasma filter? That’s a good question, and one that I’m not sure I have the right answer for. I know that this is something that the original control panel suggests is present in our HVAC, although I have no way to know for sure (we don’t have the manual of the actual engines). The manual for the PQRCUDS0 says that «you can use the plasma function» but also states «If the product is not compatible with the Plasma function, it will not do the Plasma function even though the indicator is turned on.» This suggests that unlike other features like the swirl/swing, it’s not part of the feature query that the panel sends at turn on.

When googling further to look for information about LG’s plasma filter, I did find another manual, for an actual unit, rather than the control panel. Not the unit we’ve got, I think, but at least a unit. And this one has a description for the plasma functionality:

Plasma filter is a technology developed by LG to get rid of microscopic contaminants in the intake air by generating a plasma of high charge electrons. This plasma kills and destroys the contaminants completely to provide clean and hygienic air.

This is quite interesting — and next to this, a video from LG refers to this as a “Plasma/Ionizer”, which pretty much suggested to me that this is one of Big Clive’s favourite toys: ozone generators. Which makes sense given that one of his favourites is a Sharp Plasmacluster.

Code And Next Steps

First of all, the code and the board designs are all available on my GitHub. I originally considered making this a normal component for ESPHome, but since it relies on a very custom board, it doesn’t feel like the right thing to do. It does mean I need to manually keep track of all the various changes in APIs, but hopefully that will not take too much of my time.

As I said previously, I have not actually implemented the panel bus handler — the panel will enter into error mode if it does not get an expected reply from the HVAC engine, so connecting it right now would not work at all, except if you were to disable the actual ESP32 control. I’m likely going to leverage that behaviour to test some more error handling in the future.

I would like to put a box around the board — right now it’s literally stuck to the wall with some adhesive feet, in all three rooms. And while the fixed red LED is not too annoying overnight, it is noticeable if you wake up in the middle of the night. My original idea was to find someone who can help me 3D print a box that fits on the same posts the panel fits on, and provides a similar set of posts for the original panel. But it also involved me finding a way to flip the switch without taking it off the wall.

But since figuring out 3D printing with no experience is going to take a lot of investment, I am not going to take a look at that option until I’m absolutely certain I’m not changing anything in the design. And I know I would want to change the design a little bit.

First of all, I want to have a physical cut-off for the connection — since the power to run the ESP32 module comes from the same cable controlling the HVAC, right now the only way to turn off the power is to disconnect the cable. Having a physical switch that just disconnects the power and data would make it easier to, say, replace the devkit module.

Similarly, as I said, the panel is not that useful to keep running all the time. So instead I would change the switch implementation to keep the panel off most of the time, and only power it on when disabling the ESP32. This would also save some components, since there would be no need for the second bus transceiver and its passives.

I’m also wondering if it would make sense to have some physical feedback and access, in addition to the Home Assistant integration and the voice assistant controls: in particular I’m considering having an RGB LED on the board to tell the current action being taken (optionally, I wouldn’t want to have that in the bedroom, as it would be way too bright) together with a button to at least turn the HVAC “soft off”.

Finally, there are a couple of optimizations that could be done to make the board a bit cheaper. One of the capacitors is ceramic, but could be replaced with a polymer one for ⅓ of the price, and the TVS diodes pair (which were actually a legacy part of the design, recommended by the MCP2021 datasheet, but not by the reference designs for the TLIN1027) could be replaced with a single integrated TVS diode — it would just be “a bit” harder to solder by hand in a TO-236 package.

These are all minor though — the main cost behind the board is actually the ESP32-DEVKIT-32D that it’s designed around. It would be much cheaper to use only the ESP32 WROOM module, and not have the USB support components on the board. But I have had bad experiences with trying to integrate that in my designs, so I’m feeling a bit sceptical of going down that route — it also would mean that a botched board sacrifices the whole module (I did sacrifice two or three of those already) unless you get very good at desoldering them (or you have a desoldering station).

So most likely it will take me a few months before I actually get to the point of trying to build a 3D printed cover for it. With a bit of luck, by then we’ll be back in an office at least part of the time, and I can get someone to teach me how to use the 3D printer there.

Also as a final note — the final BOM for the boards suggests that building one of them costs around £25 or so, without the case. As I said there are a few cost-saving measures that I could take for the next round — though it’s questionable, because it would require me to get more components I don’t currently have. Of course the actual cost of building three of them turned out to be… significantly higher.

I think this is the part that is sometimes hard to explain to people who have not had this type of experience: the BOM costs are only one of the problems you need to solve — you can really screw up a project by choosing the wrong components and bringing the BOM price too high… but a low BOM cost does not make for a cheap project to finish, particularly when you’re developing it from scratch.

In this case, if I wanted to tally up the cost of building these custom thermostat panels, I would have to, at the very least, count the multiple orders from DigiKey from which I ordered the wrong components (like the two failed attempts at using Microchip’s LIN bus transceivers rather than the TLIN, or the discrete UARTs with their own set of passives). But then there’s the cost of having all of the various tools that were needed to get this all done. Thankfully most of those tools have been used (and sometimes abused) for different projects, but they are a good metaphor for the cost of R&D that many products need — so it makes sense that what you end up buying costs more than the “simple” expense of the BOM. So keep that in mind next time you see an open source hardware device costing more than what you expect it to.

Reverse Engineering an LG Aircon Control Panel — Low-Speed Serial Issues

In the previous part of the LG aircon reverse engineering, I gave the impression that the circuitry design was a problem mostly solved. After all, I had found working LIN bus transceivers, and I had figured out a way to keep one of them “quiet” when not in use. And indeed, when I started drafting the post, it was supposed to be followed by a description of the protocol I identified, and should have talked about how I wrote tools to simulate this protocol to test the implementation without actually touching the HVAC unit.

And that’s what I set out to do myself on a live stream, nearly two months ago now — the video says “Attempt 2” because on the first, short stream I ended up showing my home wifi password on camera and, even though it’s not likely that a bad actor would show up at my place to take over the wifi, it’s not good opsec, so I stopped the stream, rotated the password on all the devices at home, and came back to the stream.

So what happened? Well, I was trying to figure out how to build an ESPHome custom component for the “climate” platform, but sending sequences of bytes through the serial port appeared to not work correctly: instead of being sent at the selected polling frequency, they would be “bunched up” together, sending three bytes instead of one, or twelve instead of six. It worked “fine” if I flushed the serial port, but the flush operation would then take longer than the time between the commands I wanted to send, so that didn’t sound like a good plan.

As you can imagine from the title, this particular problem only happened with the slow, 104 8n1 configuration that the LG aircon needs — it didn’t happen at all with higher baudrates such as 9600, which suggested the problem was related to the timing of the connection, which is not uncommon: a lot of UART implementations that include FIFOs tend to define some timing based on the timing of a “space” or of a full character.

What also suggested that to me is that someone, somewhere, was complaining that the ESP32 couldn’t do the slow speed that this aircon needs, and that they preferred using the ESP8266 because that one came with a software serial implementation. Unfortunately, I cannot find anymore where I read that, to link it here and to point out that the code for the ESP8266 software serial actually works without significant modifications on the ESP32 — it’s just that the lack of need for it means it’s not readily available.

So indeed, I managed to get the ESP8266 software serial to work… except for the fact that it was not quite reliable. At 104 bps (which is the speed the aircon protocol needs) sending a six-byte sequence (which is the size of an aircon packet) takes about half a second — add another half second for the response (which is also six bytes), and you have a recipe for disaster: one second every two seconds (which is the frequency of command exchange between panel and HVAC) would be spent just on serial communication — anything else happening during that time and messing up the timing meant bad communication.
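The back-of-the-envelope numbers, for anyone who wants to check my maths (this is just arithmetic on the figures above, nothing specific to my code):

```python
BAUD = 104
BITS_PER_CHAR = 10   # start bit + 8 data bits + stop bit (8n1)
PACKET_BYTES = 6

packet_time = PACKET_BYTES * BITS_PER_CHAR / BAUD  # roughly 0.58 s per packet
exchange_time = 2 * packet_time                    # command + response, ~1.15 s
print(f"{packet_time:.2f}s per packet, {exchange_time:.2f}s per exchange")
```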

Another nearly-software-based alternative I attempted, and also kind-of worked, was using the RMT peripheral. This is the remote control peripheral included in ESP32 — and the reason why Circuit Python made it harder to send pulse trains on FeatherS2: it’s no longer just implemented in software, but it relies on hardware buffers to allow sending and receiving pulse trains. It’s awesome to offload that, but it also comes with limitations. In particular, while I did manage to implement a 104 bps serial transmission through this interface, it would only allow me one serial pair rather than two, severely limiting what I could be doing with the aircon board.

Content Warning: though I’m personally aiming at following more modern conventions, the datasheets and other content that I’m about to link to still use older, less inclusive terminology.

UART — But Discretely

So instead, I used my overcomplication skill to come up with an alternative: discrete UARTs! You see, back in the days when personal computers came with serial ports, and before chipsets started integrating everything into a single physical chip, and even before most of the functionality of basic peripherals was merged into a Super I/O chip, we had multiple competing discrete UART chips available. The most common one being the 16550, at least in my personal experience. These are still available, and you can indeed still use the 16550, although to do so you need to use a lot of I/O lines, as the 16550 and compatibles usually have a parallel I/O interface, which is suitable for an ISA bus connection, but not so suitable for a microcontroller, even one quite generous with its GPIO lines.

As an alternative, I started looking into using a SC16IS741A, which is an I²C (and SPI) chip. Instead of using a lot of separate I/O lines for sending commands and data to it, you send them over the usual two-wire interface, and the chip internally decodes them into whatever format it needs. You may wonder what the difference is, between using the actual UART and sending it over to an I²C hardware UART — the answer is a bit complicated to explain theoretically, but I think a visualization of it will go a long way to explain:

What you see here is a screenshot from the Saleae Logic software capturing three I/O lines: the I²C bus (SCL above, SDA below), and the TX line of the discrete UART. This is a very much not optimized transaction that shows the sending of one byte at the 104 bps configuration that my aircon control needs. Sending the byte takes an order of magnitude longer than setting up the UART and handing it the message, and this is with a relatively slow bus as I²C is. And it doesn’t scale linearly, even.

Basically, the discrete UART allows offloading the whole process of keeping up with the timing, and it does so in a very efficient way. Receiving is even more interesting, because it does not require the microcontroller to pay attention to the received bytes until it’s ready to process them, keeping them in the FIFO in the meantime. But this kind of feature already exists in most microcontrollers (often referred to as a “hardware UART”), and when it works, that’s awesome… but clearly sometimes it doesn’t quite work.

This particular device would be useful on boards based around the older ESP8266 micro, as that only has a single hardware UART, which is used for the logging. With one (or more) of these or similar chips you would be able to control a much larger number of serial-controlled devices, and that makes them valuable.

Unfortunately, ESPHome does not really have a way to provide an uart bus outside of the core platform, at least right now. If I do end up working more down this particular route, I will probably pay more attention to integrating it — it’s not hard to provide the same interface so that it would be a drop-in replacement, but it does require some reshuffling of the core hierarchy, so I don’t think I can pull that off just yet.

Writing a Device Driver, a Component, a Library

In either case, whether you want to integrate this directly in ESPHome, or use it from another software stack, you need to implement the chip’s command set. This is what you usually refer to as a “device driver” for operating systems such as Linux and Windows, as it provides access to the underlying device with some standard interface. And indeed, the Linux kernel has a driver for a set of related NXP UART peripherals: sc16is7xx.

Now, while the NXP-provided datasheet has all the information needed to program for this chip, having the source reference on Linux made things significantly easier, particularly because unless you know what to look for, you most likely will misread the datasheet. I have some (though not much) experience with I²C devices, but there were a few things that ended up confusing me enough that I wasted hours down the wrong route.

The first problem was figuring out the addressing convention. I²C addresses should, by convention, be 7-bit. While the protocol sends a whole byte for addressing, it uses the last bit in it to specify whether you’re issuing a read or a write. But despite this being the common convention, and the one that ESPHome, CircuitPython, and just about anything else expects you to use, some datasheets do not follow that, and provide the full 8-bit “address”. You can see more details on that on the Total Phase website, which was instrumental for me to get to the bottom of why things kept disagreeing with what I was writing.
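A quick way to keep that straight in code (the example address here is just illustrative, not necessarily what the SC16IS741A uses):

```python
def to_7bit(eight_bit_address: int) -> int:
    """Datasheets quoting an 8-bit "address" fold the R/W bit into the LSB."""
    return eight_bit_address >> 1

# e.g. a datasheet quoting 0x9A (write) / 0x9B (read) really means 0x4D to
# ESPHome, CircuitPython, and anything else that expects 7-bit addresses.
assert to_7bit(0x9A) == to_7bit(0x9B) == 0x4D
```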

Once the peripheral was addressed, the question to answer was about the register addresses. In my first attempt at configuring the chip I was falling short. A quick look through the Linux sources told me that I was missing a left shift of the register address… which made me go “Huh?” for a while. Indeed, the datasheet provides explicit tables explaining the register addressing: registers have a 4-bit address, but it is placed in bits 3:6 of the sub-address byte, with bit 7 (MSB) being used in SPI only to select between reading and writing the register. Of the remaining three bits, two are used to select between channels — because some of the matching chips by NXP include more than one UART on board, though unfortunately I couldn’t find a multi-UART one that I could easily work with on a breadboard. The last one (LSB)… is not used. It’s always off and reserved. But more interestingly, the Linux driver only shifts the address by two, not by three like I had to. So I’m wondering if this means that this chip is only mostly compatible with the ones I was looking at.
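Put in code, my reading of the datasheet amounts to the following (a sketch, to be double-checked against the exact chip you have, given that the Linux driver shifts by two instead of three):

```python
def register_byte(reg: int, channel: int = 0) -> int:
    # bits 6:3 = 4-bit register address, bits 2:1 = channel, bit 0 unused
    return ((reg & 0x0F) << 3) | ((channel & 0x03) << 1)

LCR = 0x03  # Line Control Register, same numbering as the 16550 register map
assert register_byte(LCR) == 0x18
```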

So after one full day of figuring out how to properly run my component over ESPHome, I decided I needed a faster way to prototype this I²C device.

Enter Circuit Python

By now you probably know that I do like Circuit Python. And as it turns out, I had already written some code for I²C in Circuit Python when I extended the MCP230xx library to include the older ’16 model. So it didn’t feel too odd to go ahead and use a Trinket M0 with Circuit Python to play around with the UART.

The choice of the Trinket M0 over the more capable Feathers was not random: while the Trinket has a physical UART and the pins to use it, it’s also a very tiny device. The fact that you can use multiple physical UARTs through the I²C bus allows a significant expansion of the I/O abilities of that class of microcontrollers.

In the end, I not only wrote a CircuitPython-compatible library that allowed me to use the UART, but also re-wrote it to leverage the Adafruit_CircuitPython_Register library, making it significantly easier to add support for more features.

The library supports an interface that is nearly identical to the one provided by the built-in serial, although I don’t think there’s a way to make sure it really is, because similarly to ESPHome it doesn’t look like Circuit Python ever considered the need to support UARTs that are not part of the original hardware design, understandably so, as these discrete UARTs are still fairly uncommon.

But I went one step further: when I read the datasheet the first time I wasn’t sure just how strong the suggestion for 1.8432 MHz crystals was, for the divisor. Turns out it’s not strong at all, so the whole batch of crystals I bought at that frequency is not particularly helpful. Worse yet, it turns out I don’t need any crystal at all, because even the Trinket M0 is able to create a 50% duty cycle PWM output at a frequency that is high enough to use as the driving clock for the UART.
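Something along these lines, assuming the UART’s clock input is wired to a PWM-capable pin (D4 here is an arbitrary choice, and so is the 1 MHz frequency):

```python
import board
import pwmio

CLOCK_HZ = 1_000_000
# ~50% duty cycle square wave, fed to the UART's crystal/clock input.
clock = pwmio.PWMOut(board.D4, frequency=CLOCK_HZ, duty_cycle=2**15)

# 16550-style divisor: baud = clock / (16 * divisor)
divisor = round(CLOCK_HZ / (16 * 104))  # 601, i.e. ~103.99 baud
```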

That means that I can fit, in a half sized breadboard, the whole circuitry I need, including only two passives (the pull-up resistors on the SCL/SDA lines), providing the clock as a PWM output from the Trinket while also piggybacking its reset line. This was a surprisingly good setup, and would actually allow me to control the two sides of my aircon (panel and hvac) if I was still going with the discrete UART idea.

But it turns out I really don’t need any of that: the ESP32’s UARTs worked out just fine in the end — at least in the most recent firmware as uploaded by ESPHome, so I decided to set the discrete UARTs aside again and instead try to control the aircon with a USB-to-UART adapter, at least for now. But that’s a story for another post.

Bonus Chapter: Dual-UART chips

As I said earlier, there are some options out there for multiple UART chips that would be interesting to use for cases like mine, in which I need two independent, yet identically configured UARTs. Dual UART chips are not uncommon, but I²C controlled ones are.

If you look around for I²C Dual-UART options, you most likely will end up on the DFRobot DFR0627 which is an “IIC to Dual UART Module” — IIC being the name you’ll find used on Chinese products to refer to I²C (it’s like TF card instead of SD card, don’t ask me.)

So why did I not even consider this particular option? Well, the first issue is that this is a full module, that uses the Gravity connector (which is similar to, but as far as I know not compatible with, the Stemma QT connector that Adafruit uses); the second issue is that there’s no documentation explaining how to use it.

Since I want to, at the end of the whole process, have a printed board I can just hang on the wall (possibly with a 3D Printed case, but that’s further along the way), I need to be able to get the components I want in retail-enough options that I can buy them and solder them in myself. I also need to be able to control those components with arbitrary software.

The DFRobot modules tend to come with Arduino libraries, which you may still be able to use for ESPHome, but which you wouldn’t be able to use with Circuit Python during the more iterative side of the project. Since these libraries are open source you could go ahead and reverse engineer the module from them, but it would be much easier to develop for something that has a datasheet and some documentation.

Indeed, the DFRobot website does not even tell you what chip is on the module, though if you look around in the forums you can find a reference to WEIKAI WK2132-ISSG, which is available through LCSC and comes with a datasheet. In Chinese.

If you just look at the pictures, you can at least confirm that this device is similar in functionality to what the NXP part I’ve been working with provides, except for the fact that it does not have the full RS-232 style CTS/RTS lines. So it really would be an interesting part if I ever decided to go back to the idea of using discrete UARTs, but it would require at the very least for me to get one of my Chinese-reading friends to translate enough of this 28-page datasheet to be able to tell what to do. That is unlikely.

Extending the Saleae Logic

One of the reasons why I have always appreciated FLOSS is the ability to adapt, modify, and reuse tools and code and designs to build just what you need. Over the years I have done that many, many times, including writing ruby-elf to run my analysis and taking over unpaper to handle scanned documents.

What I had not realized until lately is the usefulness of building your own physical tools. Or rather, I had figured that out by watching Adam Savage videos on YouTube, but I didn’t actually build physical tools until pretty much last year, when I started the Birch Books project.

Now, even though the Saleae Logic Pro is not open source, neither in its hardware nor its software, I was recommended it many years ago, and I think it’s one of my best purchases in a long time, because it’s very compact, the software works great on Linux and Windows (which is handy because my Windows machine always had a lot more RAM), and the support folks are awesome at addressing issues (I reported a few wishlist items, and a couple of bugs, and I think the only one that took a long time was a licensing issue). And since supporting and maintaining those tools is not my main hobby, I’m okay with accepting some closed source tools if they allow me to build more open source.

(I have also ordered a Glasgow, obviously – I mean, Hector working on it made it kind of an obvious thing for me to do – but I have had the Logic Pro for multiple years and the Glasgow is expected to deliver next year, so…)

Tip-Ring-Ring-Sleeve Man-In-The-Middle

So anyway, last year I started looking at ways I could make my life easier, and since I was already getting stuff printed at JLCPCB (not a sponsor), I thought I would try to at least build one of the devices I needed: a board that would let me pass through two TRRS connectors while “copying” the signals to the Saleae. The end result is in the following pictures (two different versions of the same concept — originally I made the board a bit too long, and the connector didn’t fit quite right).

As you can see from the image on the right (or the bottom on mobile), the main use I have for this is to be able to observe the transmission between computers and glucometers, since that has been my main target for reverse engineering over the years, and very often they are connected via a TRS 2.5mm or 3.5mm serial plug, similar to the ones Osmocom introduced to the world of open hardware many years ago.

The board is simplistic, but also provides a few features to make my life easier. First of all, it uses TRRS connectors, because they are compatible with the good old TRS connectors, but also support more modern ones. Secondly, it only has 3.5mm plugs, on purpose: finding 2.5mm cables is annoying, but finding 3.5-to-2.5 cables and vice versa is easy, and that’s what I end up using it with.

Finally, the killer feature for me is that switch, which selects between CTIA and OMTP ordering of the connectors. If you’re not aware of it, over time two separate standards have been in use for wiring four-conductor TRRS cables, with one of them often incorrectly referred to as “Android” or “Samsung” and the other referred to as “Apple”. The CTIA/OMTP naming is the correct naming, and basically what the switch does is just change which of the two pins of the connector is provided as ground to the Saleae.

Oh yeah, and eventually, I released it. I did that under 0BSD, because it’s a really obvious design and if someone wants to reuse it, I’m only happy. I have considered whether this should be released under the CERN Open Hardware License and I can’t imagine why it should, but if you want to make an argument for it I’m likely going to be swayed by it, so feel free.

I also originally drew up plans for a USB-to-Serial adapter using the CP2104 — in two versions, one that would just be a simpler USB-to-UART with no level shifting, and one that would be a full-blown level shifting RS-232 port. The latter was something that came to me as an idea after reading a lot of Foone on Twitter, to the point that I sent them some of the untested boards (because I got stuck with the move and stuff, so it took me months to get back to them!)

Unfortunately, something didn’t work out quite right. The serial adapter didn’t work at all in my case, and the RS-232 one keeps browning out, probably because I skimped on the capacitors. I would be ready to do a respin to try this again, but the current chip shortage does not allow me to make more orders for SMT boards with that particular chip on them.

Serial Adapters Harnesses

Not quite related to the Saleae, but related to the recent work on my aircon reverse engineering, was starting to think about labelled wire harnesses for serial adapters. And thanks to Meowy’s reply I went down the rabbit hole looking at professional label makers.

Now, let me be clear, I have had a label maker for quite a long time. I have blogged about it in 2011, when even the simplest of the label makers (Dymo LetraTag) was to be considered a risky investment (that’s what happens when you’re self-employed, coming from a blue-collar family, and in the backwaters of everything — one day I need to write more about that). I still use that label maker, pretty much our whole spice cabinet depends on it.

But there’s a different class of label makers, namely electricians’ label makers, such as the one shown off in the Big Clive video on the side. Of all the features that he shows off in that video, though, it doesn’t cover printing on heatshrink, and how you can use that to make labelled harnesses like the ones shown in the tweet.

So, while the original price I could see (£110) was a bit much to invest in something I wasn’t sure I would be using very often, following Clive’s advice of waiting for Toolstation to have it at a discount (which it did, at £69) was worth it — well worth it for me. Though to be honest, I got some off-brand heatshrink cartridges.

So the first thing I did with these was to create a wire harness for one of the USB serial adapters that I have been working on the aircon with:

For this particular adapter, I also used a 1×5 Dupont plastic block — I did not crimp the cables, they come from a box of male-to-female cables that I bought off AliExpress some time ago, though you can refer to Big Clive on crimping, too. One of the things he shows in this last video is that you can just break the plastic tooth to release the metal Dupont connector. Which is what I did: I broke off the plastic teeth, and re-fit the already crimped cable into the 1×5 housing.

I could have just prepared my own cables, to be honest. But if I wanted to do that, I would probably grab some good quality soft cables like the ones that come with the Saleae Logic, and I don’t have any of that stuff at home.

The other important part is that I used male-to-female cables, because those are the cables I need right this moment: I’m working on the air conditioning reverse engineering, and that means I need to connect the serial adapter straight into the breadboard.

So About That Breadboard

You may have noticed one thing I said: I’m currently working on the aircon reverse engineering with breadboards, so having the cables end with a male connector helps. If you’re familiar with the Saleae Logic, you would know that the harnesses it comes with are 2×4 dupont cables with a female end to connect to boards. Which works great when you’re connecting to I/O pins on a board, or using the provided probes — but not really if you’re doing work on a breadboard.

So in the same vein as I did for the serial adapter above, I also decided to go ahead and make my own set of additional harnesses. They are once again not perfect, as the cables are the same AliExpress cables I used for the other harness, but they will do the job for the time being.

I have kept the colours the same – or at least the closest that I could get to with the AliExpress cables – because the Logic software uses the colours to represent the channels. And that means I don’t need to worry anymore about matching colours with colours all over the place. On the other hand, since working with a breadboard means I don’t have that many different ground positions, I decided to only put one ground wire per block. This is also because I ran out of black cables in the pile of AliExpress ones (I should order more).

This turned out to be extremely useful. Being able to just grab the cables and plug them straight into the breadboard on one side, and the Logic Pro on the other, has been a huge timesaver during my work, both off-camera and during streams.

A Note On Board Design

After this experience, I also want to share some of the conclusions I reached after these trials and errors. The first is that it’s very hard to find single-row Dupont connectors larger than 1×8. That means that in many, many cases it’s much easier to go ahead and use 2×5 connectors if you have multiple pins that need to be connected together.

It also makes sense to space them to “key” them as a single connector, rather than using multiple, out-of-line pin rows, which I have done in the past. Indeed, with the custom CP2102 that I spoke about earlier, I wouldn’t be able to have a single harness, and would instead need two. That was a bad idea for a design, and when I’m going to re-do it (because I am going to re-do it), I’ll make sure the pins are arranged 2×4 instead, so that it can be connected with a custom labelled harness… or one of the harnesses that comes with the Logic!

These are the kind of important and useful notes that I like reading, or finding in videos, and that I wish would be collated in more practical material for wannabe PCB designers. They definitely carry over to my current designs of the air conditioning control board.

Customizing The Software

In addition to the custom hardware dongles I’ve been playing with, I also started looking at using more advanced software for the analysis.

In the Software Defined Remote Control repository, I already had a binary that would be able to receive a CSV exported by Logic and interpret it. Unfortunately, to do that parsing from within Logic itself, I would need to write a C++ extension for it.

On the other hand, for the air conditioning, I just needed to write a Python High-Level Analyzer, which gives me at least some of the decoded meaning of the various bytes in the packets.
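The skeleton of such an analyzer is pretty small. Here is a minimal sketch (not my published code) that just glues six consecutive bytes from the Async Serial analyzer into a single frame, which is the boring part that comes before any actual decoding of the LG fields:

```python
from saleae.analyzers import HighLevelAnalyzer, AnalyzerFrame


class SixBytePackets(HighLevelAnalyzer):
    result_types = {
        "packet": {"format": "packet: {{data.bytes}}"},
    }

    def __init__(self):
        self._bytes = []
        self._start = None

    def decode(self, frame: AnalyzerFrame):
        # Only care about data frames coming from the Async Serial analyzer.
        if frame.type != "data":
            return None
        if not self._bytes:
            self._start = frame.start_time
        self._bytes.append(frame.data["data"][0])
        if len(self._bytes) < 6:
            return None
        packet = " ".join(f"{b:02x}" for b in self._bytes)
        self._bytes = []
        return AnalyzerFrame("packet", self._start, frame.end_time, {"bytes": packet})
```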

I hope that as time goes on, and I find myself reverse engineering different hardware, I will be able to build up a good library of various analyzers — hopefully sharing enough code between them. Which is something I definitely need to engage with Saleae on: it does seem like right now you cannot depend on external Python modules even from HLAs, but it would make sense to be able to use libraries like construct, or even higher level libraries for things like my aircon or some of the glucometers I have worked on before.

Reverse Engineering an LG Aircon Control Panel — Buses and Cars

This is part two of my tale of reverse engineering the air conditioning control panel in our apartment. See the first part for further details.

If you are binging on retrocomputing videos like I’ve been doing myself, you may have the wrong impression that a bus has to have multiple lines, like the ISA and PCI buses do. But the truth is that a single-wire bus is not unheard of or even uncommon. It just means that the communication needs to be defined in such a way that there’s no confusion as to who is sending at any given time. In this case, it’s clear that the control panel is sending six bytes which are immediately (and I do mean immediately) followed by a six-byte response from the HVAC.

So the next step was to figure out what those six bytes were, and thanks to Saleae’s recent licensing of sample high-level analyzers, this became a piece of cake. While I’m not at liberty to share the code, at the time of writing, I ended up writing an analyzer that would frame together the 6 bytes from the panel and the 6 bytes from the HVAC. Once I had that, it was also easier to notice that the checksum byte was indeed the same as other LG protocols, it’s just that it applied separately to the two 6-byte packets, which means there are only five bytes in each message that need to be decoded.

A screenshot of the Logic 2 software showing an analyzed trace with the high level analyzer loaded.

With a bit of trial and error, I already decoded what I think will give me most of the important controls for my plan: how to change the mode between the aircon, heat pump, fan, and dehumidifier, and how to change the fan speed. The funniest part is that the “Auto” mode is actually not a mode at all, and just means that the thermostat appears to send the “aircon” or “heat pump” command as needed.

What got even more interesting, is that if you leave the control panel by itself, after a few minutes it appears to notice the lack of an HVAC connected, and goes into an error state where it alternates the display between “Ch” and “3”. Either it’s reporting its own channel for diagnostics (assuming it’s misconfigured) or it’s just showing a particular error status. In either case, that threw a spanner in my plans.

The first problem is that obviously you wouldn’t be able to connect the 12V data wire to the ESP32 directly. That’s kind of obvious: the ESP32 is a 3.3V microcontroller, and if you tried to use a 12V wire with it, it’d just… go. My original intent was to use two optocouplers: one to receive the data from the control panel, and the other to inject my messages onto the wire. But that won’t work quite the same way for a bus, and while I could try to build up the right circuitry with discrete components, I would rather use a ready-made transceiver.

The problem with that is that transceivers are made for specific buses, and so the first question is to find the bus that best matches what LG is using. A lot of HVAC systems (particularly at industrial scale) use Modbus over RS-485 — I have experience with this since the second company I ever worked for is a multinational that works in the industrial HVAC sector, so I learnt quite a bit of how those fit together. But an RS-485 connection would require two wires, since it uses differential signaling, and that’s already excluded.

Going pretty much by Google searches, I finally nailed down something useful. In the automotive industry, there’s a number of standards for on-board diagnostics (OBD). Possibly the most famous (and nowadays most common) of those is the CAN bus, which is widely used outside that one industry, as well. LG is not using that. But one of the other protocols used is ISO 9141-2, which includes a K-Line bus, which according to Wikipedia is an asynchronous serial connection over a single bidirectional wire without handshakes — though it uses a 10.4 kBd signal which is… exactly 100 times faster than the LG signal.

Through these, I found out about the LIN (Local Interconnect Network) bus, which is also used in automotive: it specifies a higher-level implementation on top of ISO 9141-compatible electrical signaling, and happens to be a good starting point for this work. Indeed, there are a number of LIN bus transceivers that are pretty much oblivious to the addressing and framing of the protocol — on purpose, because the specifications have changed over the years. But what they are good for is connecting to a 12V, recessive-high bus, and providing microcontroller-level RX and TX signals.

An example of these transceivers is Microchip’s MCP2003, so I decided to set myself up to redesign the board based on that. But since the control panel also needed to receive “acknowledgements” from the HVAC, it meant that each “smart controller” needs two transceivers: one where it fakes the controller to the HVAC, and another one where it fakes the HVAC to the controller. And both of those needed to have the ability to just go into a “lurking” state where they wouldn’t be sending signals if I flipped a physical switch.

Screw It, I’m Doing It Live

So here’s where things got a bit more interesting in multiple directions. In the days just before this work, I was asked for a few pointers about reverse engineering — and unfortunately I don’t know how to “teach” RE, but I can at least “go through the motions”. After all, that was the more interesting part of my Cats Protection streaming week, so once the DigiKey order arrived with the transceivers and all the various passives to add around them, I decided to set up a camera, and try breadboarding the basic circuitry.

Now, setting aside the fact that I do not particularly enjoy streaming with an actual camera, and indeed the end results left a lot to be desired, the two-hour stream was fairly productive. I found that the PL2303 USB-to-serial adapters actually work quite well at both 100 and 104 bps, and that indeed the transceiver mostly works fine.
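For reference, this is roughly how the bench side looked from the software point of view: just pyserial with a non-standard baudrate (the device path is an example, and the bytes written are arbitrary):

```python
import serial

# 104 baud, 8n1: a non-standard rate, which needs an adapter/driver that
# accepts arbitrary baudrates (the PL2303 and CH340 both do).
bus = serial.Serial("/dev/ttyUSB0", baudrate=104, bytesize=8,
                    parity=serial.PARITY_NONE, stopbits=1, timeout=2)
bus.write(bytes([0x00] * 6))  # poke the bus with six arbitrary bytes
print(bus.read(6).hex())      # and see what, if anything, comes back
```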

It also showed an interesting effect that I did not expect: as I said earlier, after a few minutes without getting an answer from the HVAC, the control panel enters into an error state (Ch/3). I assumed that what it needed was a valid packet from the HVAC, with checksum and information. Instead, it seems like just filling up a buffer, even with invalid packets, is enough to keep the control panel working: as I typed random words onto the serial port, while connected to the bus, the Ch/3 error vanished, and the panel went back to a working state.

This was surprising for one more reason: at least some of the packets sent from the HVAC to the panel had to include the capabilities the HVAC system has to begin with. The reason why I knew that is that the control panel appears to have a lot more functions when it’s running standalone, compared to when it’s installed on the wall. Things like a “power” fan mode for the aircon, the swiveling ventilation, and so on.

Spoiler: it turned out to indeed be the case: the first two commands sent from the panel to the HVAC appear to be some sort of inquiry, that provide some state to the panel to know which features are supported, including the heat-pump mode and the different fan speeds. But for now, let’s move on.

Before I could go on and try to figure out which bit related to which capability, I hit a snag, which is where I got stuck at the end of the stream: sending the character ‘H’ on the serial port (a very random character that just happens to be the start of the string “Hello, world!”) showed me something was… not quite right.

A screenshot from Logic2 showing a 0x68 sequence being interpreted as 0x6C.

This is not easy to see, aside from the actual value changing, but in the image above the first row (Channel 0) is the 12V bus (which you can read on the fourth line is actually 10V), the second and fifth rows (Channel 4) are a probe connected to the RXD pin of the MCP2003, and the third and sixth (Channel 5) are a probe connected to the TXD pin (which is in turn connected to the TXD of the USB-to-serial adapter).

Visibly, the problem is that somehow the bus went from “dominant” (0V) to “recessive” (nominally 12V, 10V here) too fast, making the second and third bits look like 1s instead of 0s. But why? My first thought was that it was an electrical characteristic I missed – I did skimp on capacitors and diodes on my breadboarding – but after the stream terminated, I grabbed my Boox, and checked the datasheet more carefully and…

1.5.5.1 TXD Dominant Time-out

If TXD is driven low for longer than approximately 25 ms, the LBUS pin is switched to Recessive mode and the part enters TOFF mode. This is to prevent the LIN node from permanently driving the LIN bus dominant. The transmitter is reenabled on the TXD rising edge.

MCP2003/4/3A/4A Datasheet, DS20002230G, page 10

25ms is nearly exactly how long the dip to dominant state is on Channel 0 (and about the same on Channel 4): it’s also nearly exactly 2.5 baud.

A Note About Baudrate

I have complained loudly before of how I’m annoyed at people who think those younger than them know nothing and should just be made fun of. I don’t believe in that, and I think we should try our best to explain the more “antique” knowledge when we have a chance.

Folks who have been doing computers and modems well before me appear to love teasing people about the difference between “baudrate” and “bits per second”. The short version of that is that the baud rate relates to the speed of sending a single impulse, while the bits per second (bps) is (usually, but not always) meant to be the speed of the actual data transmitted. The relation between the two is usually fixed per protocol, and depends on how you send those bits.

In an asynchronous serial protocol (including RS-232 and this LG abomination), you define how you send your bits with an expression such as “8n1” or “7-odd-2” (also called the framing parameters) — or a number of other similar expressions with different values in them. These indicate that each character sent is respectively eight or seven bits in size, that the parity is not present in the first case and is odd in the latter, and that the first includes only one stop bit while the second provides two. In addition to this, there’s always a single start bit.

8n1 is probably the most common of the framings, and that means you’re actually sending 10 bits for each character. A baudrate of 9600 Bd thus gives you 960 characters per second of raw throughput. The 104 figure for LG is the actual baudrate, as I can measure one of the impulses from the original control panel at 9.745ms — which actually would put it around 103 Bd.

Which is where my assertion that 25ms is nearly exactly 2.5 baud comes from — 2.57 to be a bit more precise: you take the length (25ms) and divide it by the time needed to send a single baud (9.745ms).

What this means in practice is that the MCP2003 series (including the more modern MCP2003B that includes the same time-out behaviour) has a minimum baud rate as well as a maximum one. The maximum one is documented in the datasheet as 20 kbps, but the minimum is affected by this timeout: a frame of all zeros would be the worst case scenario in this condition, as the line would be asserted low (“dominant”) for the longest time. While theoretically you can define framings the way you prefer, the common configurations vary between 5 and 9 data bits per frame (though I would have no clue how to process the 9 bits per frame to be honest!) — which means that the maximum number of consecutive spaces (‘0’) would vary between 6 and 11.

Why six and eleven? Well, the “start” baud is also a space (logical zero) – which means that if your framing is 5n1, the 0x00 value would be sent with six “spaces”. And if you use nine data bits per frame with even parity, 0x000 would then be followed by a “space” in parity (to maintain the number of ‘1’ bits even), bringing it up to 11 (start, nine zeros, and parity).

The minimum baud rate for a certain framing configuration is thus calculated by dividing the maximum number of consecutive spaces by the timeout in seconds (0.025), which leads to a minimum baud rate of 240 Bd when using 5n1, 440 Bd for 9e1, and 360 Bd for the most commonly used 8n1 framing. Which is over three times faster than what these LG units are using.
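
To make the arithmetic concrete, here is a quick back-of-the-envelope sketch in Python of the same calculation (my own illustration, not anything from the datasheet): the worst-case run of space bits for a framing, the minimum baud rate it implies with a 25ms time-out, and how far the 104 Bd LG signal overshoots it.

```python
# Back-of-the-envelope check of the numbers above (my own sketch, not from
# the datasheet): worst-case run of consecutive "space" bits per frame, and
# the minimum baud rate implied by the ~25 ms TXD dominant time-out.

TIMEOUT_S = 0.025  # MCP2003 TXD dominant time-out, approximately 25 ms

def worst_case_spaces(data_bits: int, parity: str) -> int:
    """Longest run of '0' (space) bits: start bit, all-zero data bits, plus a
    '0' parity bit if even parity is used (zero ones is already even)."""
    spaces = 1 + data_bits
    if parity == "even":
        spaces += 1
    return spaces

def minimum_baudrate(data_bits: int, parity: str = "none") -> float:
    return worst_case_spaces(data_bits, parity) / TIMEOUT_S

for framing, bits, parity in [("5n1", 5, "none"), ("8n1", 8, "none"), ("9e1", 9, "even")]:
    print(f"{framing}: minimum {minimum_baudrate(bits, parity):.0f} Bd")

# At 104 Bd with 8n1, a 0x00 byte keeps the line dominant for nine bit times:
print(f"0x00 at 104 Bd: {9 / 104 * 1000:.1f} ms dominant")  # ~86.5 ms >> 25 ms
```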

I Need A New Bus Transceiver

Since I couldn’t use the MCP2003, I ordered a few MCP2021. Note that Microchip also says that these are not recommended for new designs, suggesting instead the ATA663232 — which, as I’ll get to, combines all of the disadvantages of the various options for LIN bus transceivers.

When I received the meter, I decided to take another stab at streaming, setting up the emulator on camera:

If you watch the whole video you will see me at some point put a finger on the chip and yelp — turns out I ended up with a near-dead short on its embedded regulator. Thankfully, since the chip is designed for the automotive market, the stress did not cause it to fail at all, just… overheat. And as I showed on stream, I did manage to keep the control panel running with my “emulator”, although I did note some noise on the I/O towards the end.

So a little bit more exploration later told me that a) the PL2303 seems to be a bit unreliable with the 3.3V without tying the VREG with the 3.3V coming from the device, and b) even on the CH341 I would get some strange noise in addition to the signal. I think the reason for that is that the chip uses a comparator against its own regulator to decide whether the transmitter should be on. Since, as Monty and Hector suggested, it’s a bad idea to tie multiple regulators together, I decided that even the MCP2021 is not the transceiver I wanted.

Unfortunately, that made it harder to find the right transceiver. Microchip’s suggested replacement, the ATA6632xx series, has all of the disadvantages, as I said: it has the “TXD Dominant Timeout” feature (so it cannot send the 104bps signal I need to send), it includes a voltage regulator that cannot be disabled, and it is only available in a VDFN package that is not possible to hand-solder.

On Digi-Key (which is by now my usual supplier), Microchip’s MCP20xx series are the only PDIP-8 through-hole options, so the next best thing is SOIC-8, which is surface mount (so not easily breadboardable) but still hand-solderable (with a steady hand, a magnifying glass, and a fine iron tip). Looking at those, I found at least two that fit.

ON Semiconductor’s NCV7327 was a very obvious choice because they explicitly say in the features list «Transmission Rate up to 20 kbps (No low limit due to absence of TxD Timeout function)», and it was the only one I found that explicitly notes that the TxD Timeout imposes a floor on the speed (as I explained above). Unfortunately, the SOIC-8 version was not available on Digi-Key at the time of order, with a 22-week backorder.

So instead, I settled for Texas Instruments’ TLIN1027DRQ1. This is pretty much… the same. From what I can see, both ON’s and TI’s SOIC-8 devices are pin compatible, and they are nearly pin compatible with Microchip’s SOIC-8 variants, insofar as the power, bus, RXD, and TXD pins are in the same position.

There is, though, a rake just waiting for you there. The Enable/Chip Select pins on both the TLIN1027DRQ1 and the NCV7327 do not correspond to the MCP20xx Transmission Enable semantics, despite sharing the same position. With the MCP20xx you could leave a transceiver connected to a chatty bus, with the TXEnable off, and you would still receive the traffic from the bus.

But with the other two, you’re turning off the whole transceiver at once, which wouldn’t be too bad if it wasn’t that both of them pull TXD to ground (dominant) if you leave it unconnected. Again, this isn’t a big problem in itself: as long as the firmware is told not to transmit when the bus is connected directly between the panel and the HVAC, nothing should be transmitted, right?

But this does break one assumption I was making: if I disable the smart controller board, I want to be able to remove the ESP32 devkit altogether. This is important because, beside OTA (Over The Air) updates, I may need to disconnect the ESP32 to flash new firmware onto it. Which means I don’t want to rely on the firmware running just to keep the bus from being held busy.

A schematic diagram of the Panel-side bus transceiver block.

So what I ended up adding to the design is a way for the bus selector to decide whether transmission is allowed on the transceiver. I think this is the first time I have even considered the idea of using a 74-logic component in my designs (to the point that I had to figure out how to use it with the EAGLE-provided symbols — hint: use the invoke command), but this seemed to me the easiest option to implement what I needed.

Tying both inputs of a NAND together to make an inverter is literal textbook electronics, but it turns out to work very well here, since the cheapest 74-logic NAND chip I found contains four gates, and I only need one other gate besides the inverter.
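
For those who want to see the logic spelled out, here is a tiny truth-table sketch in Python of one plausible two-gate arrangement consistent with the description above (not necessarily my exact schematic): a tied-input NAND inverts TXD, and a second NAND combines that with the enable line, forcing the output recessive whenever the switch disables transmission.

```python
# Truth table for one plausible two-gate arrangement (an illustration, not
# necessarily the exact schematic): NAND(enable, NOT txd) forces the output
# high (recessive) when enable is low, and follows txd when enable is high.

def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def gated_txd(txd: int, enable: int) -> int:
    inverted = nand(txd, txd)      # tied-input NAND acts as an inverter
    return nand(enable, inverted)  # recessive (1) whenever enable is 0

for enable in (0, 1):
    for txd in (0, 1):
        print(f"enable={enable} txd={txd} -> transceiver TXD={gated_txd(txd, enable)}")
```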

Note that of course this is only one of the “logical blocks” of the board — and actually not even its final form. As I’ll get into in more detail later, this turned out to be only one of the possible solutions, and (at the time of writing) there’s no guarantee that it is actually the one I’m going to be using.

Service Providers, Business Sustainability, And Society

I’m not sure how many people consciously think about the business plans of the service providers that they start using, or at least relying upon. I don’t do it terribly often, but I do sometimes, and I thought I would share some words on the topic, because I do think that the world would be a better place if we considered the effect of our choices on the wider world.

I have for instance wondered aloud, over the years, on Twitter and even on this blog, about the fintech company Curve. Their services are actually interesting and valid… but they do not offer any interesting value at a premium — I would never pay for their “Curve Metal” tiers, because it just doesn’t add up. I just couldn’t figure out how they were expecting to keep running, given I expect the majority of their “customers” are consumer end-users that would follow the same procedure I followed: sign up for the service using the £5 offer, use it for the 90 days of free cashback, possibly use it a time or two while traveling, and otherwise just keep it as a backup card. Before the pandemic I also wrote how they were giving out more free money, but more recently they decided to start crowdfunding. And they became very loud when they did. Which is what reminded me I had a half-drafted post on the topic (this post), which I should probably resurrect (which I did, since you’re reading this now).

Before going back to Curve and their crowdfunding, I want to point at two sayings that you’ll keep finding around you, when you discuss businesses, products, Internet, and privacy: «If it sounds too good to be true, it probably is» and «If you are not paying for it, you’re not the customer, you’re the product». These are good places to start a discussion, although I think there is a lot of nuance that is lost when trying to (over) simplify with these.

Full disclosure: I work for a big company that, primarily, offers services to end users free with ads, though the product I personally work on not only does not relate to ads, but does not even have ads in it. And while my previous employer was another big company that is part of the “AdTech” business (and in there I did work on ads systems for a long while), I have discussed ads in the past, and I even ran ads on this blog (back in the days before stable employment), so you can imagine that what I’m going to be writing about is my personal opinion and does not represent that of my current, past, or future employers.

So why do I think it’s important to figure out the sustainability of service providers? Well, because it becomes a problem for the whole of society when a fraudulent or scammy provider gets a certain amount of market share, even if sometimes not evenly and not in a way that most people would be able to connect together. For instance you can take the examples of Enron, Bernie Madoff, Theranos, or Wirecard — organizations that promised too-good-to-be-true services and profits, and ended up bust, with different blast radiuses. For the last one we don’t even quite know yet what the blast radius will be once things settle down: the German financial services environment is likely going to be reshaped by last year’s scandal, and so it appears will be EY.

Though this is clearly not limited to financial services (VPN providers seem to be pretty much in the same position), it does look like the likes of Curve and Revolut are easily the most visible cases where a company apparently lacked a sustainable business plan, and decided to turn to a crowdfunding campaign — Curve just had one this past May, and it was so noisy that I know a couple of people who went on to find a way to delete their account (and the app) simply because they got tired of their pushy notifications (not a typo).

Now, that might not sound too bad — after all, crowdfunding for the most part just means someone is willingly going to pay to subsidise other people’s “free money”. But the next thing that Curve did after that was to increase referral bonuses for new users to £20, from the previous £5, and that smells even fishier to me — because it sounds like trying to bring in a mass of users hoping that enough of them can’t figure out that the premium options of Curve are not worth the money.

On the other side of the tracks, Revolut has been pushing more and more for cryptocurrencies, which I’m not going to even pretend is a neutral thing. I care enough about the environment that their consumption alone makes me angry, but even more so, I find that the amount of scams related to cryptocurrencies at this point are wide enough to show that the whole concept is hostile to society. I do not support nor recommend Revolut to new users unless they live in countries like Ireland where there is no other option – in London, using Revolut feels like subsidizing scammers by lending respectability to cryptocurrencies.

But at this point, neither of those appears to have reached the level of a full consumer scam, so it should be fine, right?

Well, let’s take a different example with the VPN market. I have complained over on Twitter a few times about how I blame us geeks for the amount of VPN scams that are out there. Privacy maximalists tend to scare people with the idea that your ISP, the Starbucks, or the airport lounge you’re using can see everything you do — and while it is definitely the case that there’s a lot more data going around than you may think, ads such as those ExpressVPN pays the otherwise excellent No Such Thing As A Fish podcast to air, which suggest that your ISP would be able to tell what you’re Googling, are not just falsehoods, but proper FUD.

But even accepting that ExpressVPN has no ulterior motive and is totally legit – I have no idea about that, I only used them before while in China – and leaving aside the fact that VPNs have huge targets painted on their backs, there’s still the matter that you need to trust your VPN provider. Which may or may not be more trustworthy than your home ISP — the two of them having pretty much the same power. Most of the review websites seem to be talking more about commissions than trust, and the worst part is that there is nothing that allows you to verify their statement that they are not logging your traffic in the same way they keep insisting your ISP is doing.

So how do you trust a VPN provider? Well, for a start you may want to consider who their founders are and how they get their money. And you’d be surprised how many dots you can connect this way. For instance, last year CNET wrote about Kape Technologies, a company that bought a Romanian VPN service called CyberGhost. In that article, they also noted that in addition to CyberGhost, the same parent company bought two more VPNs:

After buying CyberGhost, Kape then bought VPN ZenMate in 2018 and more recently Private Internet Access, a US-based VPN, in a move which Erlichman said in a press release would allow Kape to “aggressively expand our footprint in North America.”

Now, the problem is less about a single parent company owning multiple VPN services as if they were different brands — that happens all the time in many other fields. Just look at the relationship between Tesco Mobile and O2, or banks such as Halifax and Lloyds. But the rest of the article does make for a good build-up of why the whole situation is a bit suspect.

But more importantly, you may have heard of Private Internet Access before — they are the company that started heavily sponsoring Freenode a few years ago. And if you have been paying attention to Free Software projects’ communication in the past few months, you probably know by now that Freenode is a trash fire now. So given those connections, would you trust anything that has connections to these organizations and people? I clearly wouldn’t.

This same problem with trust and business sense applies to other businesses. With the exception of B Corporations, most companies out there are intended to make money. If nothing else, they need to make money so that they can pay the wages of the people working there. So I don’t generally trust companies that appear to be giving everything away — and rather prefer those that, if they are making money with ads, say so outright.

In the case of fintech services — Wise (formerly known as TransferWise) is my example of choice for a company that is transparent when it comes to the costs associated with their services, and makes a good case for why they charge you, and how much. I really wish more of them did the same, because it would make it easier for people to choose how much trust to put in a company. Unfortunately it appears that the current trend in the market is to push for as much growth as possible, so companies can grab a captive audience before turning the monetization screws.

Important note: this blog post was written before Wise announced they intend to go public (it was previously rumored, but I didn’t spot that). I guess I should now disclose that I will most likely consider buying some stock of the company, though probably not on the IPO day. We shall see. As I said, I do like their business sense.

Going back for a moment to that «if you’re not paying for it, you’re the product» as well — I don’t agree in full, but this is something that people do need to be clued in to look out for. In particular, I don’t think that ad-supported businesses should disappear, and that everything should be hidden behind a paywall, because I do think that having wider access to information without making it costly is a good thing. But I also think that there are services that often cross the line into being creepily interested in your data, rather than “trading it” for useful information.

But I also think the scrutiny is often placed more on the big, established companies rather than the “scrappy” startups, or the more consulting-like companies. Heck, a few of you reading this are probably already ready to complain that both my current and past employers are seen as data hungry — but I can tell you that neither company, at least during my tenure, would ever have someone state on a stage that collecting data from IoT sensors and just throwing it at an ML pipeline to gather unexpected insights is a good idea, as it would go against every one of the privacy and data handling trainings and commitments…

And yet John Roese from Dell EMC stated exactly that in his opening thoughts for LISA 16 (go to minute 44 in the open access video), in what sounds terribly like advice to startups. To be honest, that’s not the only cringey thing in that opening talk — from a technical point of view, his insistence that persistent memory means you can’t just reboot a computer to reset the state of memory (as if re-loading the data in memory from scratch wouldn’t happen on request, whether it is persistent or not) is probably an even worse statement.

What I’m trying to say is that you need to be sure who your friends are, and it’s not as easy as expecting that all small players are ethical and all big ones are not. And asking yourself “how are they making money, if at all?” is not just allowed — it should sometimes be considered a necessity.

Reverse Engineering an LG Aircon Control Panel — Introduction

I like reverse engineering stuff. It’s not just the fact that it’s a nice puzzle to solve, but I enjoy the thrill of “Oh, that’s how that works.” I’m sure I’m not alone, as can be clearly seen following marcan’s Asahi Linux work, or following Foone on Twitter, or Big Clive on YouTube (and many, many others).

Sometimes, a lot more rarely, my reverse engineering is actually geared towards something I want to make use of, rather than just for the sake of finding answers — this is one of those cases. If you have been following me on Twitter or decided to watch me work on this live on Twitch, you probably already know what I’m talking about. If not, be warned that this is going to be the first part of a (possibly long) series of posts on the same topic. It turned out to be very long for a single post, and I decided to split it instead.

You see, when we moved from the last apartment, we sold our Nest smart thermostat to a friend. The new apartment has an aircon system with heat pump, rather than a “classic” heating system, which is really important as the balcony can easily reach 40°C in the mornings when the sun shines. And unlike in the US, where thermostats are pretty much standardized, Europe’s landscape of thermostats is different enough that Nest gave up, and does not support aircon systems.

Aside: I do have a bit of a rant about Nest Thermostats in Europe, but some of that might be a bit tricky to phrase for me without risking breaching confidentiality with my previous employer, which I don’t want to do. So I will leave a question here for European Nest Thermostats users: can you finally enable hot water boost with the Google Home app?

To be honest, this also kind of makes sense: in a flat that is cooled and heated with an HVAC, it makes sense to have multiple thermostats so that each room can set a different required temperature. If we’re spending the evening in the living room, what’s the point of heating up the bedroom? If I’m on vacation and not spending time in the office, why would I turn on the air conditioning? And so on.

Unfortunately what we ended up with is three thermostat units from LG, model number LG-PQRCUDS0 (provided for ease of searching and finding this blog post), which are definitely not smart, and also not convenient. These are wired, non-smart control panels that do support features like timers, but do not provide any way to control them without tapping on the screen. As far as I know, these are configured to read a temperature sensor that is not on the panel itself, but on the other hand, the placement of those sensors is a bit on the unfortunate side: in particular in the bedroom it appears to be located in a position that is too natural to fit a wardrobe in, making it always register a higher temperature than the room actually has.

This had been particularly annoying during the winter, but it was proving to be worse during the summer: as I said, the temperature on the balcony can reach 40°C in the morning, as we’re facing east and it’s an all-glass external wall. That means that the temperature inside the apartment can easily reach 30°C quite suddenly. This is not good for electronics to begin with, but it’s doubly bad for things like food and medicine, including insulin, which I very much depend on.

While we could just try leveraging the timer mode to turn on the AC in the morning, the complication of where the sensor is makes it very hard to judge the temperature to set it at. And since, as Alec points out on the video, the thermostat’s job is only to turn something on or off (in theory, at least)… well, there has to be an easier way.

So I embarked on this quest of reverse engineering my aircon control panel, with the intent of introducing an ESPHome-compatible add-in that would allow me to control the HVAC through Home Assistant.

Inspection

The first thing to do when setting off to reverse engineer something is to figure out what it is, whether there is any documentation for it, and whether someone else has already reverse engineered it. The model number, as I said, is LG-PQRCUDS0, and LG has user and installation manuals online describing it as a Deluxe Wired Remote Controller (together with the -B and -S variants of the part number).

A reverse image search for the panel actually seemed to strike gold at first, as this Instructables post showed exactly the same UI as mine, and included a lot of information about the protocol. But the comments also pointed to a couple of different models that all seemed similar but a bit different. So instead of going ahead and building against the already-reversed protocol, I wanted to confirm how it all worked myself.

A close up of the door behind my LG aircon control panel showing a JST ZR connector, and a yellow-red-black cable going to the wall.

The first question is what electrical “protocol” it’s using. The back of the panel has a door that hides the inbound connection from “the wall” (that is, from the actual HVAC unit): three wires, terminating in a JST ZR connector.

With my multimeter I could confirm that the voltage would be around 12V — but I couldn’t confirm whether it would be differential data or what else, since I’m still using an older multimeter and it doesn’t have any option to indicate there’s a signal on a wire. If someone has a good suggestion for a multimeter that does that, please leave a comment below the video in this post as I’d love to get a good one.

Now this is good news, overall. The fact that the plug, and the cable itself, can be bought off the shelf means I don’t have to take risky approaches, which is great, given that we’re renting, so any reverse engineering and replacement implementation needed to be non-destructive and reversible.

So I took out my Logic Pro, a very long USB 3.0 cable, and I ordered just enough components from Digikey to debug this thing. And a bench power supply — because I didn’t have a bench power supply, and given this thing needed 12V, it sounded like something handy to have for this. The end result is the following:

With this connected, I used the Logic 2 software to check the voltage levels, and figure out that the yellow wire is data, while the red wire (in the middle) is 12V supply. The data turned out to, indeed, be a 104 Bd serial connection, which would make it share a lot of the information from the previous reverse engineering…

Except that something was off: what I could see on the wire was a burst of 12 bytes in a single stream, exactly once a second, which I assumed at that point to be unidirectional from the panel to the HVAC. But when trying to verify the checksum it didn’t match what the instructions on the other project suggested: sum everything, modulo 256, and xor with 0x55 (the confusing ‘U’ in the various descriptions is actually a bit pattern). So while I could figure out that the first byte seemed to include the mode of operation, and the third one appeared to include the fan speed, I couldn’t figure out for the life of me the checksum, so I thought I wouldn’t be able to send commands to the HVAC to do what I wanted.
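
For reference, this is the checksum rule as I understood it from that write-up, sketched in Python; it did not match my captures, so treat it purely as an illustration of what I was testing against (the frame bytes are made up):

```python
# Checksum rule as described in the other project's notes (sum modulo 256,
# XORed with 0x55); it did not match the frames I captured, so this is only
# an illustration of what I was checking against.

def lg_checksum(payload: bytes) -> int:
    return (sum(payload) % 256) ^ 0x55

# Hypothetical frame, made-up bytes just to show the calculation:
frame = bytes([0xA8, 0x00, 0x12, 0x40, 0x00])
print(hex(lg_checksum(frame)))
```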

On the other hand, in the worst case scenario I could have just replayed the commands I could record from the panel, so I decided to try my luck at drawing and ordering a PCB that would have just enough components for me to play around with.

Drawing the PCB

I’m far from being even a passable expert on electronics, but I could at least figure out some of the things I wanted from a “smart controller” unit to attach to this aircon. So I started with a list of requirements:

  • Since I wanted it to use ESPHome, it should be designed around an ESP32 module. I already attempted this once with the acrylic lamps, and I have yet to get a working board out of that. On the other hand, this time I’m much less space constrained, so I decided to go for a full DEVKIT module, one of those that already includes the regulators, USB port, and serial adapter on board. This turned out to be a further blessing in disguise, since the current chip shortage appears to have affected the CP2104 module I used in my previous design, and I wouldn’t have been able to replicate it.
  • While I don’t expect that the HVAC power supply is significantly limited in power (after all, there are even more deluxe WiFi-enabled controllers in other versions), I didn’t want to increase the load on the 12V supply significantly. Which meant I went for the more complex, but also more efficient, route of building in a buck converter to 3.3V to power the ESP32.
  • Also, since I know all too well that relying on my code for “enjoyment-critical” use cases can be frustrating, I wanted a physical way to hard-disconnect a possibly misbehaving automation, and go back to using the old controller, without having to fiddle with cables.

With these conditions, and the assumption that the twelve bytes I was seeing were being sent directly from the controller to the HVAC, I drew and manufactured the above board. Feel free to spot the error at the top of the board, if you may.

Now, since JLCPCB’s turnaround is usually fairly fast, I went ahead and got that manufactured while I was still fighting with figuring out the checksum. So when the boards arrived and I populated them, I was planning to just keep changing settings to find more possible combinations of bytes, to see how the checksum would behave.

And that’s when I found out I was very wrong in my assumption, and it’s possible that either the reverse engineering notes I’ve seen for other models are missing a big chunk of information, or LG has many different ways to achieve roughly the same endgame. Once I powered up the panel from the bench supply, I could see that the panel was only sending six bytes, rather than the twelve I expected. It’s a bidirectional communication on a single wire, a bus.

That meant going back to the literal drawing board, finding the right components to implement this, and starting what turned out to be a much larger sidequest of complicating matters.

Ten Years of IPv6, Maybe?

It seems like it’s now a tradition that, once a year, I will be ranting about IPv6. This usually happens because either I’m trying to do something involving IPv6 and I get stumped, or someone finds one of my old blog posts and complains about it. This time is different, to a point. Yes, I sometimes throw some of my older posts out there, and they receive criticism in the form of “it’s from 2015” – which people think is relevant, but isn’t, since nothing really changed – but the occasion this year is celebrating the ten-year anniversary of World IPv6 Day, the so-called 24-hour test of IPv6 by the big players of network services (including, but not limited to, my current and past employer).

For those who weren’t around or aware of what was going on at the time, this was a one-time event in which a number of companies and sites organized to start publishing AAAA (IPv6) records for their main hostnames for a day. Previously, a number of test hostnames existed, such as ipv6.google.com, so if you wanted to work on IPv6 tech you could, but you had to go and look for it. The whole point of the single day test was to make sure that users wouldn’t notice if they started using the v6 version of the websites — though as history will tell us now, a day was definitely not enough to find that many of the issues around it.

For most of these companies it wasn’t until the following year, on 2012-06-06, that IPv6 “stayed on” on their main domains and hostnames, which should have given enough time to address whatever might have come out of the one day test. For a few, such as OVH, the test looked good enough to keep IPv6 deployed afterwards, and that gave a few of us a preview of the years to come.

I took part in the test day (as well as the launch) — at the time I was exploring options for getting IPv6 working in Italy through tunnels, and I tried a number of different options: Teredo, 6to4, and eventually Hurricane Electric. If you’ve been around enough in those circles you may be confused by the absence of SixXS as an option — I had encountered their BOFH side of things, and got my account closed for signing up with my Gmail address (that was before I started using Gmail For Business on my own domain). Even when I was told that if I signed up with my Gentoo address I would have gotten extra credits, I didn’t want to deal with that behaviour, so I just skipped the option.

So ten years on, what lessons did I learn about IPv6?

It’s A Full Stack World

I’ve had a number of encounters with self-defined Network Engineers, who think that IPv6 just needs to be supported at the network level. If your ISP supports IPv6, you’re good to go. This is just wrong, and shouldn’t even need to be debated, but here we are.

Not only does supporting IPv6 require using slightly different network primitives at times – after all, Gentoo has had an ipv6 USE flag for years – but you need to make sure anything that consumes IP addresses throughout your application knows how to deal with IPv6. For an example, take my old post about Telegram’s IPv6 failures.

As far as I know their issue is solved, but it’s far from uncommon — after all it’s an obvious trick to feed legacy applications a fake IPv4 if you can’t adapt them quickly enough to IPv6. If they’re not actually using it to initiate a connection, but only using it for (short-term) session retrieval or logging, you can get away with this until you replace or lift the backend of a network application. Unfortunately that doesn’t work well when the address is shown back to the user — and the same is true when the IP needs to be logged for auditing or security purposes: you cannot map arbitrary IPv6 into a 32-bit address space, so while you may be able to provide a temporary session identifier, you would need to have something mapping the session time and the 32-bit identifier together, to match the original source of the request.

Another example of where the difference between IPv4 and IPv6 might cause hard-to-spot issues is anonymisation. Now, I’m not a privacy engineer and I won’t suggest that I have a lot of experience in the field, but I have seen attempts at “anonymising” user IPv4s by storing (or showing) only the first three octets. Beside the fact that this doesn’t work if you are trying to match up people within a small pool (getting to the ISP would be plenty in some of those cases), it does not work with IPv6 at all — you can keep only 120 of its 128 bits and still pretty much be able to identify a single individual.
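
Just to make the comparison concrete, here is a minimal sketch using Python’s ipaddress module (the addresses are made up): the same “drop the tail” trick leaves 256 candidates either way, which is a meaningful pool for IPv4 but practically a single device for IPv6.

```python
# A minimal sketch of the "truncation" discussed above, with made-up
# addresses: dropping the last octet of an IPv4 address versus keeping
# 120 of the 128 bits of an IPv6 address.
import ipaddress

v4 = ipaddress.ip_interface("203.0.113.42/24")
print(v4.network)   # 203.0.113.0/24 -- 256 candidates, shared by many users

v6 = ipaddress.ip_interface("2001:db8:1234:5678:abcd:ef01:2345:6789/120")
print(v6.network)   # still 256 candidates, but within a /64 that usually
                    # belongs to a single subscriber -- hardly anonymous
```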

You’re Only As Prepared As Your Dependencies

This is pretty much a truism in software engineering in general, but it might surprise people that this applies to IPv6 even outside of the “dependencies” you see as part of your code. Many network applications are frontends or backends to something else, and in today’s world, with most things being web applications, this is true for cross-company services too.

When you’re providing a service to one user, but rely on a third party to provide you a service related to that user, it may very well be the case that IPv6 will get in your way. Don’t believe me? Go back and read my story about OVH. What happened there is actually a lot more common than you would think: whether it is payment processors, advertisers, analytics, or other third party backends, it’s not uncommon to assume that you can match the session by the source address and time (although that is always very sketchy, as you may be using Tor, or any other setup where requests to different hosts are routed differently).

Things get even more complicated as time goes by. Let’s take another look at the example of OVH (knowing full well that it was 10 years ago): the problem there was not that the processor didn’t support IPv6 – though it didn’t – the problem was that the communication between OVH (v6) and the payment processor (v4) broke down. It’s perfectly reasonable for the payment processor to request information about the customer that the vendor is sending through, through a back-channel: if the vendor is convinced they’re serving a user in Canada, but the processor is receiving a credit card from Czechia, something smells fishy — and payments are all about risk management after all.

Breaking when using Tor is just as likely, but that can also be perceived as a feature, from the point of view of risk. But when the payment processor cannot understand what the vendor is talking about – because the vendor was talking to you over v6, and passed that information to a processor expecting v4 – you just get a headache, not risk management.

How did this become even more complicated? Well, at least in Europe a lot of payment processors had to implement additional verification through systems such as 3DSecure, Verified By Visa, and whatever Amex calls it. It’s often referred to as Strong Customer Authentication (SCA), and it’s a requirement of the 2nd Payment Service Directive (PSD2), but it has existed for a long while and I remember using it back before World IPv6 Day as well.

With SCA-based systems, a payment processor has pretty much no control on what their full set of dependencies is: each bank provides their own SCA backend, and to the best of my understanding (with the full disclosure that I never worked on payment processing systems), they all just talk to Visa and MasterCard, who then have a registry of which bank’s system to hit to provide further authentication — different banks do this differently, with some risk engine management behind that either approves straight away, or challenges the customer somehow. American Express, as you can imagine, simplifies their own life by being both the network and the issuer.

The Cloud is Vastly V4

This is probably the one place where I’m just as confused as some of the IPv6 enthusiasts. Why is it that, to the best of my knowledge, neither AWS nor Google Cloud provides IPv6 as an option for virtual machines?

If you use “cloud native” solutions, at least on Google Cloud, you do get IPv6, so there’s that. And honestly if you’re going all the way to the cloud, it’s a good design to leverage the provided architecture. But there’s plenty of cases in which you can’t use, say, AppEngine to provide a non-HTTP transport, and having IPv6 available would increase the usability of the protocol.

Now this is interesting because other providers go different ways. Scaleway does indeed provide IPv6 by default (though, not in the best of ways in my experience). It’s actually cheaper to run on IPv6 only — and I guess that if you do use a CDN, you could ask them to provide you a dual-stack frontend while talking to them with an IPv6-only backend, which is very similar to some of the backend networks I have designed in the past, where containers (well before Docker) didn’t actually have IPv4 connectivity out, and they relied on a proxy to provide them with connections to the wide world.

Speaking of CDNs – which are probably not often considered part of Cloud Computing but I will bring them in anyway – I have mused before that it’s funny how a number of websites that use Akamai and other CDNs appear to not support IPv6, despite the fact that the CDNs themselves do provide IPv6-frontend services. I don’t know for sure this is not related to something “silly” such as pricing, but certainly there are more concerns to supporting IPv6 than just “flipping a switch” in the CDN configuration: as I wrote above, there’s definitely full-stack concern with receiving inbound connections coming via IPv6 — even if the service does not need full auditing logs of who’s doing what.

Privacy, Security, and IPv6

If I was to say “IPv6 is a security nightmare”, I’d probably get a divided room — I think there’s a lot of nuance needed to discuss privacy and security about IPv6.

Privacy

First of all, it’s obvious that IPv6 was designed and thought out at a different time than the present, and as such it brought with it some design choices that, looking at them nowadays, look wrong or even laughable. I don’t laugh at them, but I do point out that they were indeed made with a different idea of the world in mind, one that I don’t think is reasonable to keep pining for.

The idea that you can tie an end-user IPv6 address to the physical (MAC) address of an adapter is not something that you would come up with in 2021 — and indeed, IPv6 was retrofitted with at least two proposals for “privacy-preserving” address generation options. After all, the very idea of “fixed” MAC addresses appears to be on the way out — mobile devices started using random MAC addresses tied to specific WiFi networks, to reduce the likelihood of people being tracked between different networks (and thus different locations).

Given that IPv6 is being corrected, you may then expect that the privacy issue is now closed, but I don’t really believe that. The first problem is that, from the point of view of the network administrator, there’s no real way to enforce what the equipment inside the network will do. Let me try to build a strawman for this — but one that I think is fairly reasonable, as a threat model.

While not every small-run manufacturer would go out of their way to be assigned an OUI to give their devices a “branded” MAC address – many of the big ones don’t even do that, and leave the MAC provided by the chipset vendor – there are a few who do. I know we did that at one of my previous customers, where we decided not only to get an OUI to use for setting the MAC addresses of our devices, but also to use it as the serial number of the device itself. And I know we’re not alone.

If some small-run IoT device is shipped with a manufacturer-provided MAC address with their own OUI, it’s likely that the addresses themselves are predictable. They may not quite be sequential, and they probably won’t start from 00:00:01 (they didn’t in our case), but it might be well possible to figure out at least a partial set of addresses that the devices might use.

At that point, if these don’t use a privacy-preserving ephemeral IPv6, it shouldn’t be too hard to “scan” a network for the devices, by calculating the effective IPv6 on the same /64 network as a user request. This is simplified by the fact that, most of the time, ICMPv6 is allowed through firewalls — because some of it is needed for operating IPv6 altogether, and way too often even I have left stuff more open than I would have liked to. A smart gateway would be able to notice this kind of scan, but… I’m not sure how most routers do with things like this, still. (As it turns out, the default UniFi setup at least seems to configure this correctly.)
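
The reason this scan is feasible at all is the old modified EUI-64 scheme that SLAAC used to default to: the interface identifier is derived mechanically from the MAC address. A quick sketch of that derivation (the MAC and prefix below are made up):

```python
# Sketch of the classic modified EUI-64 derivation that makes old-style SLAAC
# addresses predictable: flip the universal/local bit of the MAC and insert
# ff:fe in the middle. The MAC and prefix are made up for illustration.
import ipaddress

def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    octets = [int(x, 16) for x in mac.split(":")]
    octets[0] ^= 0x02                              # flip the universal/local bit
    eui64 = bytes(octets[:3] + [0xFF, 0xFE] + octets[3:])
    iid = int.from_bytes(eui64, "big")
    net = ipaddress.IPv6Network(prefix)
    return ipaddress.IPv6Address(int(net.network_address) | iid)

print(slaac_address("2001:db8:dead:beef::/64", "00:11:22:33:44:55"))
# -> 2001:db8:dead:beef:211:22ff:fe33:4455
```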

There’s another issue — even with privacy extensions, IPv6 addresses are often provided by ISPs in the form of /64 networks. These networks are usually static per subscriber, as if they were static public IPv4 addresses, which again is something a number of geeks would like to have… but it also has the side effect of making it possible to track a whole household in many cases.

This is possibly controversial with folks, because the move from static addresses to dynamic dialup addresses marks the advent of what some annoying greybeards refer to as Eternal September (if you use that term in the comments, be ready to be moderated away, by the way). But with dynamic addresses came some level of privacy. Of course the ISP could always figure out who was using a certain IP address at a given time, but websites wouldn’t be able to keep users tracked day after day.

Note that the dynamic addresses were not meant to address the need for privacy; it was just incidental. And the fact that you would change addresses often enough that websites couldn’t track you was also not by design — it was just expensive to stay connected for long periods of time, and even on flat rates you may have needed the phone line to make calls, or you may just have lost the connection because of… something. The game (eventually) changed with DSLs (and cable and other similar systems), as they didn’t hold the phone line busy and were much more stable, and eventually the usage of always-on routers instead of “modems” connected to a single PC made cycling to a new address a rare occurrence.

Funnily enough, we once again gained some tidbit of privacy in this space with the advent of carrier-grade NAT (CGNAT), which was once again not designed for this at all. But since it concentrates multiple subscribers (sometimes entire neighbourhoods or towns!) into a single outbound IP address, it makes it harder to tell the person (or even the household) accessing a certain website — unless you are the ISP, clearly. This, by the way, is kind of the same principle that certain VPN providers use nowadays to sell their product as a privacy feature; don’t believe them.

This is not really something that the Internet was designed for — protection against tracking didn’t appear to be one of the worries of an academic network that considered the idea that each workstation would be directly accessible to others, and shared among different users. The world we live in, with user devices that are increasingly single-tenant, and that are not meant to be accessible by anyone else on the network, is different from what the original design of this network envisioned. And IPv6 carried on with that design to begin with.

Security

Now on the other hand there’s the problem of actual endpoint security. One of the big issues with network protocols is that firewalls are often designed with one, and only one, protocol in mind. Instead of a strawman, this time let me talk about an episode from my past.

Back when I was in high school, our systems lab was the only laboratory that had a mixed IP (v4, obviously, it was over 16 years ago!) and IPX network, since one of the topics that they were meant to teach us was how to set up a NetWare network (it was a technical high school). All of the computers were set up with Windows 95, with the few security features that it was possible to use, and included a software firewall (I think it was ZoneAlarm, but my memory is fuzzy on this point). While I’m sure that trying to disable it or work around it would have worked just as well, most of us decided to not even try: Unreal Tournament and ZSNES worked over IPX just as well, and ZoneAlarm had no idea of what we were doing.

Now, you may want to point out that obviously you should make sure to secure your systems to work for both IP generations anyway. And you’d be generally right. But given that sometimes systems have been lifted and shifted many times, it might very well be that there’s no certainty that a legacy system (OS image, configuration, practice, network, whatever it is) can be safely deployed on a v6 world. If you’re forced to, you’ll probably invest money and time to make sure that it is the case — if you don’t have an absolute need beside the “it’s the Right Thing To Do”, you most likely will try to live as much as you can without it.

This is why I’m not surprised to hear that for many sysadmins out there, disabling IPv6 is part of the standard operating procedure of setting up a system, whether it is a workstation or a server. This is not helped by the fact that on Linux it’s way too easy to forget that ip6tables is different from iptables (and yes, I know that this is hopefully changing soon).

Software and Hardware Support

This is probably the aspect of the IPv6 fan base where I feel most at home. For operating systems and hardware (particularly network hardware) not to support IPv6 in 2021 feels like we’re being cheated. IPv6 is a great tool for backend networks, as you can avoid a lot of legacy features of IPv4 quickly, and use cascading delegation of prefixes in place of NAT (I’ve done this, multiple times) — so not supporting it at the very basic level is damaging.

Microsoft used to push for IPv6 with a lot of force: among other things, they included Miredo/Teredo in Windows XP (add that to the list of reasons why some professionals still look at IPv6 suspiciously). Unfortunately WSL2 (and as far as I understand HyperV) do not allow using IPv6 for the guests on a Windows 10 workstation. This has gotten in my way a couple of times, because I am otherwise used to just jump around my network with IPv6 addresses (at least before Tailscale).

Similarly, while UniFi works… acceptably well with IPv6, it is still considering it an afterthought, and they are not exactly your average home broadband router either. When even semi-professional network equipment manufacturers can’t get you a good experience out of the box, you do need to start asking yourself some questions.

Indeed, if I could have a v6-only network with NAT64, I might do that. I still believe it is useless and unrealistic, but since I actually do develop software in my spare time, I would like to have a way to test it. It’s the same reason why I own a number of IDN domains. But despite having a lot of options for 802.1x, VLAN segregation, and custom guest hotspots, there’s no trace of NAT64 or other goodies like that.

Indeed, the management software is pretty much only showing you IPv4 addresses for most things, and you need to dig deep to find the correct settings to even allow IPv6 on a network and set it up correctly. Part of the reason is likely that the clients have a lot more weight when it comes to address selection than in v4: while DHCPv6 is a thing, it’s not well supported (still not supported at all on WiFi on Android as far as I know — all thanks to another IPv6 “purist”), and the router advertising and network discovery protocols allow for passive autoconfiguration that, on paper, is so much nicer than the “central authority” of DHCP — but makes it harder to administer “centrally”.

iOS, and Apple products in general, appear to be fond of IPv6. More than Android, for sure. But most of my IoT devices are still unable to work on an IPv6-only network. Even ESPHome, which is otherwise an astounding piece of work, does not appear to provide IPv6 endpoints — and I don’t know how much of that is because the hardware acceleration is limited to v4 structures, and how much of it is just because it would consume more memory in such a small embedded device. The same goes for CircuitPython when using the AirLift FeatherWing.

The folks who gave us the Internet of Things name sold the idea of every device in the world being connected to the Internet through a unique IPv6 address. This is now a nightmare for many security professionals, a wet dream for certain geeks, but most of all an unrealistic situation that I don’t expect will become reality in my lifetime.

Big Names, Small Sites

As I said at the beginning explaining some of the thinking behind World IPv6 Day and World IPv6 Launch, a number of big names, including Facebook and Google, have put their weight behind IPv6 from early on. Indeed, Google keeps statistics of IPv6 usage with per-country split. Obviously these companies, as well as most of the CDNs, and a number of other big players such as Apple and Netflix, have had time, budget, and engineers to be able to deploy IPv6 far and wide.

But as I have ventured before, I don’t think they are enough to make a compelling argument for IPv6-only networks. Even when the adoption of IPv6 in addition to IPv4 might make things more convenient for ISPs, the likelihood of being able to drop IPv4 compatibility tout-court is approximately zero, because the sites people actually need are not going to be available over v6 any time soon.

I’m aware of trackers (for example this one, but I remember seeing more of those) that tend to track the IPv6 deployment for “Alexa Top 500” (and similar league tables) websites. But most of the services that average people care about don’t seem to be usually covered by this.

The argument I made in the linked post boils down to this: the day-to-day of an average person is split between a handful of big-name websites (YouTube, Facebook, Twitter), and a plethora of websites that are anything but global. Irish household providers are never going to make any of the Alexa league tables — and the same is likely true for most other countries that are not China, India, or the United States.

Websites league tables are not usually tracking national services such as ISPs, energy suppliers, mobile providers, banks and other financial institutions, grocery stores, and transport companies. There are other lists that may be more representative here, such as Nielsen Website Ratings, but even those are targeted at selling ad space — and suppliers and banks are not usually interested in that at all.

So instead, I’ve built my own. It’s small, and it mostly only cares about the countries I have experienced directly; it’s IPv6 in Real Life. I’ve tried listing a number of services I’m aware of, and it should give a better idea of why I think the average person is still not using IPv6 at all, except for the big names listed above.

There’s another problem with measuring this when resolving hosts (or even connecting to them — which I’m not actually doing in my case). While this easily covers the “storefront” of each service, many use separate additional hosts for accessing logged-in information, such as account data. I’m covering this by providing a list of “additional hosts” for each main service. But while I can notice where the browser is redirected, I would have to go through the whole network traffic to find all the indirect hosts that each site connects to.

Most services, including the big tech companies, often have separate hosts that they use to process login requests and similar high-stakes forms, rather than using their main domain. Or they may use a different domain for serving static content, maybe even from a CDN. It’s part and parcel of the fact that, for the longest time, we considered hostnames to be a security perimeter. It’s also a side effect of wanting to make it easier to run multiple applications written in widely different technologies — one of my past customers did exactly this using two TLDs: the marketing pages were on a dot-com domain, while the login to the actual application would be on the dot-net one.

Because of this “duality”, and the fact that I’m not really a customer of most of the services I’m tracking, I decided to just look at the “main” domain for them. I guess I could try to aim higher and collect a number of “service domains”, but that would be a point of diminishing returns. I’m going to assume that if the main website (which is likely simpler, or at least with fewer dependencies) does not support IPv6, their service domains don’t, either.
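
For the curious, the kind of check involved is nothing more sophisticated than asking whether the main domain publishes AAAA records; here is a sketch of it in Python (not the actual code behind the page, and the domains are just examples):

```python
# A sketch of the kind of check involved -- not the actual code behind the
# page: a service's "main" domain counts as IPv6-capable if an AAAA lookup
# returns anything. The domains below are just examples.
import socket

def has_ipv6(hostname: str) -> bool:
    try:
        socket.getaddrinfo(hostname, 443, family=socket.AF_INET6,
                           type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        return False

for domain in ("www.google.com", "www.example.net"):
    print(domain, "->", "IPv6" if has_ipv6(domain) else "v4 only")
```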

You may have noticed that in some cases, smaller companies and groups appear to have better IPv6 deployments. This is not surprising: not only can you audit smaller codebases much faster than the extensive web of dependencies of big companies’ applications — but the reality of many small businesses is also that the system and network administrators get a bit more time to learn and apply skills, rather than having to follow a stream of tickets from everyone in the organization who is trying to deploy something, or has a flaky VPN connection.

It also makes it easier for smaller groups to philosophically think of “what’s the Right Thing To Do”, versus the medium-to-big company reality of “what does the business get out of spending time and energy on deploying this?” To be fair, it looks like Apple’s IPv6 requirements might have pushed some of the right buttons for that — except for the part where they are not really requiring the services used by the app to be available on IPv6: it’s acceptable for the app to connect through NAT64 and similar gateways.

Conclusions

I know people paint me as a naysayer — and sometimes I do feel like one. I fear that IPv6 is not going to become the norm during my lifetime, and definitely not during my career. It is the norm to me, because working for big companies you do end up working with IPv6 anyway. But most users won’t have to care for a long while yet.

What I want to point out to the IPv6 enthusiasts out there is that the road to adoption is harsh, and it won’t get better any time soon. Unless some killer application of IPv6 comes out, where supporting v4 is no longer an option, most smaller players won’t bother. It’s a cost to them, not an advantage.

The performance concerns of YouTube, Netflix, or Facebook will not apply to your average European bank. The annoyance of going through CGNAT that Tailscale experiences is not going to be a problem for your smart lightbulb manufacturer who just uses MQTT.

Just saying “It’s the Right Thing To Do” is not going to make it happen. While I applaud those who are actually taking the time to build IPv6-compatible software and hardware, and I think that we actually need more of them taking the pragmatic view of “if not now, when?”, this is going to be a cost. And one that for the most part is not going to benefit the average person in the medium term.

I would like to be wrong, honestly. I would love for next year to magically bring firmware updates for everything I have at home so that it all works with IPv6 — but I don’t think it will. And I don’t think I’ll replace everything I own just because we ran out of address space in IPv4. It would be a horrible waste, to begin with, in the literal sense. The last thing we want is to tell people to throw away anything that does not speak IPv6, as it would just pile up as e-waste.

Instead I wish that more IPv6 enthusiasts would get to carry the torch of IPv6 while understanding that we’ll live with IPv4 for probably the rest of our lives.

Home Automation: Physical Contact

In the previous post on the matter, I described the setup of lighting solutions that we use in the current apartment, as well as the previous apartment, and my mother’s house. Today I want to get into a little bit more detail of how we manage to use all of these lights without relying solely on our phones, or on the voice controlled assistants.

First of all, yes, we do have both Google Assistant and Alexa at home. I only keep Assistant in the office because it’s easiest to disable the mic on it with the physical switch, but otherwise we like the convenience of asking Alexa for the weather, or let Google read Sarah Millican for us. To make things easier to integrate, we also signed up for Nabu Casa to integrate our Home Assistant automations with them.

While this works fairly decently for most default cases, sometimes you don’t want to talk, for instance because your partner is asleep or dozing off, and you still want to control the lights (or anything else) without talking. The phone (with the Home Assistant app) is a good option, but it is often inconvenient, particularly if you’re going around the flat with pocketless clothes.

As it turns out, one of the good things that smart lights and in general IoT home automation bring to the table, is the ability to add buttons, which usually do not need to be wired into anything, and that can be placed just about anywhere. These buttons also generally support more than one action connected to them (such as tap, double-tap, and hold), which should allow providing multiple controls at a single position more easily. But there are many options for buttons, and they are not generally compatible with each other, and I got myself confused for a long while.

So to save the day, Luke suggested to me some time ago that I look into Flic smart buttons, which were actually quite the godsend for us, particularly before we had a Home Assistant set up at all. The way these work is that they are Bluetooth LE devices that can pair with a proprietary Bluetooth LR “hub” (or with your phone). The hub can either connect with a bunch of network services, or work with a variety of local network devices, as well as send arbitrary HTTP requests if you configure it to.

While Flics were our first foray into adding physical controls to our home automation, I’m not entirely sure I would recommend them now. They are quite flexible at first glance, but less than stellar in a more complex environment. For instance, while the Flic Hub can talk directly to LIFX lights on the local network (awesome, no Internet dependency), it doesn’t have as much control over the result: when we used the four LIFX spots in the previous flat’s office, local control was unusable, as nearly every other click would miss one spot, making it nearly impossible to keep them synchronised. Thankfully, LIFX is also available as a “Cloud” integration, which could handle the four lights just fine.

The Flic Hub can talk to a Hue Bridge as well, to toggle lights and select scenes, but this is still not as well integrated as an original Hue Smart Button: the latter can be configured to cycle between scenes on multiple taps, while I could only find ways to turn the light on or off, or to select one scene per action with the default Flic interface.

We also wanted to use Flic buttons to control some of the Home Assistant interactions. While buttons in the app are convenient, and Google Assistant usually understands when we say “to the bedroom”, there are times when we would rather have a faster response than the full Google Assistant round-trip. Unfortunately this is an area where Flic leaves a lot to be desired.

First of all, the Flic Hub does not support IPv6 (surprise), which means I can’t just point it at my Home Assistant hostname and have to use the internal IPv4 address instead. Because of that, it cannot validate the TLS certificate either. Second, Flic does not have a native Home Assistant integration: for both Hue and LIFX you can configure the Hub against a Cloud account or the local bridge, and then configure actions to act on the devices connected to them, but for Home Assistant there is nothing, so the options are limited to setting up manual HTTP requests.

This is where things get fairly annoying. You can either issue a bearer token to log in to Home Assistant, in which case you can configure the Flic to execute a script directly, or you can use a webhook trigger to send the action to Home Assistant and handle it there. The former appears to be slightly more reliable in my experience, although I have not figured out whether it’s the webhook request not being sent by the hub, or Home Assistant taking time to execute the automations attached to it; I should spend more time debugging that, but I have not had the time. Using bearer tokens is cumbersome, though. Part of the problem is that the token itself is an extremely long HTTP header, and while you can add custom headers to requests in the Flic app, the length of this header means you can’t copy it, or even remove it. If you need to replace the token, you have to forget the whole HTTP request integration and create a new one altogether.
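
For reference, these are the two shapes of request involved; a minimal sketch in Python’s requests library (rather than the Flic app’s own UI), where the address, token, script name, and webhook ID are all placeholders of mine:

import requests

HA_HOST = "http://192.168.1.10:8123"  # internal IPv4 address, since the hub can't do IPv6
TOKEN = "placeholder-long-lived-token"  # issued from the Home Assistant profile page

# Option 1: bearer token plus a service call, running a script directly.
requests.post(
    f"{HA_HOST}/api/services/script/turn_on",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"entity_id": "script.office_lights_toggle"},
)

# Option 2: a webhook trigger, with no auth header; the automation attached
# to the webhook on the Home Assistant side decides what actually happens.
requests.post(f"{HA_HOST}/api/webhook/flic-office-double-tap")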

Now, on the bright side, Flic recently announced (by email, but not on their blog) that they launched a Software Development Kit that allows writing custom integrations for the Flic Hub. I have not looked deeply into it, because I have found other solutions that work for me alongside the now mostly redundant Flics, but I would hope it means we’ll have a better Home Assistant integration at some point in the future.

To explain which better alternatives we’re using, I need to point out the obvious one first: the native Hue smart buttons. As I said in the previous post, I did get them for my mother, so that the lights on the staircase turn on and off the same way as they did before I fixed the wiring. We considered getting them here too, but it turns out those buttons are not cheap. Indeed, the main difference between the various buttons we have considered (or tried, or are using) is the price. At the time of writing, the Hue buttons go for around £30, Flics for £20, Aqara (Xiaomi) buttons for around £8, and Ikea ones for £6 (allegedly).

So why not use the cheaper options? Well, the problem is that all of these (on paper) require different bridges. Hue, Aqara and Ikea are all ZigBee based, but they don’t interoperate. They also have different specs and availability. The Aqara buttons can be easily ordered from AliExpress and are significantly cheaper than the equivalent from Hue, but they are also bigger, and of a shape just strange enough to make them awkward to place next to the wallplates with the apartment’s original switches, unlike both Flic and Hue. The Ikea ones are the cheapest, but unless you have a chance to pop into their store, it seems like they won’t be shipping in much of a hurry. As I write this blog post, it’s been nearly three weeks since I ordered them and they still have not shipped, with the original delivery estimate at just over a month — updated before even posting this: Ikea (UK) cancelled my order and the buttons are no longer available online, which meant I also didn’t get the new lights I was waiting for. I will update if any new model becomes available. In the meantime I checked the instructions, and it looks like these buttons only support a single tap action.

This is where things get more interesting thanks to Home Assistant. Electrolama sells a ZigBee stick that is compatible with Home Assistant and can easily integrate with pretty much any ZigBee device, including the Philips Hue lights and the Aqara buttons. And even the Aqara supports tap, double-tap, and hold in the same fashion as Flic, but with a lot less delay and no lost events (again, in my experience). It turned out that, at the end of the day, the answer for us is to use the cheaper buttons from AliExpress and configure those, rather than dealing with Flic. For the moment we have not removed the Flics around the apartment at all; instead we have decided to use them for slightly different purposes, for automations that can afford to take a little longer to react.

Indeed, latency is the biggest problem of using Flic with Home Assistant right now: even when the event is not lost, it can sometimes take a few seconds before it is fully processed, and in that time you have probably gotten annoyed enough to ask a voice assistant instead, which sometimes causes the tap to be registered after the voice request, turning the light back off. The Aqara button, by contrast, is pretty much instantaneous. I’m not entirely sure what’s going on there; it feels like the Flic Hub is “asleep” and can’t send the request fast enough for Home Assistant to register it.

It is very likely that we will be replacing at least a couple of the Flics we already have set up with Aqara buttons when they arrive. They support the same tap/double-tap/hold patterns as the Flic, but with significantly lower latency. They are bigger, though, and the plastic seems very cheap and brittle: I nearly made it impossible to change the battery on my first one, because trying to open the compartment with a 20p coin completely flattened the back!

Once you have working triggers with ZigBee buttons, by the way, connecting more controls becomes decidedly easier. I would honestly consider making a “ZigBee stream deck” to select the right inputs on the TV. Right now we have one Flic that turns on the Portal on double-tap (useful in case one of our mothers is calling), and another that selects the PS4 (tap), Switch (double-tap), or Kodi (hold).

Wiring up automations, and selecting specific scenes, are among the easiest things you can do in Home Assistant, so you get a lot of power for a small investment of time, from my point of view. I’m really happy to have finally set it up just the way I want it. Although it’s now time to consider updating the setup so it no longer assumes that one of us is always at home. You know, with events happening again and the end of lockdown in sight.

What’s Up With CircuitPython in 2021?

Last year, as I started going through some more electronics projects, both artsy and not, I lauded the ease that Circuit Python brings to building firmware for sometimes not-so-simple devices, particularly when combined with Adafruit’s Feather line-up of boards with a fixed, interchangeable interface.

As I noted last month, though, it does feel like the situation didn’t improve over the last year, but rather became messier and harder to work with. Let me dig a bit more into what I think the problem is.

When I worked on my software defined remote control, I stopped at a checkpoint that was usable, but a bit clunky. In that setup I was using a Feather nRF52840, because the M0 alternatives didn’t work out for me, but I was also restarting it before each command was sent, because of… issues.

Indeed, my use case turned out to be fairly niche. On the M0, the problem was that the duty cycle of the carrier wave used to send commands to the Bravia TV was too far off, which I assume was caused by the underpowered CPU running the timers. On the nRF52840, the problem was instead that, if I started both transmitters (one for the Bravia and one for the HDMI switch), one of them would get stuck high.

So what I was setting up the stream to do (which ended up not working due to muEditor acting up) was to test running the same code on three more Feathers: the Feather RP2040, the FeatherS2, and the Feather M4. These are quite a bit more powerful than the others, and I was hoping to find that they would work much better for my use case — but they ended up telling me something entirely different.

The first problem is that the Circuit Python interfaces are, as always, changing. And they change depending on the device. This had hit me before, with the Trinket M0 not implementing time the same way as the Feather M0, but in this case things diverged quite a bit. That meant I couldn’t take the code I was using on the nRF Feather and run it unchanged on the newer Feathers: it worked fine on the M4, needed some tweaking for the FeatherS2, and couldn’t work at all on the RP2040.

Let’s start with the RP2040, the same way I wanted to start the stream. It turns out the RP2040 does not have PulseOut support at all, and this is one of the problems with the current API selection in Circuit Python: what you see, before you look into the details, is that the module (pulseio) is present; but if you break down the features supported inside it, you find that sending a train of pulses with that module is not supported on the RP2040 (at least as of Circuit Python 6.2.0).
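
There is, as far as I can tell, no documented way to ask “can this port actually send a pulse train?”, so the best I can come up with is probing for it at runtime. A minimal sketch of that idea, with the caveat that the exact failure mode is my assumption and differs between ports:

import pulseio

def pulseout_supported(carrier):
    # Returns True if this port can actually construct a PulseOut from the
    # given carrier (a PWMOut on most ports). Depending on the port, the
    # failure shows up as a missing class or as a constructor error.
    try:
        out = pulseio.PulseOut(carrier)
    except (AttributeError, NotImplementedError, ValueError):
        return False
    out.deinit()
    return True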

Now, the good part is that the RP2040 supports Programmable I/O (PIO), which would make things a lot more interesting, because the pulse output could most likely be implemented in the “pioasm” language. But that also requires figuring out a lot more than I originally planned to do: I thought I would at least reuse the standard pulse output for the SIRC transmitter, as that does need a 40 kHz carrier; the harder problem was producing a non-carried signal for the switch (which I feed directly into the output of the IR decoder).

So instead I dropped down to the next one on the list, the FeatherS2, which I was interested in because it also supports WiFi natively, rather than through the AirLift. And that led to an even more annoying discovery: the PulseOut implementation for the ESP32-S2 port doesn’t allow using a PWM carrier object; it needs the raw parameters instead. This is not documented — if you do use the PulseOut interface as documented, you’re presented with an exception stating Port does not accept PWM carrier. Pass a pin, frequency and duty cycle instead. At the time of writing, that particular string only has two hits on Google, both coming from a Stanford course’s material that included the Circuit Python translation files, so hopefully this post will soon be the next hit on the topic.

Unfortunately there is also no documented way to detect whether you need to pass the PWM object or the set of raw parameters, which I solved in pysirc by checking which platform the code is running on:

import sys
import pulseio

# `pin`, `carrier_frequency`, and `duty_cycle` are provided by the caller here.
# Platforms whose PulseOut cannot take a PWMOut carrier object:
_PULSEOUT_NO_CARRIER_PLATFORMS = {"Espressif ESP32-S2"}

if sys.platform in _PULSEOUT_NO_CARRIER_PLATFORMS:
    # ESP32-S2: pass the raw pin/frequency/duty-cycle parameters directly.
    pulseout = pulseio.PulseOut(
        pin=pin, frequency=carrier_frequency, duty_cycle=duty_cycle
    )
else:
    # Documented interface: build a PWM carrier and hand it to PulseOut.
    pwm = pulseio.PWMOut(
        pin, frequency=carrier_frequency, duty_cycle=duty_cycle
    )
    pulseout = pulseio.PulseOut(pwm)

Since this is not documented, there is also no guarantee that it won’t change in the future, which is fairly annoying to me, but not a big deal. It is also not the only place where you end up having to rely on sys.platform for the FeatherS2. The other place is the pin names.

You see, one of the advantages of using Feathers for rapid prototyping, to me, was the ability to draw a PCB for a Feather footprint and know that, as long as I checked that the pins were not used by the combination of wings I needed (for instance, the AirLift that adds WiFi support forces you to avoid a few pins, because it uses them to communicate), I would be able to take the same Circuit Python code and run it on any of the other Feathers without worry.

But this is not the case for the FeatherS2: while obviously the ESP32-S2 does not have the same set of GPIOs as, say, the RP2040, I would have expected the board pin definitions to stay the same — not so. Indeed, while my code originally used board.D5 and board.D6 for the outputs, on the FeatherS2 I had to use board.D20 and board.D21 to make use of the same PCB. And I had to select those by checking sys.platform, because nothing else would tell me.

The worst part of that? These pin definitions conflict between the FeatherS2 and the Feather M4: on the M4, board.D20 is an on-board component, while board.D21 is SDA. On the FeatherS2, SDA is board.D10, and the whole right side of the pins is scrambled, which can cause significant conflicts and annoying debugging sessions if you use the same code on the same PCB with an M4 versus an S2.
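
To make this concrete, this is roughly the shape of the pin selection I ended up with (a sketch of the workaround rather than pysirc’s actual code), using the D20/D21 names I mentioned above:

import sys
import board

# "Espressif ESP32-S2" is the platform string the FeatherS2 reports.
if sys.platform == "Espressif ESP32-S2":
    OUTPUT_PINS = (board.D20, board.D21)  # same physical positions on my PCB
else:
    OUTPUT_PINS = (board.D5, board.D6)    # Feather M4, nRF52840, and friends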

This is what made me really worried about the direction of Circuit Python — when I looked at it last year, it was a very good HAL (Hardware Abstraction Layer) that allowed you to write code that could work on multiple boards. And while some features obviously wouldn’t be available everywhere (such as good floating-point on the M0), those that were followed the same API. Nowadays? Every port appears to have a slightly different API, and different devices with the same form factor need different code. Which is kind of understandable, as the number of supported boards has increased significantly, but it still goes against the direction I was hoping it would take.

Now, if the FeatherS2 had worked for my needs, that would have been good enough. After all, I adapted pysirc to work with the modified API, so I thought I would be ready. But then something else caught my attention: as I said above, the signal sent to the switch must not be modulated on top of a carrier wave; if I did that, the switch wouldn’t recognize the commands. So for that output I want a 100% duty cycle, but the FeatherS2 doesn’t offer that as an option. Nor does it seem to be able to set a 99% duty cycle either. So I found myself looking at a very dirty signal compared to what I wanted.
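
For context, Circuit Python expresses PWM duty cycle as a 16-bit value, so the two settings I’m talking about look something like this (the exact values are mine, not anything pysirc mandates):

# 100% duty cycle: the flat, non-carried signal the switch actually wants.
DUTY_CYCLE_FULL = 0xFFFF

# ~99% duty cycle: the closest compromise, which the M4 accepts but the
# FeatherS2 would not produce for me either.
DUTY_CYCLE_ALMOST_FULL = int(0xFFFF * 0.99)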

[Image: What was supposed to be a continuous 9ms pulse, shown in Saleae’s Logic software.]

Basically, the FeatherS2 is not fit for purpose when it comes to the problem I need solved. So at the end of the day, I ended up just going back to what is, in my opinion, the best implementation for the form factor: the Feather M4. This worked fine for sending the two signals without needing to restart the Feather (unlike on the nRF), which significantly shortened the time needed to send messages to the TV and the switch (yay!), even though it still produces a bit of a “dirty” pulse signal when running a 99% duty cycle at 10 Hz:

[Image: A slightly noisy 9ms pulse followed by a train of noisy pulses.]

But I’m still upset, because I feel Circuit Python is becoming less reliable on the “Write Once, Run Anywhere” promise it was making last year: you really need to tailor the code to a particular implementation, instead of being able to rely on the already existing libraries.

Update 2021-05-26

Scott Shawcroft from Circuit Python got in touch as a follow-up to this post, and it looks like some of the issues I reported here were either already on the radar (PulseOut for rp2040), re-opened following this post (PulseOut for ESP32-S2), or have now been addressed (Feather pin names). A huge thanks to Scott and the whole Circuit Python team!

You can also see here that Scott is well aware of how important it is for all of this to fit together nicely — so hopefully the FeatherS2 is going to be a one-off special case!

Light Nerdery: Evolving Solutions At Homes

Of all topics, I find myself surprised that I can be considered a bit of a lights nerd. Sure, not as much as Technology Connections, but then again, who is nerdier than him on this? The lightbulb moment (see what I did there?) came while I was preparing to write the post that will now come later, and realized that a lot of what I was about to write needed some references to my experiments with early LED lighting, and what do you know? I wrote about it in 2008! And after all, it was a post about new mirror lights that nearly ended up being my last post ever, published just a few hours before my trip to the ICU for pancreatitis.

Basically, this is going to be a “context post” or “setup post”: it likely won’t have a call to action, but it will explain the details of why I made certain tradeoffs, and maybe, if you’re interested in making similar changes to your home, it can give you a few ideas as well.

To set the scene, let me describe the three locations whose light setups I’ll be covering. The first is my mother’s house on the Venice mainland, an ’80s-built semi-detached on two floors. The second is my flat in London, where my now wife moved in. And the third is the current flat we moved into together. You may notice a hole in this list: Dublin. Despite having lived there longer than in London, I don’t have anything to say about its light setup, or about the home in general; looking back, it seems I treated my place in Dublin almost as a temporary hotel room, and spent very little time “making it mine”.

A House Of Horrible Wiring

In that 2008 post I linked above, I complained about how the first LED lights I bought and set up in my bedroom would keep glowing, albeit much dimmed, when turned off. Thanks to the post and to discussions with a number of people with more clue than me at the time, I did eventually find the reason: as in many other places throughout the house, deviators (two-way switches) were used to allow turning the light on and off from different places. In my bedroom’s case, a switch by the door was paired with another on a cord, left by the bed. But instead of interrupting the live of the mains, they interrupted the neutral, which meant that turning the light “off” still allowed enough current to flow between live and ground for the LEDs to stay on.

This turned out to be a much, much bigger deal than just LED lights staying on. Interrupting the neutral is not up to regulation: the lampholder can still be live, so you may get shocked touching its inside even with the switch turned off, but most importantly, it looks like it “stresses” the electronics of a number of CFLs, seriously shortening their lifespan. This was actually something we had noticed and complained about for many years, without ever realising it was connected.

Eventually I fixed the wiring in my bedroom (and removed the deviator I didn’t quite need), but I also found another wiring disaster. The stairwell at my mother’s had two light points, one on the landing and one at the top of the stairs, controlled together by switches at the top and bottom of the staircase. As expected, these interrupted the neutral, but worse, each position was wired into the live of the floor it was closest to, ignoring the separation of the circuits and letting a phase through even when one circuit was turned off. And that explained why no bulb there ever lasted more than a year.

Unfortunately, fixing the deviators there turned out to be pretty much impossible, due to the way the wiring conduits were laid inside the walls. So instead I had to make do with separating the two lights, which was not great, but was acceptable while I was still living with my mother: I would turn on the light upstairs for her (I was usually upstairs working) and she would not need the light downstairs. But I had to come up with a solution when I prepared to leave.

The first solution was adding a PIR (passive infrared) movement sensor at the bottom of the stairs to turn on the light on the landing. That meant that just taking the first step onto the stairs would illuminate enough of the staircase that you could make your way up, and then the timer would turn the light off. This worked well enough for a while, to the point that even our cat learned to use it (you could watch her take a step, wait for the light, then run all the way upstairs).

Then, when I was visiting a couple of years back, I noticed that something wasn’t quite working right with the sensor, so I instead ordered a Philips Hue lightbulb, one of those with BLE in addition to ZigBee, so that it wouldn’t require a bridge. At that point my mother could turn the light on and off with her phone, so that the sensor wasn’t needed anymore (and I removed it from the circuit).

This worked much better for her, but she complained earlier this year that she kept forgetting to turn off the light before getting to bed, and then she’d have to go back to the staircase, as her phone couldn’t reach the bulb otherwise. What would be a minor inconvenience for me and many of the people reading this was a major annoyance for my mother, as she’s getting older, so I solved it by getting her a Hue Bridge and a couple of Smart Buttons: the bridge alone meant her phone no longer needed to reach the bulb directly (the bridge is reachable over WiFi), and the buttons restored a semblance of normality to turning the light on and off, the way she used to before I re-wired the lights.

This is something that Alec from Technology Connections pointed out on Twitter some time ago: smart lights are, at the very least, a way to address bad placement of lights and switches. Indeed, over ten years later, my mother now has normal-acting staircase lights that do not burn out every six months. Thank you, Internet of Things!

While we won’t be visiting my mother any time soon, due to the current pandemic and the fact that she has not been called in for the vaccine yet, once we do I’ll also be replacing the lightbulbs in my old bedroom. My office there is now mostly storage, sadly, but when visiting we stay in my old bedroom, which still has the same lights I wrote about in 2008. A testament to them lasting much longer now that the wiring is right, given that when I lived there I don’t remember going more than a few months without needing to replace a lightbulb somewhere.

I chose the 2008 lights to keep the light to a minimum before going to bed, which I still think is a good idea, but they do make it hard to clean and sort things out. Back then I was toying with the idea of building a controller to turn on only part of the lights; nowadays the readily available answer is to just use smart lights and configure them with separate scenes for bedtime and cleaning time. And buttons and apps mean that, for the most part, there is no need for a separate bedside lamp anymore.

Sharing A Flat Planned For One

When I moved to London, I went looking for an apartment that, as I suggested to the relocation aide, would be “soulless” — I had heard all the jokes about real estate agents describing flats that are falling apart as having “character”, or big design issues as “the soul of the building”, so I wanted to make clear that I was looking for a modern flat, something I would be able to just tweak to my own liking.

I was also looking for an apartment that was meant to be mine, with no plans of sharing. Then I met my (now) wife, and that plan needed some adjustments. One of the things that became obvious early on was that the bedroom lights were not really handy. Like pretty much all the apartments I’ve lived in outside of Italy, that one had GU-10 spotlight holders scattered across the ceiling, and a single switch for them by the door. This worked okay for me alone, as I would just turn the light off and then lie on the bed, but it becomes an awkward dance of elbows when you share a fairly cozy mattress.

So early on I decided to get an IKEA Not floorlamp and a smart light. I settled on the LIFX Mini, because it supported colours (and I thought it would be cool), didn’t need a bridge, and also worked without an Internet connection, over the local network. The only fault in my plan was getting a Not with two arms at first, which meant we turned the wrong light off a few times, until I Sugru’d over the main switch on the cable.

I said “at first” there because we then realised that this type of light is great not only in the bedroom but also in the living room. When watching TV, keeping the ceiling light on would be annoying (the spots were very bright), but turning it entirely off was tiring for the eyes. So we got a second Not lamp and a second LIFX bulb, and reshuffled them a bit, so that the two-arm one moved to the living room, where the additional spotlight is sometimes useful.

This worked very nicely for the most part, to the point that I started considering whether my home office could use some of that too. This was back in the beforetimes, when the office (i.e. the second bedroom) was mostly used to play games in the evening, by my wife when she wasn’t using the laptop, or on the rare occasions I was working from home (Google wouldn’t have approved). That meant that in many cases full light would be annoying, and in other cases very dim light wouldn’t work well either. In particular, for a while I kept my wife company by reading while she played, and I wanted light on the side of my seat, but not as much on hers.

Since the office only had four spots, we decided to buy four LIFX GU-10 lights. I’m not linking to those because the price seems to have gone off the charts and they now cost more than twice what we paid for them! These are full-fledged LIFX bulbs, with WiFi and the whole colour range. Lovely devices, but also fairly bulky and I’ll get back to that later on.

Overall, this type of selective smart lighting significantly improved our enjoyment of the flat, by reducing the small annoyances that presented themselves on a regular basis, like navigating the bedroom in the dark, or having the light too bright to watch TV. So we were looking to replicate that after moving.

No Plan Survives Contact With The New Flat

When we moved to the new flat, we knew that a number of things would have to change. For instance, the new office had six spots rather than four, and the room layout meant that some of the lamp placements would not work as well as they had previously.

Things got even a little more complicated than that: the LIFX GU-10 bulbs are significantly bigger than your average GU-10 spot, so they didn’t actually fit at all in the receptacles this flat uses! That meant we couldn’t use them here, even if we had decided to keep only four of the six.

It’s in this situation that we decided to bite the bullet and order a Philips Hue Bridge for our flat, together with six White Ambiance spots: these are the same size as normal GU-10 spots and fit in the receptacles without issue, so they could be used in the new office. While there is a model supporting colour changes as well as white temperature balance, we never really used colour lighting in the office enough to justify the difference in price (though we did, and still do, rely on the colour temperature selection).

Once you have a bridge, adding more lights becomes cheaper than buying a standalone light, too. So we also added a “reading light” to the bedroom, which is mostly for me to use if I can’t sleep and I don’t want to wake up my wife.

Another thing we would have liked to do was replace the clicky switch for the ensuite with a soft/smart button. The reason was twofold: first, it took us months to get used to the placement of the switch in this flat; second, the switch is so damn noisy when clicking that it kept waking the other up when one of us went to use the toilet overnight. Unfortunately, changing the switch turned out not to be trivial: with our landlord’s permission we checked the switch’s connections, and it is not wired with the correct L/N positions or colours, and I could see a phase on three out of the four positions.

Instead, once we could confirm that the switch did not control the extraction fan, we decided to put three Hue spots in the bathroom, this time not the temperature-controlled ones, but just the dimmable ones. At that point we could keep a single one at 1% overnight and not need to turn anything on or off when using the restroom during the night: it is very dim, so you don’t wake up fully, but you can still see plenty to use the toilet and wash your hands. This, by the way, was an idea that came to me after watching some of BigClive’s videos: a very low level of light makes for a good overnight light that avoids waking yourself up.

To explain just how useful this setup ended up being for us: the ensuite has three spots, one pretty much in the shower box, the other two by the door and in the middle. Overnight, we leave only the shower spot running, at 1%, with the other two off. If we need more light, for instance to brush our teeth, or for me to get my insulin, we have an “on” scene in which all three spots are at around 30%. When we’re taking a shower, we turn only the shower spot up to 100%. And when we’re cleaning the bathroom, we set all three to 100%. At maximum light, the new bulbs are brighter than the old ones were; in the “on” scene we use, they are much less bright, because we don’t really need that much light all the time.

We have also ordered more spots, this time from IKEA, which sells an even cheaper model, although with not as good low-light performance. We could do this because I recently replaced the Hue Bridge with a ZigBee stick on our Home Assistant, and I’ll go into more detail about that in a future post. As I write this, the spots have not arrived yet, but we decided that, now that we’re more likely to be going out again (we both got our first dose of the vaccine, with the second coming soon), it makes sense to have a bit more control over the lights in the entrance.

In particular, the way the entrance is currently set up, we turn on all six spots in the T-shaped hallway every time. When coming back during the day, only one spot is needed to take off shoes and put down keys, as the rest of the flat is very bright and does not need illumination. Similarly, when just going back and forth during the evening, only the two or three spots illuminating the top of the T are needed, while the ones at the door are just waste. Again, smart lights are helpful here to work around inflexible wiring.

Conclusion

I wrote before that I get annoyed at those who think IoT is a waste of time. You can find the IoT naming inane, you can find the whole idea of “smart” lights ridiculous, but however you label it, the ability to spread the control of lightbulbs further than the original wiring intended is a quality-of-life improvement.

Indeed, in the case of my mother’s house, it’s likely that the way we’ll solve the remaining problems with wiring will be by replacing most switches with smart lights and floorlamps.

And while I personally have some reservations about keeping the “off” lightbulbs connected to something, a quick back-of-the-envelope calculation I did some months ago shows that even just being able to automate lights based on presence, or to run them at lower power most of the time, can easily reduce the yearly energy usage considerably (although not quite to the point where buying all new bulbs would save you money, if you already have LED lights).
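
For the curious, this is the shape of that calculation; every number below is a made-up placeholder rather than a measurement from our flat, and it assumes, roughly, that an LED’s power scales with its brightness:

SPOT_WATTS = 5.0       # a typical LED GU-10 spot at full brightness (assumed)
HOURS_PER_DAY = 5.0    # how long the hallway lights end up on in a day (assumed)

# Before: six spots at full power whenever the hallway light is on.
dumb_kwh_per_year = 6 * SPOT_WATTS * HOURS_PER_DAY * 365 / 1000

# After: two spots dimmed to 30% for the same amount of time.
smart_kwh_per_year = 2 * SPOT_WATTS * 0.30 * HOURS_PER_DAY * 365 / 1000

print(f"six spots at 100%: {dumb_kwh_per_year:.1f} kWh/year")
print(f"two spots at 30%:  {smart_kwh_per_year:.1f} kWh/year")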

So once again, and probably not for the last time, I want to tip my hat to Home Assistant, Electrolama, and all the other FLOSS projects that work with people’s wishes, rather than against them, to empower them to have smart home technologies they can control.