Fantasyland: in the world of IPv6 only networks

It seems to be the time of the year when geeks think that IPv6 is perfect, ready to be used, and the best thing after sliced bread (or canned energy drinks). Over on Twitter, someone pointed out to me that FontAwesome (which is used by the Hugo theme I’m using) is not accessible over an IPv6-only network, and as such the design of the site is broken. I’ll leave aside my comments on FontAwesome because they are not relevant to the rant at hand.

You may remember I called IPv6-only networks unrealistic two years ago, and I called IPv6 itself a geeks’ wet dream last year. You should then not be surprised to find me calling this Fantasyland an year later.

First of all, I want to make perfectly clear that I’m not advocating that IPv6 deployment should stop or slow down. I really wish it would be actually faster, for purely selfish reasons I’ll get to later. Unfortunately I had to take a setback when I moved to London, as Hyperoptic does not have IPv6 deployment, at least in my building, yet. But they provide a great service, for a reasonable price, so I have no intention to switch to something like A&A just to get a good IPv6 right now.

$ host hyperoptic.com
hyperoptic.com has address 52.210.255.19
hyperoptic.com has address 52.213.148.25
hyperoptic.com mail is handled by 0 hyperoptic-com.mail.eo.outlook.com.

$ host www.hyperoptic.com
www.hyperoptic.com has address 52.210.255.19
www.hyperoptic.com has address 52.213.148.25

$ host www.virginmedia.com
www.virginmedia.com has address 213.105.9.24

$ host www.bt.co.uk
www.bt.co.uk is an alias for www.bt.com.
www.bt.com has address 193.113.9.162
Host www.bt.com not found: 2(SERVFAIL)

$ host www.sky.com
www.sky.com is an alias for www.sky.com.edgekey.net.
www.sky.com.edgekey.net is an alias for e1264.g.akamaiedge.net.
e1264.g.akamaiedge.net has address 23.214.120.203

$ host www.aaisp.net.uk
www.aaisp.net.uk is an alias for www.aa.net.uk.
www.aa.net.uk has address 81.187.30.68
www.aa.net.uk has address 81.187.30.65
www.aa.net.uk has IPv6 address 2001:8b0:0:30::65
www.aa.net.uk has IPv6 address 2001:8b0:0:30::68

I’ll get back to this later.

IPv6 is great for complex backend systems: each host gets their own uniquely-addressable IP, so you don’t have to bother with jumphosts, proxycommands, and so on so forth. Depending on the complexity of your backend, you can containerize single applications and then have a single address per application. It’s a gorgeous thing. But as you move towards user facing frontends, things get less interesting. You cannot get rid of IPv4 on the serving side of any service, because most of your visitors are likely reaching you over IPv4, and that’s unlikely to change for quite a while longer still.

Of course the IPv4 address exhaustion is a real problem and it’s hitting ISPs all over the world right now. Mobile providers already started deploying networks that only provide users with IPv6 addresses, and then use NAT64 to allow them to connect to the rest of the world. This is not particularly different from using an old-school IPv4 carrier-grade NAT (CGN), which a requirement of DS-Lite, but I’m told it can get better performance and cost less to maintain. It also has the advantage of reducing the number of different network stacks that need to be involved.

And in general, having to deal with CGN and NAT64 add extra work, latency, and in general bad performance to a network, which is why gamers, as an example, tend to prefer having a single-stack network, one way or the other.

$ host store.steampowered.com
store.steampowered.com has address 23.214.51.115

$ host www.gog.com
www.gog.com is an alias for gog.com.edgekey.net.
gog.com.edgekey.net is an alias for e11072.g.akamaiedge.net.
e11072.g.akamaiedge.net has address 2.19.61.131

$ host my.playstation.com
my.playstation.com is an alias for my.playstation.com.edgekey.net.
my.playstation.com.edgekey.net is an alias for e14413.g.akamaiedge.net.
e14413.g.akamaiedge.net has address 23.214.116.40

$ host www.xbox.com
www.xbox.com is an alias for www.xbox.com.akadns.net.
www.xbox.com.akadns.net is an alias for wildcard.xbox.com.edgekey.net.
wildcard.xbox.com.edgekey.net is an alias for e1822.dspb.akamaiedge.net.
e1822.dspb.akamaiedge.net has address 184.28.57.89
e1822.dspb.akamaiedge.net has IPv6 address 2a02:26f0:a1:29e::71e
e1822.dspb.akamaiedge.net has IPv6 address 2a02:26f0:a1:280::71e

$ host www.origin.com
www.origin.com is an alias for ea7.com.edgekey.net.
ea7.com.edgekey.net is an alias for e4894.e12.akamaiedge.net.
e4894.e12.akamaiedge.net has address 2.16.57.118

But multiple other options started spawning around trying to tackle the address exhaustion problem, faster than the deployment of IPv6 is happening. As I already noted above, backend systems, where the end-to-end is under control of a single entity, are perfect soil for IPv6: there’s no need to allocate real IP addresses to these, even when they have to talk over the proper Internet (with proper encryption and access control, goes without saying). So we won’t see more allocations like Xerox’s or Ford’s of whole /8 for backend systems.

$ host www.xerox.com
www.xerox.com is an alias for www.xerox.com.edgekey.net.
www.xerox.com.edgekey.net is an alias for e1142.b.akamaiedge.net.
e1142.b.akamaiedge.net has address 23.214.97.123

$ host www.ford.com
www.ford.com is an alias for www.ford.com.edgekey.net.
www.ford.com.edgekey.net is an alias for e4213.x.akamaiedge.net.
e4213.x.akamaiedge.net has address 104.123.94.235

$ host www.xkcd.com
www.xkcd.com is an alias for xkcd.com.
xkcd.com has address 151.101.0.67
xkcd.com has address 151.101.64.67
xkcd.com has address 151.101.128.67
xkcd.com has address 151.101.192.67
xkcd.com has IPv6 address 2a04:4e42::67
xkcd.com has IPv6 address 2a04:4e42:200::67
xkcd.com has IPv6 address 2a04:4e42:400::67
xkcd.com has IPv6 address 2a04:4e42:600::67
xkcd.com mail is handled by 10 ASPMX.L.GOOGLE.com.
xkcd.com mail is handled by 20 ALT2.ASPMX.L.GOOGLE.com.
xkcd.com mail is handled by 30 ASPMX3.GOOGLEMAIL.com.
xkcd.com mail is handled by 30 ASPMX5.GOOGLEMAIL.com.
xkcd.com mail is handled by 30 ASPMX4.GOOGLEMAIL.com.
xkcd.com mail is handled by 30 ASPMX2.GOOGLEMAIL.com.
xkcd.com mail is handled by 20 ALT1.ASPMX.L.GOOGLE.com.

Another technique that slowed down the exhaustion is SNI. This TLS feature allows to share the same socket for applications having multiple certificates. Similarly to HTTP virtual hosts, that are now what just about everyone uses, SNI allows the same HTTP server instance to deliver secure connections for multiple websites that do not share their certificate. This may sound totally unrelated to IPv6, but before SNI became widely usable (it’s still not supported by very old Android devices, and Windows XP, but both of those are vastly considered irrelevant in 2018), if you needed to provide different certificates, you needed different sockets, and thus different IP addresses. It would not be uncommon for a company to lease a /28 and point it all at the same frontend system just to deliver per-host certificates — one of my old customers did exactly that, until XP became too old to support, after which they declared it so, and migrated all their webapps behind a single IP address with SNI.

Does this mean we should stop caring about the exhaustion? Of course not! But if you are a small(ish) company and you need to focus your efforts to modernize infrastructure, I would not expect you to focus on IPv6 deployment on the frontends. I would rather hope that you’d prioritize TLS (HTTPS) implementation instead, since I would rather not have malware (including but not limited to “coin” miners), to be executed on my computer while I read the news! And that is not simple either.

$ host www.bbc.co.uk
www.bbc.co.uk is an alias for www.bbc.net.uk.
www.bbc.net.uk has address 212.58.246.94
www.bbc.net.uk has address 212.58.244.70

$ host www.theguardian.com  
www.theguardian.com is an alias for guardian.map.fastly.net.
guardian.map.fastly.net has address 151.101.1.111
guardian.map.fastly.net has address 151.101.65.111
guardian.map.fastly.net has address 151.101.129.111
guardian.map.fastly.net has address 151.101.193.111

$ host www.independent.ie
www.independent.ie has address 54.230.14.45
www.independent.ie has address 54.230.14.191
www.independent.ie has address 54.230.14.196
www.independent.ie has address 54.230.14.112
www.independent.ie has address 54.230.14.173
www.independent.ie has address 54.230.14.224
www.independent.ie has address 54.230.14.242
www.independent.ie has address 54.230.14.38

Okay I know these snippets are getting old and probably beating a dead horse. But what I’m trying to bring home here is that there is very little to gain in supporting IPv6 on frontends today, unless you are an enthusiast or a technology company yourself. I work for a company that believes in it and provides tools, data, and its own services over IPv6. But it’s one company. And as a full disclosure, I have no involvement in this particular field whatsoever.

In all of the examples above, which are of course not complete and not statistically meaningful, you can see that there are a few interesting exceptions. In the gaming world, XBox appears to have IPv6 frontends enabled, which is not surprising when you remember that Microsoft even developed one of the first tunnelling protocols to kickstart adoption of IPv6. And of course XKCD, being ran by a technologist and technology enthusiast couldn’t possibly ignore IPv6, but that’s not what the average user needs from their Internet connection.

Of course, your average user spends a lot of time on platforms created and maintained by technology companies, and Facebook is another big player of the IPv6 landscape, so they have been available over it for a long while — though that’s not the case of Twitter. But at the same time, they need their connection to access their bank…

$ host www.chase.com
www.chase.com is an alias for wwwbcchase.gslb.bankone.com.
wwwbcchase.gslb.bankone.com has address 159.53.42.11

$ host www.ulsterbankanytimebanking.ie
www.ulsterbankanytimebanking.ie has address 155.136.22.57

$ host www.barclays.co.uk
www.barclays.co.uk has address 157.83.96.72

$ host www.tescobank.com
www.tescobank.com has address 107.162.133.159

$ host www.metrobank.co.uk
www.metrobank.co.uk has address 94.136.40.82

$ host www.finecobank.com
www.finecobank.com has address 193.193.183.189

$ host www.unicredit.it
www.unicredit.it is an alias for www.unicredit.it-new.gtm.unicreditgroup.eu.
www.unicredit.it-new.gtm.unicreditgroup.eu has address 213.134.65.14

$ host www.aib.ie
www.aib.ie has address 194.69.198.194

to pay their bills…

$ host www.mybills.ie
www.mybills.ie has address 194.125.152.178

$ host www.airtricity.ie
www.airtricity.ie has address 89.185.129.219

$ host www.bordgaisenergy.ie
www.bordgaisenergy.ie has address 212.78.236.235

$ host www.thameswater.co.uk
www.thameswater.co.uk is an alias for aerotwprd.trafficmanager.net.
aerotwprd.trafficmanager.net is an alias for twsecondary.westeurope.cloudapp.azure.com.
twsecondary.westeurope.cloudapp.azure.com has address 52.174.108.182

$ host www.edfenergy.com
www.edfenergy.com has address 162.13.111.217

$ host www.veritasenergia.it
www.veritasenergia.it is an alias for veritasenergia.it.
veritasenergia.it has address 80.86.159.101
veritasenergia.it mail is handled by 10 mail.ascopiave.it.
veritasenergia.it mail is handled by 30 mail3.ascotlc.it.

$ host www.enel.it
www.enel.it is an alias for bdzkx.x.incapdns.net.
bdzkx.x.incapdns.net has address 149.126.74.63

to do shopping…

$ host www.paypal.com
www.paypal.com is an alias for geo.paypal.com.akadns.net.
geo.paypal.com.akadns.net is an alias for hotspot-www.paypal.com.akadns.net.
hotspot-www.paypal.com.akadns.net is an alias for wlb.paypal.com.akadns.net.
wlb.paypal.com.akadns.net is an alias for www.paypal.com.edgekey.net.
www.paypal.com.edgekey.net is an alias for e3694.a.akamaiedge.net.
e3694.a.akamaiedge.net has address 2.19.62.129

$ host www.amazon.com
www.amazon.com is an alias for www.cdn.amazon.com.
www.cdn.amazon.com is an alias for d3ag4hukkh62yn.cloudfront.net.
d3ag4hukkh62yn.cloudfront.net has address 54.230.93.25

$ host www.ebay.com 
www.ebay.com is an alias for slot9428.ebay.com.edgekey.net.
slot9428.ebay.com.edgekey.net is an alias for e9428.b.akamaiedge.net.
e9428.b.akamaiedge.net has address 23.195.141.13

$ host www.marksandspencer.com
www.marksandspencer.com is an alias for prod.mands.com.edgekey.net.
prod.mands.com.edgekey.net is an alias for e2341.x.akamaiedge.net.
e2341.x.akamaiedge.net has address 23.43.77.99

$ host www.tesco.com
www.tesco.com is an alias for www.tesco.com.edgekey.net.
www.tesco.com.edgekey.net is an alias for e2008.x.akamaiedge.net.
e2008.x.akamaiedge.net has address 104.123.91.150

to organize fun with friends…

$ host www.opentable.com
www.opentable.com is an alias for ev-www.opentable.com.edgekey.net.
ev-www.opentable.com.edgekey.net is an alias for e9171.x.akamaiedge.net.
e9171.x.akamaiedge.net has address 84.53.157.26

$ host www.just-eat.co.uk
www.just-eat.co.uk is an alias for 72urm.x.incapdns.net.
72urm.x.incapdns.net has address 149.126.74.216

$ host www.airbnb.com
www.airbnb.com is an alias for cdx.muscache.com.
cdx.muscache.com is an alias for 2-01-57ab-0001.cdx.cedexis.net.
2-01-57ab-0001.cdx.cedexis.net is an alias for evsan.airbnb.com.edgekey.net.
evsan.airbnb.com.edgekey.net is an alias for e864.b.akamaiedge.net.
e864.b.akamaiedge.net has address 173.222.129.25

$ host www.odeon.co.uk
www.odeon.co.uk has address 194.77.82.23

and so on so forth.

This means that for an average user, an IPv6-only network is not feasible at all, and I think the idea that it’s a concept to validate is dangerous.

What it does not mean, is that we should just ignore IPv6 altogether. Instead we should make sure to prioritize it accordingly. We’re in a 2018 in which IoT devices are vastly insecure, so the idea of having a publicly-addressable IP for each of the devices in your home is not just uninteresting, but actively frightening to me. And for the companies that need the adoption, I would hope that the priority right now would be proper security, instead of adding an extra layer that would create more unknowns in their stack (because, and again it’s worth noting, as I had a discussion about this too, it’s not just the network that needs to support IPv6, it’s the full application!). And if that means that non-performance-critical backends are not going to be available over IPv6 this century, so be it.

One remark that I’m sure is going to arrive from at least a part of the readers of this, is that a significant part of the examples I’m giving here appear to all be hosted on Akamai’s content delivery network which, as we can tell from XBox’s website, supports IPv6 frontends. “It’s just a button to press, and you get IPv6, it’s not difficult, they are slackers!” is the follow up I expect. For anyone who has worked in the field long enough, this would be a facepalm.

The fact that your frontend can receive IPv6 connections does not mean that your backends can cope with it. Whether it is for session validation, for fraud detection, or just market analysis, lots of systems need to be able to tell what IP address a connection was coming from. If your backend can’t cope with IPv6 addresses being used, your experience may vary between being unable to buy services and receiving useless security alerts. It’s a full stack world.

IPv6 Horror Story: Telegram

This starts to become an interesting series of blog posts, my ranting about badly implemented IPv6. Some people may suggest that this is because I have a problem with IPv6, but the reality is that I like IPv6 and I want more of it, and properly implemented.

IPv6 enthusiasts often forget that implementing IPv6 is not just a matter of having a routable address, and being able to connect the IPv6 network. Indeed, the network layer is just one of the many layers that need to be updated for your applications to be IPv6-compatible, and that is not even going into the problem of optimizing for IPv6.

Screenshot of Telegram from Android declaring my IP as 10.160.34.127.

To the right you can see the screenshot of Telegram messenger on my personal Android phone – I use Telegram to keep in touch with just a handful of people, of course – once I log in on Telegram from my laptop on the web version.

The text of the message I receive on Android at that login is

Diego Elio,

We detected a login into your account from a new device on 06/08/2017 at 10:46:58 UTC.

Device: Web
Location: Unknown (IP = 10.160.34.127)

If this wasn’t you, you can go to Settings – Privacy and Security – Sessions and terminate that session.

If you think that somebody logged in to your account against your will, you can enable two-step verification in Privacy and Security settings.

Sincerely,
The Telegram Team

This may not sound as obviously wrong to you as it did to me. Never mind the fact that the IP is unknown. 10.160.34.127 is a private network address, as all the IPv4 addresses starting with 10. Of course when I noticed that the first thing I did was checking whether web.telegram.org published a AAAA (IPv6) record, and of course it does.

A quick check using TunnelBear (that does not support IPv6) also shows that the message is correct when logging in from an IPv4 connection instead, showing the public egress IP used by the connection.

I reported this problem to Telegram well over a month ago and it’s still unfixed. I don’t think this count as a huge security gap, as the message still provides some level of information on new login, it does remove the ability to ensure the connection is direct from the egress you expect. Say for instance that your connection is being VPN’d to an adversarial network, you may notice that your connecting IP is appearing as from a different country than you’re in, and you know there’s a problem. When using IPv6, Telegram is removing this validation, because they not only not show you a location, but they give you a clearly wrong IPv4 in place of the actual IPv6.

Since I have no insight of what the Telegram internal infrastructure looks like, I cannot say exactly which part of the pipeline is failing for them. I would expect that this is not a problem with NAT64, because that does not help to receive inbound connections — I find it more likely it’s either the web frontend looking for an IPv4, not finding it, and having some sort of default (maybe to its own IP address) to pass along to the next service, or alternatively the service that is meant to send the message (or possibly the one that tries to geocode the address) that mis-parses the IPv6 into an IPv4, and discard the right address.

Whatever the problem on their side is, it makes a very good example of how IPv6 is not as straightforward to implement as many enthusiasts who have never looked into converting a complicated app stack would think. And documenting this here, is my way to remind people that these things are more complicated than they appear, and that it will take still a significant amount of time to fix all these problems, and until then it’s unreasonable to expect consumers to deal with IPv6-only networks.

Screenshot of Telegram from Android declaring three open sessions for 10.160.34.127.

Update (2017-08-07): a helpful (for once) comment on a reddit thread over this pointed me at the fact that even the Active Sessions page is also showing addresses in the private ten-dot space — as you can see now on the left here. It also shows that my Android session is coming (correctly) from my ISP’s CGNAT endpoint IPv4.

What this tells us is that not only there is a problem with the geocoding of an IP address that Telegram notifies the user on connection, but as long as the connection is proxied over IPv6, the user is not able to tell whether the connection was hijacked or not. If the source IP address and location was important enough to add in the first place, it feels like it should be important enough to get right. Although it may be just wait to be pushed and fixed, as the commenter on Reddit pointed out they could see the right IPv6 addresses nowadays — I can’t.

Also, I can now discard the theory that they may be parsing an IPv6 as if it was IPv4 and mis-rendering it. Of the three connections listed, one is from my office that uses an entirely different IPv6 prefix from the one at my apartment. So there.

The fact that my Android session appears on v4, despite my phone being connected to a working IPv6 network also tells me that the Telegram endpoint for Android (that is, their main endpoint) is not actually compatible with IPv6, which makes it even the more surprising that they would actually publish a record for it for their web endpoint, and show off the amount of missing details in their implementation.

I would also close the update by taking a look at one of the comments from that reddit thread:

In theory, could be that telegram is actually accepting ipv6 on their internet facing port and translating it to v4 inside the network. I’d change my opinion but given the lack of information like the IP address you had on your machine or the range you use for your network, I really don’t see enough to say this is an issue with ipv6 implementation.

Ignore the part where the commenter thinks the fact I did not want to disclose my network setup is any relevant, which appears to imply I’m seeing my own IP space reflected there (I’m not) like it was something that was even possible, and focus on the last part. «I really don’t see enough to say this is an issue with ipv6 implementation.»

This is an issue with Telegram’s IPv6 implementation. Whether they are doing network translation at the edge and then miss a X-Forwarded-For, or one of their systems defaults to a local IP if they can’t parse the remote one (as it’s in a different format), it does not really have to matter. An IPv6 implementation does not stop at the TCP level, it requires changes at application level just fine. I have unfortunately met multiple network people who think that as long as an app connects and serves data, IPv6 is supported — for an user, this is completely bollocks because they want to be able to use the app, use the app correctly, and use the app safely. Right now that does not appear to be the case for Telegram.

IPv6: WordPress has a long way to go, too

I recently complained about Hugo and the fact that it seems like its development was taken over by SEO-types, that changed its defaults to something I’m not comfortable with. In the comments to that post I have let it understood that I’ve been looking into WordPress as an alternative once again.

The reason why I’m looking into WordPress is that I expected it to be a much easier setup, and (assuming I don’t go crazy on the plugins) an easily upgradeable install. Jürgen told me that they now support Markdown, and of course moving to WordPress means I don’t need to keep using Disqus for comments, and I can own my comments again.

The main problem with WordPress, like most PHP apps, is that it requires particular care for it to be set up securely and safely. Luckily I do have some experience with this kind of work, and I thought I might as well share my experience and my setup once I got it running. But here is where things got complicated, to the point I’m not sure if I have any chance of getting this working, so I may have to stick with Hugo for much longer than I was hoping. And almost all of the problems fall back to the issue that my battery of test servers are IPv6-only. But don’t let me get ahead of myself.

After installing, configuring, and getting MySQL, Apache, and PHP-FPM to all work together nicely (which believe me was not obvious), I tried to set up the Akismet plugin, which failed. I ignored that, removed it, and then figured out that there is no setting to enable Markdown at all. Turns out it requires a plugin, which, according again to Jürgen, is the Jetpack plugin from WordPress.com itself.

Unfortunately, I couldn’t get the plugins page to work at all: it would just return an error connecting to WordPress.org. The first problem was that the Plugins page wouldn’t load at all, and a quick tcpdump later told me that WordPress tried connecting to api.wordpress.org. Which despite having eight separate IP addresses to respond from, has no IPv6. Well, that’s okay, I have a TinyProxy running on the host system that I use to fetch distfiles from the “outside world” that is not v6-compatible, so I just need to set this up, right? After all, I was already planning on disallowing direct network access to WordPress, so that’s not a big deal.

Well, the first problem is that the way to set up proxies with WordPress is not documented in the default wp-config.php file. Luckily I found that someone else wrote it down. And that started me on the right direction. Except it was not enough, as the list of plugins and the search page would come up, but they wouldn’t download, with the same error about not being able to establish a (secure) connection to WordPress.org, but only on the Apache error log at first — the page itself would have a debug trace if you ask WP to enable debug reporting.

Quite a bit of debugging later, with tcpdump and editing the source code, I found the problem: some of the requests sent by WordPress target HTTP endpoints, and others (including the downloads, correctly) target HTTPS endpoints. The HTTP endpoints worked fine, but the HTTPS ones failed. And the reason why they failed is that they tried to connect to TinyProxy with TLS. TinyProxy does not support TLS, because it really just performs the minimal amount of work needed of a proxy. And for what it’s worth, in my setup it only allows local connections, so there is no real value in adding TLS to it.

Turns out this bug is only present if PHP does not have curl support, and WordPress fallback to fsockopen. Enabling the curl USE flag for the ebuild was enough to fix the problem, and I reported the bug. I honestly wonder if the Gentoo ebuild should actually force curl on, for WordPress, but I don’t want to go there yet.

By the way, I originally didn’t want to say this on this blog post, but since it effectively went viral, I also found out at that point that the reason why I could get a list of plugins, is that when the connection to api.wordpress.org with HTTPS fails, the code retries explicitly with HTTP. It’s effectively a silent connection downgrade (you’d still find the warning in the log, but nothing would appear like breaking at first). This appears to include the “new version check” of WordPress, which makes it an interesting security issue. I reported it via WordPress Hacker One page before my tweet got viral — and sorry, I didn’t at first realize just how bad that downgrade was.

So now I have an installation of WordPress, mostly secure, able to look for, fetch and install plugins. Let me install JetPack to get that famous Markdown support that is to me a requirement and dealbreaker. For some reason (read: because WordPress is more Open Core than Open Source), it requires activating with a WordPress.com account. That should be easy, yes?

Error Details: The Jetpack server was unable to communicate with your site https:// [OMISSIS] [IXR -32300: transport error: http_request_failed cURL error 7: ]

I hid away the URL for my test server, simply to avoid spammers. The website is public, and it has a valid certificate (thank you Let’s Encrypt), and it is not firewalled or requiring any particular IP to connect to. But, IPv6 only. Which makes it easy for me, as it reduces the amount of scanners and spammers while I try it out, and since I have an IPv6-enabled connection at both home and the office, it makes it easy to test with.

Unfortunately it seems like the WordPress infrastructure not only is not reachable from the IPv6 Internet, but it does not even egress onto it either. Which makes once again the myth of IPv6-only networks infeasible. Contacting WordPress.com on Twitter ended up with them opening a support ticket for me, and a few exchanges and logs later, they confirmed their infrastructure does not support IPv6 and, as they said, «[they] don’t have an estimate on when [they] may».

Where does this leave me? Well, right now I can’t activate the normal Jetpack plugin, but they have a “development version”, which they assert is fully functional for Markdown, and that should let me keep evaluating WordPress without this particular hurdle. Of course this requires more time, and may end up with me hitting other roadblock at this point, I’m not sure. So we’ll see.

Whether I’ll get it to work or not, I will share my configuration files in the near future, because it took me a while to get them set up properly, and some of them are not really explained. You may end up seeing a new restyle of the blog in the next few months. It’s going to bother me a bit, because I usually prefer to kepe the blog the same way for years, instead, but I guess that needs to happen this time around. Also, changing the posts’ paths again means I’ll have to set up another chain of redirects. If I do that, I have a bit of a plan to change the hostname of the blog too.

More on IPv6 feasibility

I didn’t think I would have to go back to the topic of IPv6, particularly after my last rant on the topic. But of course it’s the kind of topic that leads itself to harsh discussions over Twitter, so here I am back again (sigh).

As a possibly usual heads’ up, and to make sure people understand where I’m coming from, it is correct I do not have a network background, and I do not know all the details of IPv6 and the related protocol, but I do know quite a bit about it, have been using it for years, and so my opinion is not one of the lazy sysadmin that sees a task to be done and wants to say there’s no point. Among other things, because I do not like that class of sysadmins any more (I used to). I also seem to have given some people the impression that I am a hater of IPv6. That’s not me, that’s Todd. I have been using IPv6 for a long time, I have IPv6 at home, I set up IPv6 tunnels back in the days of having my own office and contracting out, and I have a number of IPv6-only services (including the Tinderbox.

So with all this on the table, why am I complaining about IPv6 so much? The main reason is that, like I said in the other post, geeks all over the place appear to think that IPv6 is great right now and you can throw away everything else and be done with it right this moment. And I disagree. I think there’s a long way to go, and I also think that this particular attitude will make the way even longer.

I have already covered in the linked post the particular issue of IPv6 having originally be designed for globally identifiable, static IPv6 addresses, and the fact that there have been at least two major RFCs to work around this particular problem. If you have missed that, please go back to the post and read it there because I won’t repeat myself here.

I want instead to focus on why I think IPv6 only is currently infeasible for your average end user, and why NAT (including carrier-grade NAT) is not going away any year now.

First of all, let’s define what an average end user is, because that is often lost to geeks. An average end user does not care what a router does, they barely care what a router is, and a good chunk of them probably still just call them modem, as their only interest is referring to “the device that the ISP gives you to connect to the Internet”. An average user does not care what an IP address is, nor cares how DNS resolution happens. And the reason for all of this is because the end user cares about what they are trying to do. And what they are trying to do is browse the Internet, whether it is the Web as a whole, Facebook, Twitter, YouTube or whatever else. They read and write their mail, they watch Netflix, NowTV and HBO GO. They play games they buy on Steam or EA Origin. They may or may not have a job, and if they do they may or may not care to use a VPN to connect to whatever internal resources they need.

I won’t make any stupid or sexist stereotype example for this, because I have combined my whole family and some of my friends in that definition, and they are all different. They all don’t care about IPv6, IPv4, DSL technologies and the like. They just want an Internet connection, and one that works and is fast. And with “that works” I mean “where they can reach the services they need to complete their task, whichever that task is”.

Right now that is not an IPv6 only network. It may be, in the future, but I won’t hold my breath for a number of reasons, that this is going to happen in the next 10 years, despite the increasing pressure and the increasing growth of IPv6 deployment to end users.

The reason why I say this is that right now, there are plenty of services that can only be reached over IPv4, some of which are “critical” (for some definition of critical of course) to end users, such as Steam. Since the Steam servers are not available over IPv6, the only way you can reach them is either through IPv4 (which will involve some NAT) or NAT64. While the speed of the latter, at least on closed-source proprietary hardware solutions, is getting good enough to be usable, I don’t expect it being widely deployed any time now, as it has the drawback of not working with IP literals. We all hate IP literals, but if you think no company ever issue their VPN instructions with an IP literal in them, you are probably going to be disappointed once you ask around.

There could be an interesting post about this level of “security by obscurity”, but I’ll leave that for later.

No ISP wants to receive calls from their customers that access to a given service is not working for them, particularly when you’re talking about end users that do not want to care about tcpdump and traceroute, and customer support that wouldn’t know how to start debugging that NowTV will send a request to an IPv4 address (literal) before opening the stream, and then continue the streaming over IPv4. Or that Netflix refuse to play any stream if the DNS resolution happens over IPv4 and the stream tries to connect over IPv6.

Which I thought Netflix finally fixed until…

//platform.twitter.com/widgets.js

Now, to be fair, it is true that if you’re using an IPv6 tunnel you are indeed proxying. Before I had DS-Lite at home I was using TunnelBroker and it appeared like I was connecting from Scotland rather than Ireland, and so for a while I unintentionally (but gladly) sidestepped country restrictions. But this also happened a few times on DS-Lite, simply because the GeoIP and WhoIs records didn’t match between the CGNAT and the IPv6 blocks. I can tell you it’s not fun to debug.

The end result is that most customer ISPs will choose to provide a service in such a way that their users feel the least possible inconvenience. Right now that means DS-Lite, which involves a carrier-grade NAT, which is not great, as it is not cheap to run, and it still can cause problems, particularly when users use Torrent or P2P heavily, in which case they can very quickly exhaust the 200-ports forwarding blocks that are allocated for CGNAT. Of course DS-Lite also takes away your public IPv4, which is why I heard a number of geeks complaining loudly about DS-Lite as a deployment option.

Now there is another particular end user, in addition to geeks, that may care about IP addresses: gamers. In particular online gamers (rather than, say, Fallout 4 or Skyrim fans like me). The reason for that is that most of the online games use some level of P2P traffic, and so require you to have a way to receive inbound packets. While it is technically possible to set up IGD-based redirection all the way from the CGNAT address to your PC or console, I do not know how many ISPs implement that correctly. Also, NAT in general introduces risks for latency, and requires more resources on the passing routers, and that is indeed a topic that is close to the heart of gamers. Of course, gamers are not your average end user.

An aside: back in the early days of ADSL in Italy, it was a gaming website, building its own ISP infrastructure, that first introduced Fastpath to the country. Other ISPs did indeed follow, but NGI (the ISP noted above) stayed for a long while a niche ISP focused on the need of gamers over other concerns, including price.

There is one caveat that I have not described yet, but I really should, because it’s one of the first objections I receive every time I speak about the infeasibility of IPv6 only end user connections: the mobile world. T-Mobile in the US, in particular, is known for having deployed IPv6 only 3G/4G mobile networks. There is a big asterisk to put here, though. In the US, and in Italy, and a number of other countries to my knowledge, mobile networks have classically been CGNAT before being v6-only, and with a large amount of filtering in what you can actually connect to, even without considering tethering – this is not always the case for specialised contracts that allow tethering or for “mobile DSL” as they marked it in Italy back in the days – and as such, most of the problems you could face with VPN, v4 literals and other similar limitations of v6-only with NAT64 (or proxies) already applied.

Up to now I have described a number of complexities related to how end users (generalising) don’t care about IPv6. But ISPs do, or they wouldn’t be deploying DS-Lite either. And so do a number of other “actors” in the world. As Thomas pointed out over Twitter, not having to bother with TCP keepalive for making sure a certain connection is being tracked by a router makes mobile devices faster and use less power, as they don’t have to wake up for no reason. Certain ISPs are also facing problems with the scarcity of IPv4 blocks, particularly as they grow. And of course everybody involved in the industry hates pushing around the technical debt of the legacy IPv4 year after year.

So why are we not there yet? In my opinion and experience, it is because the technical debt, albeit massive, is spread around too much: ISPs, application developers, server/service developers, hosting companies, network operators, etc. Very few of them feel enough pain from v4 being around that they want to push hard for IPv6.

A group of companies that did feel a significant amount of that pain organized the World IPv6 Day. In 2011. That’s six years ago. Why was this even needed? The answer is that there were too many unknowns. Because of the way IPv6 is deployed in dual-stack configurations, and the fact that a lot of systems have to deal with addresses, it seems obvious that there is a need to try things out. And while opt-ins are useful, they clearly didn’t stress test enough of the usage surface of end users. Indeed, I stumbled across one such problem back then: when my hosting provider (which was boasting IPv6 readiness) sent me to their bank infrastructure to make a payment, the IP addresses of the two requests didn’t match, and the payment session failed. Interactions are always hard.

A year after the test day, the “Launch” happened, normalizing the idea that services should be available over IPv6. Even though that the case, it took quite a longer while for many services to be normally available over IPv6, and I think, despite being one of the biggest proponents and pushers of IPv6, Microsoft update servers only started providing v6 support by default in the last year or so. Things improved significantly over the past five years, and thanks to the forced push of mobile providers such as T-Mobile, it’s a minority of the connections of my mobile phones that still connect to the v4 world, though there are still enough not to be able to be ignored.

What are the excuse for those? Once upon a time, the answer was “nobody is using IPv6, so we’re not bothering supporting it”. This is getting less and less valid. You can see the Google IPv6 statistics that show an exponential growth of connections coming from IPv6 addresses. My gut feeling is that the wider acceptance of DS-Lite as a bridge solution is driving that – full disclosure: I work for Google, but I have no visibility in that information, so I’m only guessing this out of personal experience and experience gathered before I joined the company – and it’s making that particular excuse pointless.

Unfortunately, there are still “good” excuses. Or at least reasons that is hard to argue with. Sometimes, you cannot enable IPv6 for your web service, even though you have done all your work, because of dependencies that you do not control, for instance external support services such as the payment system in the OVH example above. Sometimes, the problem is to be found in another piece of infrastructure that your service shares with others and that requires to be adapted, as it may have code expecting a valid IPv4 address at some particular place, and an IPv6 would make it choke, say in some log analysis pipeline. Or you may rely on hardware for the network layer that just still does not understand IPv6, and you don’t want to upgrade because you still have not found enough of an upside to you to make the switch.

Or you may be using an hosting provider that insists that giving you a single routable IPv6 is giving you a “/64” (it isn’t — they are giving you a single address in a /64 they control). Any reference to a certain German ISP I had to deal with in the past is not casual at all.

And here is why I think that the debt is currently too spread around. Yes, it is true that mobile phones batteries can be improved thanks to IPv6. But that’s not something your ISP or the random website owner care about – I mean, there are websites so bad that they take megabytes to load a page, that would be even better! – and of course a pure IPv6 without CGNAT is a dream of ISPs all over the world, but it is very unlikely that Steam would care about them.

If we all acted “for the greater good”, we’d all be investing more energy to make sure that v6-only becomes a feasible reality. But in truth, outside of controlled environments, I don’t see that happening any year now as I said. Controlled environments in this case can refer not only to your own personal network, but to situations like T-Mobile’s mobile data network, or an office’s network — after all, it’s unlikely that an office, outside of Sky’s own, would care whether they can connect to NowTV or Steam. Right now, I feel v6-only network (without NAT64 even) are the realm of backend networks. Because you do not need v4 for connecting between backends you control, such as your database or API provider, and if you push your software images over the same backend network, there is no reason why you would even have to hit the public Internet.

I’m not asking to give a pass to anyone who’s not implementing v6 access now, but as I said when commenting on the FOSDEM network, it is not by bothering the end users that you’ll get better v6 support, is by asking the services to be reachable.

To finish off, here’s a few personal musings on the topic, that did not quite fit into the discourse of the post:

  • Some ISPs appear to not have as much IPv4 pressure as others; Telecom Italia still appears to not have reassigned or rerouted the /29 network I used to have routed to my home in Mestre. Indeed, whois information for those IPs still has my old physical address as well as an old (now disconnected) phone number.
  • A number of protocols that provide out-of-band signalling, such as RTSP and RTMP, required changes to be used in IPv6 environments. This means that just rebuilding the applications using them against a C library capable of understanding IPv6 would not be enough.
  • I have read at least one Debian developer in the past giving up on running IPv6 on their web server, because their hosting provider was sometimes unreliable and they had no way to make sure the site was actually correctly reachable at all time; this may sound like a minimal problem, but there is a long tail of websites that are not actually hosted on big service providers.
  • Possibility is not feasibility. Things may be possible, but not really feasible. It’s a subtle distinction but an important one.

The geeks’ wet dreams

As a follow up to my previous rant about FOSDEM I thought I would talk to what I define the “geeks’ wet dream”: IPv6 and public IPv4.

During the whole Twitter storm I had regarding IPv6 and FOSDEM, I said out loud that I think users should not care about IPv6 to begin with, and that IPv4 is not a legacy protocol but a current one. I stand by my description here: you can live your life on the Internet without access to IPv6 but you can’t do that without access to IPv4, at the very least you need a proxy or NAT64 to reach a wide variety of the network.

Having IPv6 everywhere is, for geeks, something important. But at the same time, it’s not really something consumers care about, because of what I just said. Be it for my mother, my brother-in-law or my doctor, having IPv6 access does not give them any new feature they would otherwise not have access to. So while I can also wish for IPv6 to be more readily available, I don’t really have any better excuse for it than making my personal geek life easier by allowing me to SSH to my hosts directly, or being able to access my Transmission UI from somewhere else.

Yes, there is a theoretical advantage for speed and performance, because NAT is not involved, but to be honest for most people that is not what they care about: a 36036 Mbit connection is plenty fast enough even when doing double-NAT, and even Ultra HD, 4K, HDR resolution is well provided by that. You could argue that an even lower latency network may enable more technologies to be available, but that not only sounds to me like a bit of a stretch, it also misses the point once again.

I already liked to Todd’s post, and I don’t need to repeat it here, but if it’s about the technologies that can be enabled, it should be something the service providers should care about. Which by the way is what is happening already: indeed the IPv6 forerunners are the big player companies that effectively are looking for a way to enable better technology. But at the same time, a number of other plans were executed so that better performance can be gained without gating it to the usage of a new protocol that, as we said, really brings nothing to the table.

Indeed, if you look at protocols such as QUIC or HTTP/2, you can notice how these reduce the amount of ports that need to be opened, and that has a lot to do with the double-NAT scenario that is more and more common in homes. Right now I’m technically writing under a three-layers NAT: the carrier-grade NAT used by my ISP for deploying DS-Lite, the NAT issued by my provider-supplied router, and the last NAT which is set up by my own router, running LEDE. I don’t even have a working IPv6 right now for various reasons, and know what? The bottleneck is not the NATs but rather the WiFi.

As I said before, I’m not doing a full Todd and thinking that ignoring IPv6 is a good idea, or that we should not spend any time fixing things that break with it. I just think that we should get a sense of proportion and figuring out what the relative importance of IPv6 is in this world. As I said in the other post, there are plenty of newcomers that do not need to be told to screw themselves if their systems don’t support IPv6.

And honestly the most likely scenario to test for is a dual-stack network in which some of the applications or services don’t work correctly because they misunderstand the system. Like OVH did. So I would have kept the default network dual-stack, and provided a single-stack, NAT64 network as a “perk”, for those who actually do care to test and improve apps and services — and possibly also have a clue not to go and ping a years old bug that was fixed but not fully released.

But there are more reasons why I call these dreams. A lot of the reasoning behind IPv6 appears to be grounded into the idea that geeks want something, and that has to be good for everybody even when they don’t know about it: IPv6, publicly-routed IPv4, static addresses and unfiltered network access. But that is not always the case.

Indeed, if you look even just at IPv6 addressing, and in particular to how stateless addressing works, you can see how there has been at least three different choices at different times for generating them:

  • Modified EUI-64 was the original addressing option, and for a while the only one supported; it uses the MAC address of the network card that receives the IPv6, and that is quite the privacy risk, as it means you can extract an unique identifier of a given machine and identify every single request coming from said machine even when it moves around different IPv6 prefixes.
  • RFC4941 privacy extensions were introduced to address that point. These are usually enabled at some point, but these are not stable: Wikipedia correctly calls these temporary addresses, and are usually fine to provide unique identifiers when connecting to an external service. This makes passive detection of the same machine across network not possible — actually, it makes it impossible to detect a given machine even in the same network, because the address changes over time. This is good on one side, but it means that you do need session cookies to maintain login section active, as you can’t (and you shouldn’t) rely on the persistence of an IPv6 address. It also allows active detection, at least of the presence of a given host within a network, as it does not by default disable the EUI-64 address, just not use it to connect to services.
  • RFC7217 adds another alternative for address selection: it provides a stable-within-a-network address, making it possible to keep long-running connections alive, and at the same time ensuring that at least a simple active probing does not give up the presence of a known machine in the network. For more details, refer to Lubomir Rintel’s post as he went in more details than me on the topic.

Those of you fastest on the uptake will probably notice the connection with all these problems: it all starts by designing the addressing assuming that the most important addressing option is stable and predictable. Which makes perfect sense for servers, and for the geeks who want to run their own home server. But for the average person, these are all additional risks that do not provide any desired additional feature!

There is one more extension to this: static IPv4 addresses suffer from the same problem. If your connection is always coming from the same IPv4 address, it does not matter how private your browser may be, your connections will be very easy to track across servers — passively, even. What is the remaining advantage of a static IP address? Certainly not authentication, as in 2017 you can’t claim ignorance of source address spoofing.

And by the way this is the same reason why providers started issuing dynamic IPv6 prefixes: you don’t want a household (if not a person strictly) to be tied to the same address forever, otherwise passive tracking is effectively a no-op. And yes, this is a pain for the geek in me, but it makes sense.

Static, publicly-routable IP addresses make accessing services running at home much easier, but at the same time puts you at risk. We all have been making about the “Internet of Things”, but at the same time it appears everybody wants to be able to set their own devices to be accessible from the outside, somehow. Even when that is likely the most obvious way for external attackers to access one’s unprotected internal network.

There are of course ways around this that do not require such a publicly routable address, and they are usually more secure. On the other hand, they are not quite a panacea of course. Indeed they effectively require a central call-back server to exist and be accessible, and usually that is tied to a single company, with customise protocols. As far as I know, no open-source such call-back system appears to exist, and that still surprises me.

Conclusions? IPv6, just like static and publicly routable IP addresses, are interesting tools that are very useful to technologists, and it is true that if you consider the original intentions behind the Internet these are pretty basic necessities, but if you think that the world, and thus requirements and importance of features, have not changed, then I’m afraid you may be out of touch.

FOSDEM and the dangerous IPv6-only network (or how to lose the trust of newcomers)

Last year I posted about FOSDEM and the IPv6-only network as a technical solution with no problem: nobody should be running IPv6-only consumer networks, because there is zero advantage to them and lots of disadvantages. This year, despite me being in California and missing FOSDEM, their IPv6-only experiment expanded into a full blown security incident (Archive.is), and I heard about it over Twitter and Facebook.

I have criticized this entrenched decision of providing a default IPv6-only network, and last night (at the time of writing) I ended up on a full-blown Twitter rage against IPv6 in general. The FOSDEM twitter handler stepped in defending their choice, possibly not even reading correctly my article or understanding that the same @flameeyes they have been replying to is the same owner of https://flameeyes.blog/, but that’s possibly beside the point:

//platform.twitter.com/widgets.js

Let me try to be clear: I do know that the dual-stack network is available. Last year it was FOSDEM-legacy, and this year is FOSDEM-ancient. How many people do you expect to connect to a network that is called ancient? Would you really expect that the ancient network is the only one running the dual-stack routing, rather than, say, a compatibility mode 2.4GHz 802.11b? Let’s get back to that later.

What did actually happen, since the FOSDEM page earlier doesn’t make it too clear: somebody decided that FOSDEM is just as good a place as BlackHat, DEFCON or Chaos Computer Congress to run a malicious hotspot on. So they decided to run a network called “FOSDEM FreeWifi by Google”, with a captive portal asking for your Google account address and password. It was clearly a a low-passion effort, as I noticed from the screenshots over twitter, and by what an unnamed source told me:

  • the login screen looked almost original, but asked for both username and password on the same form, Google never does that;
  • the page was obviously insecure;
  • the page was served over 10.0.0.0/8 network.

But while these are clearly signs of phishing for a tech user, and would report “Non Secure” on modern Chrome and Firefox, that does not mean they wouldn’t get a non-expert user. Of course the obvious answer of what I will from now on refer to as geek supremacists is that it’s their own fault if they get owned. Which is effectively what FOSDEM said above, paraphrasing: we know nothing of what happened on that network, go and follow Google’s tips on Gmail security.

Well, let me at least point out to go and grab yourself a FIDO key because it would save your skin in cases like that.

But here is a possible way this can fall short of a nice conference experience: there’s a new person interested in Free Software, who has started using Linux or some other FLOSS software and decided to come to what is ostensibly the biggest FLOSS conference in Europe, and probably still the biggest free (as in gratis) open source conference in the world. They are new to this, new to tech, rather than just Linux, and “OpSec” is an unknown term to them.

They arrive at FOSDEM and they try to connect to the default network with their device, which connects and can browse the Internet, but for some reason half the apps don’t work. They ignored the “ancient” network, because their device is clearly not ancient – whether they missed the communication about what it was, or it used the term dual-stack that they had no understanding of – but they see this Google network, let’s do that, even though it requires login… and now someone has their password.

Now the person or people who have their password may be ethical, and contact HIBP to provide a dump of usernames involved and notify them that their passwords were exposed, but probably they won’t. With hope, they won’t use those passwords for anything nefarious either, but at the same time, there is no guarantee that the FreeWifi people are the only ones having a copy of those passwords, because the first unethical person who noticed this phishing going on would have started a WiFi capture to get the clear-text usernames and passwords, with the certainty that if they would use these, the FreeWifi operators would be the ones taking them blame, oops.

Did I say that all the FOSDEM networks are unencrypted? At least 33c3 tried providing an anonymous 802.1x protected/encrypted connection. But of course for the geek supremacists, it’s your fault if you use anything unencrypted and not use a VPN when connecting to public networks. Go and pay the price of not being a geek!

So let’s go back to our new enthusiastic person. If something does happen to the account, it get compromised, or whatever else, the reaction the operators are expecting is probably one of awe: “Oh! They owned me good! I should learn how not to fall for this again!” — except it is quite more likely that the reaction is going to be of distrust “What jerks! Why did I even go there? No kidding nobody uses Linux.” And now we would have alienated one more person that could have become an open source contributor.

Now I have no doubt that FOSDEM organizers didn’t intend for this malicious network to be set up. But at the same time, they allowed it to happen by being too full of themselves, that by making it difficult to users to use the network, they lead to improvement in apps that would otherwise not support IPv6. That’s what they said on twitter: “we are trying to annoy people”. Great, bug fixes via annoyance I’m sure they work great, in a world of network services, that are not in control of the people using them, even for open source projects! And it sure worked great with the Android bug that, fixed almost a year before, kept receiving “me too” and “why don’t you fix this right now?” because most vendors have not released a new version in time for the following FOSDEM (and now an extra year later, many have not moved on from Android 5 either).

Oh and by the way, the reason why it’s called “ancient” is also to annoy people and force them to re-connect to the non-default network. Because calling it FOSDEM-legacy2017 would have been too friendly and would make less of a statement than “ancient”: look at you, you peasant using an ancient network instead of being the coolest geek on the planet and relying on IPv6!

So yes, if something malicious were to happen, I would blame the FOSDEM organizers for allowing that to happen, and for not even providing a “mea culpa” and admitting that maybe they are stressing this point a bit too much.

To close it off, since I do not want to spend too much time on this post on the technical analysis of IPv6 (I did that last year), I would leave you to Todd Underwood’s words and yes, that is an 11 years old post, which I find still relevant. I’m not quite on the same page as Todd, given how I try hard to use IPv6 and use it for backend servers, but his point, if hyperbolic, should be taken into consideration.

In the land of dynamic DNS

In the previous post I talked about my home network and services, and I pointed out how I ended up writing some code while trying to work around lack of pure-IPv6 hosts in Let’s Encrypt. This is something I did post about on Google+, but that’s not really a reliable place to post this for future reference, so it’s time to write it down here.

In the post I referred to the fact that, up until around April this year, Let’s Encrypt did not support IPv6-only hosts, and since I only have DS-Lite connectivity at home, I wanted that. To be precise, it’s the http authentication that was not supported on IPv6, but the ACME protocol (which Let’s Encrypt designed and implements) supports another authentication method: dns-01, at least as a draft.

Since this is a DNS-based challenge, there is no involvement of IPv4 or IPv6 addresses altogether. Unfortunately, the original client, now called certbot, does not support this type of validation, among other things because it’s bloody complicated. On the bright side, lego (an alternative client written in Go), does support this validation, including a number of DNS provider code supported.

Unfortunately, Afraid.org, which is the dynamic host provider I started using for my home network, is not supported. The main reason is that its API do not allow creating TXT or CNAME records, which are needed for dns-01 validation. I did contact the Afraid.org owner hoping that a non-documented API was found, but I got no answer back.

Gandi, on the other hand, is supported, and is my main DNS provider, so I started looking into that direction. Unlike my previous provider (OVH), Gandi does not appear to provide you any support for delegating to a dynamic host system. So instead I looked for options around it, and I found that Gandi provides some APIs (which, after all, is what lego uses itself.)

I ended up writing two DNS updating tools for that; if nothing else because they are very similar, one for Gandi and one for Afraid.org (the one for Afraid.org was what I started with — at the time I thought that they didn’t have an endpoint to update IPv6 hosts, since the default endpoint was v4 only.) I got clearance to publish it and is now on GitHub, it can work as a framework for any other dynamic host provider, if you feel like writing one, it provides some basic helper methods to figure out the current IPv4 or IPv6 assigned to an interface — while this makes no sense behind NAT, it makes sense with DS-Lite.

But once I got it all up and running I realized something that should have been obvious from the start: Gandi’s API is not great for this use case at all. In the case of Afraid.org and OVH’s protocol, there is a per-host token, usually randomly generated, you deploy that to the host you want to keep up to date, and that’s it, nothing else can be done with that token: it’s a one-way update of the host.

Gandi’s API is designed to be an all-around provisioning API, so it allows executing any operation whatsoever with your token. Including registering or dropping domains, or dropping the whole zone or reconfiguring it. It’s a super-user access token. And it sidesteps the 2-factors authentication that you can set up on the Gandi. If you lose track of this API key, it’s game over.

So at the end of the day, I decided not to use this at all. But since I already wrote the tools, I thought it would be a good idea to leave it to the world. It was also a bit of a nice way for me to start writing some public Go code,

Running services on Virgin Media (Ireland)

Update: just a week after I wrote this down (and barely after I managed to post this), Virgin Media turned off IPv6-PD on the Hub 3.0. I’m now following up with them and currently without working IPv6 (and no public IPv4) at home, which sucks.

I have not spoken much about my network setup since I moved to Dublin, mostly because there isn’t much to speak, although admittedly there are a few interesting things that are quite different from before.

The major one is that my provider (Virgin Media Ireland, formerly known as UPC Ireland) supports native IPv6 connectivity through DS-Lite. For those who are not experts of IPv6 deployments, this means that the network has native IPv6 but loses the public IPv4 addressing: the modem/router gets instead a network-local IPv4 address (usually in the RFC1918 or RFC6598 ranges), and one or more IPv6 prefix delegations from which it provides connectivity to the local network.

This means you lose the ability to port-forward a public IPv4 address to a local host, which many P2P users would be unhappy about, as well as having to deal with one more level of NAT (and that almost always involves rate limiting by the provider on the number of ports that can be opened simultaneously.) On the other hand, it gives you direct, native access to the IPv6 network without taking away (outbound) access to the legacy, IPv4 network, in a much more user-friendly way than useless IPv6-only networks that rely on NAT64. But it also brings a few other challenges with it.

Myself, I actually asked to be opted in the DS-Lite trial when it was still not mandatory. The reason for me is that I don’t really use P2P that much (although a couple of times it was simpler to find a “pirate” copy of a DVD I already own, rather than trying to rip it to watch it now that I have effectively no DVD reader), and so I have very few reasons to need a public IPv4 address. On the other hand, I do have a number of backend-only servers that are only configured over the IPv6 network, and so having native access to the network is preferable. On the other hand I do have sometimes the need to SSH into a local box or HTTP over Transmission or similar software.

Anyway back on my home network, I have a Buffalo router running OpenWRT behind the so-called Virgin Media Hub (which is also the cable modem — and no it’s not more convenient to just get a modem-mode device, because this is EuroDOCSIS which is different from the US version, and Virgin Media does not support it.) And yes this means that IPv4 is actually triple-natted! This device is configured to get a IPv6 prefix delegation from the Hub, and uses that for the local network, as well as the IPv4 NAT.

Note: for this to work your Hub needs to be configured to have DHCPv6 enabled, which may or may not do so by default (mine was enabled, but then a “factory restore” disabled it!) To do so, go to the Hub admin page, login, and under Advanced, DHCP make sure that IPv6 is set to Stateful. That’s it.

There are two main problems that needs to be solved to be able to provide external access to a webapp running on the local server: dynamic addressing and firewalls. These two issues are more intertwined than I would like, making it difficult to explain the solution step by step, so let me first present the problems.

On the machine that needs to serve the web app, the first problem to solve is making sure that it gets at least one stable IPv6 address that can be reached from the outside. It used to be very simple, because except for IPv6 privacy extensions the IPv6 address was stable and calculated based on prefix and hardware (MAC) address. Unfortunately this is not the the case now; RFC7217 provides “Privacy stable addressing”, and NetworkManager implements it. In a relatively-normal situation, these addresses are by all means stable, and you could use them just fine. Except there is a second dynamic issue at hand, at least with my provider: the prefix is not stable, both for the one assigned to the Hub, and for that then delegated to the Buffalo router. Which means the network address that the device gets is closer to random than stable.

While this first part is relatively easy to fix by using one a service that allows you to dynamically update a host name, and indeed this is part of my setup too (I use Afraid.org), it does not solve the next problem, which is to open the firewall to let the connections in. Indeed, firewalls are particularly important on IPv6 networks, where every device would otherwise be connected and visible to the public. Unfortunately unless you connect directly to the Hub, there is no way to tell the device to only allow a given device, no matter what the prefix assigned is. So I started by disabling the IPv6 firewall (since no device is connected to the Hub directly beside the OpenWRT), and rely exclusively on the OpenWRT-provided firewall. This is the first level passed. There is one more.

Since the prefix that the OpenWRT receives as delegation keeps changing, it’s not possible to just state the IPv6 you want to allow access to in the firewall config, as it’ll change every time the prefix changes, even without the privacy mode enabled. But there is a solution: when using stable, not privacy enabled addresses, the suffix of the address is stable, and you can bet that someone already added support in ip6tables to match against a suffix. Unfortunately the OpenWRT UI does not let you set it up, but you can do that from the config file itself.

On the target host, which I’m assuming is using NetworkManager (because if not, you can just let it use the default address and not have to do anything), you have to set this one property:

# nmcli connection show
[take note of the UUID shown in the list]
# nmcli connection modify ${uuid} +ipv6.addr-gen-mode eui64

This re-enables EUI-64 based addressing for IPv6, which is based off the mac address of the card. It’ll change the address (and will require reconfiguration in OpenWRT, too) if you change the network card or its MAC address. But it does the job for me.

From the OpenWRT UI, as I said, there is no way to set the right rule. But you can configure it just fine in the firewall configuration file, /etc/config/firewall:

config rule
        option enabled '1'
        option target 'ACCEPT'
        option name 'My service'
        option family 'ipv6'
        option src 'wan'
        option dest 'lan'
        option dest_ip '::0123:45ff:fe67:89AB/::ffff:ffff:ffff:ffff'

You have to replace ::0123:45ff:fe67:89AB with the correct EUI64 suffix, which includes splicing in ff:fe and flipping one bit. I never remember how to calculate it so I just copy-paste it from the machine as I need it. This should give you a way to punch through all the firealls and get you remote access.

What remains to be solved at this point is having a stable way to contact the service. This is usually easy, as dynamic DNS hosts have existed for over twenty years by now, and indeed the now-notorious (for being at the receiving end of one of the biggest DDoS attacks just a few days ago) Dyn built up their fame. Unfortunately, they appear to have actually vastly dropped the ball when it comes to dynamic DNS hosting, as I couldn’t convince them (at least at the time) to let me update a host with only IPv6. This might be more of a problem with the clients than the service, but it’s still the same. So, as I noted earlier, I ended up using Afraid.org, although it took me a while where to find the right way to update a v6-only host: the default curl command you can find is actually for IPv4 hosts.

Oh yeah there was a last one remaining problem with this, at least when I started looking into fixing this all up: at the time Let’s Encrypt did not support IPv6-only hosts, when it came to validating domains with HTTP requests, so I spent a few weeks fighting and writing tools trying to find a decent way to have a hostname that is both dynamic and allows for DNS-based domain control validation for ACME. I will write about that separately, since it takes us on a tangent that has nothing to do with the actual Virgin Media side of things.

FOSDEM and the unrealistic IPv6-only network

Most of you know FOSDEM already, for those who don’t, it’s the largest Free and Open Source Software focused conference in Europe (if not the world.) If you haven’t been to it I definitely suggest it, particularly because it’s a free admission conference and it always has something interesting to discuss.

Even though there is no ticket and no badge, the conference does have free WiFi Internet access, which is how the number of attendees is usually estimated. In the past few years, their network has also been pushing the envelope on IPv6 support, first providing a dualstack network when IPv6 was fairly rare, and in the recent (three?) years providing an IPv6-only network as the default.

I can see the reason to do this, in the sense that a lot of Free Software developers are physically at the conference, which means they can see their tools suffer in an IPv6 environment and fix them. But at the same time, this has generated lots of complaints about Android not working in this setup. While part of that noise was useful, I got the impression this year that the complaints are repeated only for the sake of complaining.

Full disclosure, of course: I do happen to work for the company behind Android. On the other hand, I don’t work on anything related at all. So this post is as usual my own personal opinion.

The complaints about Android started off quite healthy: devices couldn’t actually connect to an IPv6 dual-stack network, and then they couldn’t connect to a IPv6-only network. Both are valid complaints to begin with, though there is a bit more to it. This year in particular the complaints were not so healthy because current versions of Android (6.0) actually do support IPv6-only networks, though most of the Android devices out there are not running this version, either because they have too old hardware or because the manufacturer has not released a new build yet.

What does tick me though has really nothing to do with Android, but rather with the idea that people have that the current IPv6-only setup used by FOSDEM is a realistic approach to IPv6 networking — it really is not. It is a nice setup to test things out and stress the need for proper support for IPv6 in tools, but it’s very unlikely to be used in production by anybody as is.

The technique used (at least this year) by FOSDEM is NAT64. To oversimplify how this works, it is designed to modify the DNS replies when resolving hostnames so that they always provide an IPv6 address, even though they would only have A records (IPv4 addresses). The IPv6 addresses used would then map back to IPv4, and the edge router would then “translate” between the two connections.

Unlike classic NAT, this technique requires user-space components, as the kernel uses separate stacks for IPv4 and IPv6 which do not allow direct message passing between the two. This makes it complicated and significantly slower (you have to copy the data from kernel to userspace and back all the time), unless you use one of the hardware router that are designed to deal with this (I know both Juniper and Cisco have those.)

NAT64 is a very useful testbed, if your target is figuring out what in your stack is not ready for IPv6. It is not, though, a realistic approach for consumer networks. If your client application does not have IPv6 support, it’ll just fail to connect. If for whatever reason you rely on IPv4 literals, they won’t work. Even worse, if the code allows a connection to be established over IPv6, but relies on IPv4 semantics for things like logging, or (worse) access control, then you now have bugs, crashes or worse, vulnerabilities.

And while fuzzing and stress-testing are great for development environments, they are not good for final users. In the same way -Werror is a great tool to fix your code, but uselessly disrupts your users.

In a similar fashion, while IPv6-only datacenters are not that uncommon – Facebook (the company) talked about them two years ago already – they serve a definite different purpose from a customer network. You don’t want, after all, your database cluster to connect to random external services that you don’t control — and if you do control the services, you just need to make sure they are all available over IPv6. In such a system, having a single stack to worry about simplifies, rather than complicate, things. I do something similar for the server I divide into containers: some of them, that are only backends, get no IPv4 at all, not even in NAT. If they ever have to go fetch something to build on the Internet at large, they go through a proxy instead.

I’m not saying that FOSDEM setting up such a network is not useful. It actually hugely is, as it clearly highlights the problems of applications not supporting IPv6 properly. And for Free Software developers setting up a network like this might indeed be too expensive in time or money, so it is a chance to try things out and iron out bugs. But at the same time it does not reflect a realistic environment. Which is why adding more and more rant on the tracking Android bug (which I’m not even going to link here) is not going to be useful — the limitation was known for a while and has been addressed on newer versions, but it would be useless to try backporting it.

For what it’s worth, what is more likely to happen as IPv6 adoption needs to happen, is that providers will move towards solutions like DS-Lite (nothing to do with Nintendo), which couples native IPv6 with carrier-grade NAT. While this has limitations, depending on the size of the ISP pools, it is still easier to set up than NAT64, and is essentially transparent for customers if their systems don’t support IPv6 at all. My ISP here in Ireland (Virgin Media) already has such a setup.

New devbox running

I announced it in February that Excelsior, which ran the Tinderbox, was no longer at Hurricane Electric. I have also said I’ll start on working on a new generation Tinderbox, and to do that I need a new devbox, as the only three Gentoo systems I have at home are the laptops and my HTPC, not exactly hardware to run compilation all the freaking time.

So after thinking of options, I decided that it was much cheaper to just rent a single dedicated server, rather than a full cabinet, and after asking around for options I settled for Online.net, because of price and recommendation from friends. Unfortunately they do not support Gentoo as an operating system, which makes a few things a bit more complicated. They do provide you with a rescue system, based on Ubuntu, which is enough to do the install, but not everything is easy that way either.

Luckily, most of the configuration (but not all) was stored in Puppet — so I only had to rename the hosts there, changed the MAC addresses for the LAN and WAN interfaces (I use static naming of the interfaces as lan0 and wan0, which makes many other pieces of configuration much easier to deal with), changed the IP addresses, and so on. Unfortunately since I didn’t start setting up that machine through Puppet, it also meant that it did not carry all the information to replicate the system, so it required some iteration and fixing of the configuration. This also means that the next move is going to be easier.

The biggest problem has been setting up correctly the MDRAID partitions, because of GRUB2: if you didn’t know, grub2 has an automagic dependency on mdadm — if you don’t install it it won’t be able to install itself on a RAID device, even though it can detect it; the maintainer refused to add an USE flag for it, so you have to know about it.

Given what can and cannot be autodetected by the kernel, I had to fight a little more than usual and just gave up and rebuilt the two (/boot and / — yes laugh at me but when I installed Excelsior it was the only way to get GRUB2 not to throw up) arrays as metadata 0.90. But the problem was being able to tell what the boot up errors were, as I have no physical access to the device of course.

The Online.net server I rented is a Dell server, that comes with iDRAC for remote management (Dell’s own name for IPMI, essentially), and Online.net allows you to set up connections to through your browser, which is pretty neat — they use a pool of temporary IP addresses and they only authorize your own IP address to connect to them. On the other hand, they do not change the default certificates, which means you end up with the same untrustable Dell certificate every time.

From the iDRAC console you can’t do much, but you can start up the remove, JavaWS-based console, which reminded me of something. Unfortunately the JNLP file that you can download from iDRAC did not work on either Sun, Oracle or IcedTea JREs, segfaulting (no kidding) with an X.509 error log as last output — I seriously thought the problem was with the certificates until I decided to dig deeper and found this set of entries in the JNLP file:

 <resources os="Windows" arch="x86">
   <nativelib href="https://idracip/software/avctKVMIOWin32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin32.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="amd64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
 <resources os="Windows" arch="x86_64">
   <nativelib href="https://idracip/software/avctKVMIOWin64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLWin64.jar" download="eager"/>
 </resources>
  <resources os="Linux" arch="x86">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i386">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i586">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="i686">
    <nativelib href="https://idracip/software/avctKVMIOLinux32.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux32.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="amd64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Linux" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOLinux64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLLinux64.jar" download="eager"/>
  </resources>
  <resources os="Mac OS X" arch="x86_64">
    <nativelib href="https://idracip/software/avctKVMIOMac64.jar" download="eager"/>
   <nativelib href="https://idracip/software/avctVMAPI_DLLMac64.jar" download="eager"/>
  </resources>

Turns out if you remove everything but the Linux/x86_64 option, it does fetch the right jar and execute the right code without segfaulting. Mysteries of Java Web Start I guess.

So after finally getting the system to boot, the next step is setting up networking — as I said I used Puppet to set up the addresses and everything, so I had working IPv4 at boot, but I had to fight a little longer to get IPv6 working. Indeed IPv6 configuration with servers, virtual and dedicated alike, is very much an unsolved problem. Not because there is no solution, but mostly because there are too many solutions — essentially every single hosting provider I ever used had a different way to set up IPv6 (including none at all in one case, so the only option was a tunnel) so it takes some fiddling around to set it up correctly.

To be honest, Online.net has a better set up than OVH or Hetzner, the latter being very flaky, and a more self-service one that Hurricane, which was very flexible, making it very easy to set up, but at the same time required me to just mail them if I wanted to make changes. They document for dibbler, as they rely on DHCPv6 with DUID for delegation — they give you a single /56 v6 net that you can then split up in subnets and delegate independently.

What DHCPv6 in this configuration does not give you is routing — which kinda make sense, as you can use RA (Route Advertisement) for it. Unfortunately at first I could not get it to work. Turns out that, since I use subnets for the containerized network, I enabled IPv6 forwarding, through Puppet of course. Turns out that Linux will ignore Route Advertisement packets when forwarding IPv6 unless you ask it nicely to — by setting accept_ra=2 as well. Yey!

Again this is the kind of problems that finding this information took much longer than it should have been; Linux does not really tell you that it’s ignoring RA packets, and it is by far not obvious that setting one sysctl will disable another — unless you go and look for it.

Luckily this was the last problem I had, after that the server was set up fine and I just had to finish configuring the domain’s zone file, and the reverse DNS and the SPF records… yes this is all the kind of trouble you go through if you don’t just run your whole infrastructure, or use fully cloud — which is why I don’t consider self-hosting a general solution.

What remained is just bits and pieces. The first was me realizing that Puppet does not remove the entries from /etc/fstab by default, so I noticed that the Gentoo default /etc/fstab file still contains the entries for CD-ROM drives as well as /dev/fd0. I don’t remember which was the last computer with a floppy disk drive that I used, let alone owned.

The other fun bit has been setting up the containers themselves — similarly to the server itself, they are set up with Puppet. Since the server used to be running a tinderbox, it used to also host a proper rsync mirror, it was just easier, but I didn’t want to repeat that here, and since I was unable to find a good mirror through mirrorselect (longer story), I configured Puppet to just provide to all the containers with distfiles.gentoo.org as their sync server, which did not work. Turns out that our default mirror address does not have any IPv6 hosts on it ­– when I asked Robin about it, it seems like we just don’t have any IPv6-hosted mirror that can handle that traffic, it is sad.

So anyway, I now have a new devbox and I’m trying to set up the rest of my repositories and access (I have not set up access to Gentoo’s repositories yet which is kind of the point here.) Hopefully this will also lead to more technical blogging in the next few weeks as I’m cutting down on the overwork to relax a bit.