Oh Gosh, Trying to Find a New Email Provider in 2020

In the year 2020, I decided to move out of my GSuite account (née Google Apps for Business), which allowed me to use Gmail for my personal domain, and which I have used for the past ten years or so. It’s not that I have a problem with Gmail (I have worked nearly seven years for Google now, why would it be a problem?) or that I think the service is not up to scratch (as this experience is proving to me, I’d argue that it’s still the best service you can rely upon for small and medium businesses — which is the area I focused on when I ran my own company). It’s just that I’m not a business, and the features that GSuite provides over the free Gmail no longer make up for the services I’m missing.

But I still wanted to be able to use my own domain for my mail, rather than going back to the standard Gmail domain. So I decided to look around, and migrate my mail to another paid, reliable, and possibly better solution. Alas, the results after a week of looking and playing around are not particularly impressive to me.

First of all I discarded, without even looking at it, the option of self-hosting my mail. I don’t have the time, nor the experience, nor the will to have to deal with my own email hosting. It’s a landmine of issues and risks and I don’t intend to accept them. So if you’re about to suggest this, feel free to not comment. I’m not going to entertain those suggestions anyway.

I ended up looking at what people have been suggesting on Twitter a few times and evaluated two options: ProtonMail and FastMail. I ended up finding both lacking. And I think I’m a bit more upset with the former than the latter, for reasons I’ll get to in this (much longer than usual) blog post.

My requirements for a replacement solution were a reliable webmail interface with desktop notifications, a working Android app, and security at login. I was not particularly interested in ProtonMail’s encrypt-and-sign-everything approach, but I could live with it. But I wanted something that wouldn’t risk letting everyone in with just a password, so 2FA was a must for me. I was also hoping to find something that would make it easy to deal with git send-email, but I accepted right away that nothing would be anywhere close to the solution we found with Gmail and GSuite (more on that later.)

Bad 2FA Options For All

So I started by looking at the two-factor authentication options for the two providers. Google being the earliest adopter of the U2F standard means, of course, that this is what I’ve been using, and what I would love to keep using once I replace it. But of the two providers I was considering, only FastMail stated explicitly that it supported U2F. I was told that ProtonMail expects to add support for it this year, but I couldn’t even tell that from their website.

So I tried FastMail first, which has a 30-day free trial. To set up the U2F device, you need to provide a phone number as a recovery option — which gets used for SMS OTP. I don’t like SMS OTP because it’s not really secure (in some countries taking over a phone number is easier than taking over an email address), and because it’s not reliable the moment you don’t have mobile network service. It’s easy to mistake “no access to mobile network” for “no access to Internet” and say that it doesn’t really matter, but there are plenty of places where I would be able to reach the Internet and not receive SMS: planes, tube platforms, the office when I arrived in London, …

But surely U2F is enough, so why am I even bothering to complain about SMS OTP, given that you can disable it once the U2F security key is added? Well, it turns out that when I tried to log in on the Android app, I was just sent an SMS with the OTP to log myself in. Indeed, after I removed the phone number backup option, the Android app threw me a lovely error of «U2F is your only two-step verification method, but this is not supported here.» On Android, which can act as a U2F token.

As I found out afterwards, you can add a TOTP app as well, which solves the issue of logging in on Android without mobile network service, but by that point I had already started looking at ProtonMail, because it was not the best first impression to start with.

ProtonMail and the Bridge of Destiny

ProtonMail does not provide standard IMAP/SMTP access, because of the encryption (that’s the best reason I can get from the documentation; I’m not sure at all what this was all about, but honestly, that’s as far as I care to look into it). If you want to use a “normal” mail agent like Thunderbird, you need to use a piece of software, available to paying customers only, that acts as a “bridge”. As far as I can tell after using it, it appears to be mostly a way to handle the authentication rather than the encryption per se. Indeed, you log into the Bridge software with username, password and OTP, and then it provides localhost-only endpoints for IMAP4 and SMTP, with a generated local password. Neat.

Except it’s only available in Beta for Linux, so instead I ended up running it on Windows at first.

This is an interesting approach. Gmail implemented, many years ago, a new extension to IMAP (and SMTP) that allows using OAuth 2 for logins. This effectively delegates the login action to a browser, rather than executing it inline in the protocol, and as such it allows requesting OTPs, or even using U2F. Thunderbird on Windows works very well with this and even supports U2F out of the box.
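For the curious, the wire-level mechanism (Gmail calls it XOAUTH2) is simple: the access token still has to come from a browser-based OAuth 2 flow, but the SASL exchange itself is just a formatted string. A minimal sketch, where the user and token values are of course placeholders:

```python
import imaplib

def xoauth2_string(user: str, access_token: str) -> str:
    # SASL XOAUTH2 initial client response as documented by Google:
    # "user=<email>\x01auth=Bearer <token>\x01\x01"
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01"

def login_gmail_xoauth2(user: str, access_token: str) -> imaplib.IMAP4_SSL:
    # imaplib base64-encodes the authenticate() payload itself
    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.authenticate("XOAUTH2",
                      lambda _challenge: xoauth2_string(user, access_token).encode())
    return imap
```

The protocol part is trivial; the hard part, as noted, is that the client has to drive a browser to obtain the token in the first place.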

Sidenote: Thunderbird seems to have something silly going on. When you add a new account to it, it has a drop-down box to let you select the authentication method (or go for “Autodetect”). Unfortunately, the drop-down does not have the OAuth2 option at all. Even if you select imap.gmail.com as the server — I know hardcoding is bad, but not allowing it at all sounds worse. But if you cheat and give it 12345 as password, and select password authentication just to go through with adding the account, then you can select OAuth 2 as authentication type and it all works out.

Anyway, neither ProtonMail nor FastMail appear to have implemented this authentication method, despite the fact that, if I understood correctly, it’s supported out of the box by Thunderbird, Apple’s Mail, and a bunch of other mail clients. Indeed, if you want to use IMAP/SMTP with FastMail, they only appear to give you the option of application-specific passwords, which is a shame.

So why did I need IMAP access to begin with? Well, I wanted to import all my mail from Gmail into ProtonMail, and I thought the easiest way to do so would be through Thunderbird, manually copying the folders I needed. That turned out to be a mistake: Thunderbird crashed while trying to copy some of the content over, and I ended up spending more time waiting for it to index things than instructing it on what to do.

Luckily, there are alternative options for this.

Important Importing Tooling

ProtonMail provides another piece of software to paying customers, in addition to the Bridge: an Import Tool. This allows you to log into another IMAP server and copy over its content. I decided to use that to copy my Gmail content over to ProtonMail.

First of all, the tool does not support OAuth2 authentication. To be able to access Gmail or GSuite mailboxes, it needs to use an Application-Specific Password. Annoying but not a dealbreaker for me, since I’m not enrolled in the Advanced Protection Program, which among other things disables “less-secure apps” (i.e. those apps using Application-Specific Passwords). I generated one, logged in, and selected the labels I wanted to copy over, then went to bed, a little, but not much, concerned over the 52-and-counting messages that it said it was failing to import.

I woke up to the tool reporting only 32% of around fifty thousand messages imported. I paused, then resumed, the import, hoping to get it unstuck, and left to play Pokémon with my wife, coming back to a computer stuck at exactly the same point. I tried stopping and closing the Import Tool, but that didn’t work either: it hung. I tried rebooting Windows and it refused to, because my C: drive was full. Huh?

When I went to look into it, I found a 436GB text file: the log from the software. Since the file was too big to open with nearly anything on my computer, I used good old type, and besides the initial part possibly containing useful information, most of the file repeated the same error message about not being able to parse a MIME type, with no message ID or subject attached. Not useful. I had to delete the file, since my system was rejecting writes because of the drive being full, but it also does not bode well for the way the importer is written: clearly there’s no retry limit on some action, no log coalescing, and no safety check to go “Hang on, am I DoSing the local system?”

I went looking for tools I could use to sync IMAP servers manually. I found isync/mbsync, which, as a slight annoyance, is written in C and needs to be built, so it’s not easy to run on Windows where I do have the ProtonMail Bridge, but not something I can’t overcome. When I was looking at the website, it said to check the README for workarounds needed with certain servers. Unfortunately, at the time of writing, the document, in the Compatibility section, refers to “M$ Exchange” — which in 2020 is a very silly, juvenile, and annoying way to refer to what is possibly still the largest enterprise mail server out there. Yes, I am judging a project by its README the way you judge a book by its cover, but I would expect that a project unable to call Microsoft by its name in this day and age is unlikely to have added support for OAuth2 authentication or any of the many extensions that Gmail provides for efficient listing of messages.
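To be fair to the tool itself, its configuration is pleasantly short. A hedged sketch of what pulling a Gmail account into a local Maildir looks like — the account name, paths and password command are all placeholders, and note that newer isync releases rename Master/Slave to Far/Near:

```
IMAPAccount gmail
Host imap.gmail.com
User user@example.com
PassCmd "cat ~/.config/gmail-app-password"
SSLType IMAPS

IMAPStore gmail-remote
Account gmail

MaildirStore local
Path ~/Mail/
Inbox ~/Mail/INBOX

Channel gmail-to-local
Master :gmail-remote:
Slave :local:
Patterns *
Create Slave
```

Pointing the other end at the Bridge’s localhost IMAP endpoint instead of a Maildir would be the obvious next step, but I never got that far.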

I turned to FastMail to see how they implement it: importing Gmail or GSuite content is done directly server-side. They require you to provide OAuth2 access to all your email (but then again, if you’re planning to use them as your service provider, you kind of are doing that already). It does not let you choose which labels to import: it clones everything, even your trash/bin folder, which in my case meant around 180k messages. It took a while, and it ended with the funny result of saying «175,784 of 172,368 messages imported.» Bonus points to FastMail for actually sending the completion note as an email, so that it can be fetched accordingly.

A side effect of FastMail doing the imports server side is that there’s no way for you to transfer ProtonMail boxes to FastMail, or any other equivalent server with server-side import: the Bridge needs to run on your local system for you to authenticate. It’s effectively an additional lock-in.

Instead of insisting on self-hosting options, I honestly feel that the FLOSS activists should maybe invest a little more thought and time on providing ways for average users with average needs to migrate their content, avoiding the lock-in. Because even if the perfect self-hosting email solution is out there, right now trying to migrate to it would be an absolute nightmare and nobody will bother, preferring to stick to their perfectly-working locked-in cloud provider.

Missing Features Mayhem

At that point I was a bit annoyed, but I had no urgency to move the old email away, for now at least. So instead I went on to check how ProtonMail worked as a primary mail interface. I changed the MX records around, set up the various verification methods, and waited. One of the nice things about migrating mail providers is that you end up realizing just how many mailing lists and other subscriptions you keep receiving that you previously just filed away with filters.

I removed a bunch of subscriptions to open source mailing lists for projects I am no longer directly involved in, and unlikely to go back to, and then I started looking at other newsletters and promotions. For at least one of them, I thought I would probably be better served by NewsBlur‘s newsletter-to-RSS interface. As documented in the service itself, the recommended way to use this is to create a filter that takes the incoming newsletter and forwards it to your NewsBlur alias.

And here’s the first ProtonMail feature that I’m missing: there’s no way to set up forwarding filters. This is more than a bit annoying: there was mail coming to my address that I used to forward to my mother (mostly bills related to her house, before I set up a separate domain with multiple aliases that point at our two addresses), and there still are a few messages that come to me only, that I forward to my wife, where using our other alias addresses is not feasible for various reasons.

But forwarding is not the only thing missing. When I looked into ProtonMail’s filter system I found it very lacking. You can’t filter based on an arbitrary header. You cannot filter based on a list-id! Despite the webmail being able to tell that an email came through a mailing list, and providing an explicit Unsubscribe button based on the headers, it neither has a “Filter messages like these” option like Gmail’s, nor a way to select this manually. And that is a lot more annoying.

FastMail, by comparison, provides much more detailed rule support, including the ability to write rules directly in the Sieve language, and it allows forward-and-delete of email as well, which is exactly what the NewsBlur integration needs (although, to note, while you can see the interface for doing that, trial accounts can’t set up forwarding rules!) And yes, the “Add Rule from Message” flow defaults to the list identifier for the messages. Also, to one-up even Gmail on this, you can set those rules from the mobile app as well — and if you think this is not that big of a deal, just think of how much more likely you are to have spare time for this kind of boring task while waiting for your train (if you commute by train, that is).
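The Sieve behind the NewsBlur case is short enough to write by hand, too. A sketch — the list identifier and the alias are made up:

```
if header :contains "list-id" "weekly-newsletter.example.com" {
    redirect "username.secret@newsblur.com";
    discard; # cancels the implicit keep, making this forward-and-delete
}
```

Without the discard, Sieve’s implicit keep would also file a local copy; with it, the message only exists at the forwarding destination.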

In terms of features, it seems like FastMail has the clear upper hand. Even ignoring the calendar provided, it supports the modern “Snooze” concept, letting mail show up later in the day or the week (which is great when, say, you don’t want unread email about your job interviews showing up in your inbox at the office), and it even has the ability to permanently delete messages in certain folders after a certain number of days — just like gmaillabelpurge! I think this last feature is the one that made me realize I really just need to use FastMail.

Sending It All Out

As I said earlier, even before trying to decide which one of the two providers to try, I gave up on the idea of being able to use either of them with git send-email to send kernel patches and similar. Neither of them supports OAuth2 authentication, and I was told there’s no way to set up a “send-only” environment.

My solution to this was to bite the bullet and deal with a real(ish) sendmail implementation again, by using a script that would connect over SSH to one of my servers, and use the postfix instance there (noting that I’m trying to cut down on having to run my own servers). I briefly considered using my HTPC for that, but then I realized that it would require me to put my home IP addresses in the SPF records for my domain, and I didn’t really want to publicise those as much.

But it turned out the information I found was incorrect. FastMail does support SMTP-only Application-Specific Passwords! This is an awesomely secure feature that not even Gmail has right now, and it makes it a breeze to configure Git: the worst that can happen, if the password leaks, is that someone can spoof your email address until you figure it out. That does not mean that it’s safe to share that password around, but it does make it much less risky to keep the password on, say, your laptop.
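The Git side of the configuration then comes down to a few settings. A sketch — the host and port are FastMail’s documented SMTP settings, the username is a placeholder, and the SMTP-only app password is either prompted for or stored via a credential helper:

```ini
[sendemail]
	smtpServer = smtp.fastmail.com
	smtpServerPort = 465
	smtpEncryption = ssl
	smtpUser = user@example.com
```

With that in ~/.gitconfig, git send-email just asks for the app password on first use.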

I would even venture that this is safer than the sendgmail approach that I linked above, as that one requires a token with full mail access, which can easily be abused by an attacker.

Conclusion

So at the end of this whole odyssey, I decided to stick with FastMail.

ProtonMail sounds good on paper, but it gives me the impression of being overengineered in implementation, and not thought out enough in feature design. I cannot otherwise see how so many basic features (forwarding filters, send-only protocol support, a C-S-c shortcut to add a CC line) would be missing. And I’m very surprised about the security angle of the whole service.

FastMail does have some rough edges, particularly in their webapp. Small things, like being able to right-click to get a context menu, would be nice. U2F support is clearly lacking: having it work on their Android app would be a huge step forward for me. And I should point out that FastMail has a much friendlier way to test its service, as the 30-day free option includes nearly all of the functionality and enough space to test an import of the data from a ten-year-old Gmail account.

Why is `git send-email` so awful in 2017?

I set out to send my first Linux kernel patch using my new Dell XPS13, as someone contacted me to ask for help supporting a new it87 chip in the gpio-it87 driver I originally contributed.

Writing the (trivial) patch was easy, since they had some access to the datasheet, but then the problem became figuring out how to send it over to the right mailing list. And that took me significantly more time than it should have, and significantly more time than writing the patch, too.

So why is it that git send-email is still so awful, in 2017?

So the first problem is that the only way you can send these emails is either through a sendmail-compatible interface, which is literally an interface older than me (by two years), or through SMTP directly (this is even older, as RFC 821 is from 1982 — but being a protocol, that I do expect). The SMTP client at least supports TLS, provided you have the right Perl modules installed, and authentication, though it does not support more complex forms of authentication such as Gmail’s XOAUTH2 protocol (ignore the fact it says IMAP; it is meant to apply to both IMAP and SMTP).

Instead, the documented (in the man page) approach for users with Gmail and 2FA enabled – which should be anybody who wants to contribute to the Linux kernel! – is to request an app-specific password and save it through the credential store mechanism. Unfortunately, the default credential store just saves it as unencrypted plaintext. Instead there are a number of credential helpers you can use, either using Gnome Keyring or libsecret, and so on.

Microsoft maintains and releases its own Credential Manager which is designed to support multi-factor login to a number of separate services, including GitHub and BitBucket. Thank you, Microsoft, although it appears to only be available for Windows, sigh!

Unfortunately it does not appear there is a good credential helper for either KWallet or LastPass which would have been interesting — to a point of course. I would probably never give LastPass an app-specific password to my Google account, as it would defeat my point of not keeping that particular password in a password manager.

So I started looking around and found that there is a tool called keyring2 which supposedly has KWallet support, though on Arch Linux it does not appear to be working (the KWallet support, that is; the tool itself appears to work fine with Gnome Keyring). So I checked out the issues: the defaulting to gnome-keyring is known, and there is a feature request for a LastPass backend. That sounds promising, right? Except that the author suggests building it as a separate library, which makes sense to a point. Unfortunately, the implicit reference to their keyrings.alt (which does not appear to support KDE/Plasma) drove me away from the whole thing. Why?

License is indicated in the project metadata (typically one or more of the Trove classifiers). For more details, see this explanation.

And the explanation then says:

I acknowledge that there might be some subtle legal ramifications with not having the license text with the source code. I’ll happily revisit this issue at such a point that legal disputes become a real issue.

Which effectively reads to me as “I know what the right thing to do is, but it cramps my style and I don’t want to do it.” The fact that there have been already people pointing out the problem, and the fact that multiple issues have been reported and then marked as duplicate of this master issue, should speak clearly enough.

In particular, if I wanted to contribute anything to these repositories I would have no hope to do so but in my free time, if I decide to apply for a personal project request, as these projects are likely considered “No License” by the sheer lack of copyright information or licenses.

Now, I know I have not been the best person for this as well. But at least for glucometerutils I have made sure that each file lists its license clearly, and the license is spelt out in the README file too. And I will be correcting some of my past mistakes at some point soon, together with certain other mistakes.

But okay, so this is not a viable option. What else remains to use? Well, turns out that there is an actual FreeDesktop.org specification, or at least a draft, which appears to have been last touched seven years ago, for a common API to share between GNOME and KWallet, and for which there are a few other implementations already out there… but the current KWallet does not support it, and the replacement (KSecretService) appears to be stalled/gone/deprecated. And that effectively means that you can’t use that either.

Now, on Gentoo I know I can use msmtp integrated with KWallet and the sendmail interface, but I’m not sure whether it would work correctly on Arch Linux. After all, I even found out that I needed to install a number of Perl modules manually, because they are not listed in the dependencies, and I don’t think I want to screw with PKGBUILD files if I can avoid it.
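For reference, the msmtp setup I mean is only a handful of lines. A sketch — the kwallet-query invocation is an assumption on my part; any command that prints the app-specific password on standard output works for passwordeval:

```
defaults
auth on
tls on
tls_trust_file /etc/ssl/certs/ca-certificates.crt

account gmail
host smtp.gmail.com
port 587
from user@example.com
user user@example.com
passwordeval "kwallet-query -r gmail-app-password kdewallet"

account default : gmail
```

Then git send-email’s sendmail-compatible path can simply point at the msmtp binary.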

So at the end of the day, why is git send-email so awful? I guess the answer is that in so many years we still don’t have a half-decent, secure replacement for sending email. We need what they would now call “disruptive technology”, akin to how SSH killed Telnet, to bring up a decent way to send email, or at least submit Git patches to the Linux kernel. Sigh.

Update 2020-08-29: if you are reading this to try to make sense of how to use git send-email with Gmail or GSuite, you may want to instead turn to the sendgmail binary released in the gmail-oauth2-tools repository. It’s not great, particularly as the upstream maintainer has been very non-responsive, even when I was a co-worker, and it’s not the easiest thing to set up either (it needs you to have a Google Cloud account and enable the right API key), but it does work. If you feel like forking this, merging the requisite pull requests, and releasing it as its own application, please be my guest. I’m not using Gmail anymore myself, so…

I’ll stick with Thunderbird still

Even though it hasn’t yet been a year since I moved to KDE, after spending a long time with GNOME 2, XFCE and then Cinnamon, over the past month or so I have looked at how much non-KDE software I could ditch this time around.

The first software I ditched was Pidgin — while the default use of GnuTLS caused some trouble, KTP works quite decently. Okay, some features are not fully implemented, but the basic chat works, and that’s enough for me — it’s not like I used much more than that in Pidgin either.

Unfortunately, when I decided yesterday to check whether it was possible to ditch Thunderbird for KMail, things didn’t turn out as nice. Yes, the client has improved a truckload since what we had in KDE 3 times — but no, it didn’t improve enough to make it usable for me.

The obvious zeroth problem is the dependencies: to install KMail you need to build (but don’t need to enable) the “semantic desktop” — that is, Nepomuk and the content indexing. In particular it brings in Soprano and Virtuoso, which have been among the least usable components since KDE4 was launched (at least Strigi is gone with 4.10; we’ll see what the future brings us). So after a night spent rebuilding part of the system to make sure that the flags were enabled and the packages in place, today I could try KMail.

First problem — at the first run it suggested importing data from Thunderbird — unfortunately it got completely stuck there, and after over half an hour it had gone nowhere. No logs, no diagnostics, just stuck. I decided to ignore it and create the account manually. While KMail tried to automatically find which mail servers to use, it failed badly – I guess it tried to look for some _imap._srv.flameeyes.eu or something, which does not exist – even though Thunderbird can correctly guess that my mail servers are Google’s.

Second problem — the wizard does not make it easy to set up a new identity, which makes it tempting to add the accounts manually; but since you get three different entries to add (Identity, Sending account, Receiving account), adding them in the wrong order makes you revisit the settings quite a few times. For the curious, the order is sending, identity, receiving.

Third problem — KMail does not implement the Special Folder extension defined in RFC 6154, which GMail makes good use of (it actually implements it both with the standard extension and their own). This means that KMail will store all messages locally (drafts, sent, trash, …) unless you manually set the folders up. Unlike what somebody has told me, this means that the extension is completely unimplemented, not just partially implemented. I’m not surprised that it’s not implemented, by the way, due to the fact that the folders are declared in two different settings (the identity and the account).
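To show what is being left on the table: this is roughly the shape of a LIST exchange with special-use attributes against Gmail (folder names vary with the account’s language, which is exactly why the machine-readable attributes are useful):

```
C: a1 LIST "" "*" RETURN (SPECIAL-USE)
S: * LIST (\HasNoChildren) "/" "INBOX"
S: * LIST (\HasNoChildren \Drafts) "/" "[Gmail]/Drafts"
S: * LIST (\HasNoChildren \Sent) "/" "[Gmail]/Sent Mail"
S: * LIST (\HasNoChildren \Trash) "/" "[Gmail]/Trash"
S: a1 OK Success
```

A client that honours \Drafts, \Sent and \Trash needs no manual folder configuration at all.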

Fourth problem — speaking of GMail, there is no direct way to handle the “archive” action, which is almost a necessity if you want to use the service. While this started with GMail, and at first was almost exclusive to that particular service, nowadays many other services, including standalone software such as Kerio, provide the same workflow; the folder used for archiving is, once again, provided with the special-use notes discussed earlier. Even though the developers do not use GMail themselves, it feels wrong that it’s not implemented.

Fifth problem — while at it, let’s talk for a moment about the IDLE command implementation (one of the extensions needed for Push IMAP). As Wikipedia says, KMail has implemented support for it since version 4.7 — unfortunately, it does not use it in every case, but only if you disable the “check every X minutes” option — if that is enabled, then the IDLE command is not used. Don’t tell me it’s obvious, because even though it makes sense from some points of view, I wasn’t the only one tricked by it. Especially since I first read that setting as “disable if you only want manual checks for new mail” — Thunderbird indeed uses IDLE even if you set a scheduled check every few minutes.

Sixth problem — there is no whitelist for remote content in HTML emails. GMail, both on the web and in the Android and iOS clients, supports a complete whitelist, separate from everything else. Thunderbird supports a whitelist by adding the sender to the contacts’ list (which is honestly bothersome when adding mailing lists, as in my case). As far as I could tell, there is no way to have such a whitelist in KMail. You either have the protection enabled, or you have it disabled.

The last problem is the trickiest, and it’s hard to tell if it’s a problem at all. When I went to configure the OpenPGP key to use, it wouldn’t show me anything to select at all. I spent the good part of an hour trying to get it to select my key, and it failed badly. Once I installed Kleopatra it worked just fine; on the other hand, Pesa and other devs pointed out that it works for them just fine without Kleopatra installed.

So, what is the resolution at this point, for me? Well, I guess I’ll have to file a few feature requests on KDE’s Bugzilla, if I feel like it, and then I might hope for version 4.11 or 4.12 to have something that is more usable than Thunderbird. As it is, that’s not the case.

There are a bunch of minor nuisances and other things that require me to get used to them, such as the (in my view, too big) folder icons (even if you change the size of the font, the size of the icon does not change), and the placement of the buttons, which required some getting used to on Thunderbird as well. But these are only minor annoyances.

What I really need for KMail to become my client is a tighter integration with GMail. It might not suit the ideals as much as one might prefer, but it is one of the most used email providers in the world nowadays, and it would go a long way for user friendliness to work nicely with it instead of making it difficult.

I think I’ll keep away from Python still

Last night I ended up in Bizarro World, hacking at Jürgen’s gmaillabelpurge (which he actually wrote on my request, thanks once more Jürgen!). Why? Well, the first reason was that I found out that it hasn’t been running for the past two and a half months, because, for whatever reason, the default Python interpreter on the system where it was running was changed from 2.7 to 3.2.

So I first tried to get it to work with Python 3 while keeping it working with Python 2 at the same time; some of the syntax changed ever so slightly and was easy to fix, but the 2to3 script that Python comes with is completely bogus. Among other things, it adds parentheses on all the print calls… which would be correct if it checked that said parentheses weren’t there already. In a script like the one aforementioned, the noise in the output is so high that there is really no signal worth reading.

You might be asking how come I didn’t notice this before. The answer is that I’m an idiot! I found out only yesterday that my firewall configuration was such that postfix was not reachable from the containers within Excelsior, which meant I never got the fcron notifications that the job was failing.

While I wasn’t able to fix the Python 3 compatibility, I was able to at least understand the code a little by reading it, and after remembering something about the IMAP4 specs I read a long time ago, I was able to optimize its execution quite a bit, more than halving the runtime on big folders, like most of the ones I have here, by using batch operations, and peeking, instead of “seeing” the headers. At the end, I spent some three hours on the script, give or take.
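The batching itself is simple enough to sketch: instead of one FETCH per message, collapse the UIDs into an IMAP sequence-set and issue a single FETCH, using BODY.PEEK[HEADER] so the \Seen flag is not touched (the “peeking” versus “seeing” just mentioned). A minimal Python version — the function names are mine, not the script’s:

```python
import imaplib

def uid_set(uids):
    # Collapse a sorted list of UIDs into an IMAP sequence-set,
    # e.g. [1, 2, 3, 7, 9, 10] -> "1:3,7,9:10"
    ranges = []
    start = prev = uids[0]
    for uid in uids[1:]:
        if uid == prev + 1:
            prev = uid
            continue
        ranges.append(f"{start}:{prev}" if start != prev else str(start))
        start = prev = uid
    ranges.append(f"{start}:{prev}" if start != prev else str(start))
    return ",".join(ranges)

def fetch_headers(imap: imaplib.IMAP4, uids):
    # One FETCH for the whole batch; BODY.PEEK[HEADER] reads the
    # headers without setting the \Seen flag on the messages.
    typ, data = imap.uid("FETCH", uid_set(uids), "(BODY.PEEK[HEADER])")
    return data
```

One round-trip per folder instead of one per message is where most of the halved runtime comes from.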

But at the same time, I ended up having to work around limitations in Python’s imaplib (which is still nice to have by default), such as reporting fetched data as an array where each odd entry is a pair of strings (tag and unparsed headers) and each even entry is a string with a closed parenthesis (coming from the tag). Since I wasn’t able to sleep, at 3.30am I started rewriting the script in Perl (which at this point I know much better than I’ll ever know Python, even if I’m a newbie at it); by 5am I had all the features of the original, and I was supporting non-English locales for GMail — remember my old complaint about natural language interfaces? Well, it turns out that the solution is to use the Special-Use Extension for IMAP folders; I don’t remember this explanation page existing when we first worked on that script.

But this entry is about Python and not the script per se (you can find the Perl version on my fork if you want). I have said before that I dislike Python, and my feeling is still unchanged at this point. It is true that the script in Python required no extra dependencies, as the standard library already covered all the bases… but at the same time, that’s about it: the basics are all it has; for anything more complex you still need new modules. Perl modules are generally easier to find, easier to install, and less error-prone — don’t try to argue this; I’ve got a tinderbox that reports Python test errors more often than even Ruby’s (which are many), and most of the time for the same reasons, such as the damn Unicode errors “because LC_ALL=C is not supported”.

I also still hate the fact that Python forces me to indent code to have blocks. Yes, I agree that indented code is much better than non-indented code, but why on earth should the indentation mandate the blocks rather than the other way around? What I usually do in Emacs when I’m moving stuff in and out of loops (which is what I had to do a lot in the script, as I was replacing per-message operations with bulk operations) is basically to add the curly brackets in a different place, then select the region and C-M-\ it — which means it’s re-indented following my brackets’ placement. If I see an indent I don’t expect, it means I made a mistake with the blocks, and I’m quick to fix it.

With Python, I end up having to manage the whitespace to get it to behave as I want, and it’s quite a bit more bothersome, even with the C-c < and C-c > shortcuts in Emacs. I find the whole thing obnoxious. The other problem is that, while Python does provide basic access to a lot more functionality than Perl, its documentation is… spotty at best. In the case of imaplib, for instance, the only real way to know what it’s going to give you is to print the returned value and check against the RFC — and it does not seem to have a half-decent way to return the UIDs without having to parse them yourself. This is simply wrong.

The obvious question for people who know me would be “why did you not write it in Ruby?” — well… recently I’ve started second-guessing my choice of Ruby, at least for simple one-off scripts. For instance, the deptree2dot tool that I wrote for OpenRC – available here – was originally written as a Ruby script… then I converted it into a Perl script half the size and twice the speed. Part of it, I’m sure, is just a matter of age (Perl has been optimized over a long time, much more than Ruby); part of it is due to them being different tools for different targets: Ruby is nowadays mostly a language for long-running software (due to webapps and so on), and it’s much more object-oriented, while Perl keeps a streamlined, top-down execution style…

I do expect to find the time to convert even my scan2pdf script to Perl (funnily enough, gscan2pdf, which inspired it, is written in Perl), although I have no idea yet when… in the meantime, though, I doubt I’ll write many more Ruby scripts for this kind of processing…

Backing up cloud data? Help request.

I’m very fond of backups, after the long series of issues I had before I started doing incremental backups… I still have got some backup DVDs around, some of which are almost unreadable, and at least one that is an xar archive using a compression format that is no longer supported, especially on 64-bit.

Right now, my backups are all managed through rsnapshot, with a few custom scripts on top of it to make sure that if a host is not online, the previous backup is kept. This works almost perfectly, if you exclude the problems with restored files and the fact that a rename causes files to double, as rsnapshot does not really apply any data de-duplication (and fdupes and the like tend to be… a bit too slow to use on 922GB of data).
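The “skip offline hosts” wrapper could be as small as this sketch; the host names, config paths and the ping-based reachability check are all illustrative assumptions, not my actual setup.

```python
import subprocess

# Hypothetical wrapper around rsnapshot: if the host does not answer a
# ping, skip the run entirely so the previous snapshot is kept instead
# of being rotated into an empty one. `pinger` is injectable so the
# decision logic can be tested without a network.
def should_backup(host, pinger=None):
    if pinger is None:
        pinger = lambda h: subprocess.call(
            ["ping", "-c", "1", "-W", "2", h],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0
    return pinger(host)

def backup_command(host, reachable):
    # Only rotate the snapshot when the host is up; per-host config
    # file naming here is made up for the example.
    if not reachable:
        return None
    return ["rsnapshot", "-c", f"/etc/rsnapshot-{host}.conf", "daily"]

print(backup_command("yamato", True))
```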

But there is one problem that rsnapshot does not really solve: backup of cloud data!

Don’t get me wrong: I do back up the (three) remote servers just fine, but this does not cover the data that is present in remote, “cloud” storage, such as the GitHub, Gitorious and BitBucket repositories, or Delicious bookmarks, GMail messages, and so on and so forth.

Cloning the bare repositories and backing those up is relatively trivial: it’s a simple script to write. The problem starts with the less “programmatic” services, such as the aforementioned bookmarks and messages. Especially with GMail: copying the whole 3GB of data from the server each time is unlikely to work well, so it has to be done properly.
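For the repository half, the “simple script” boils down to mirror-cloning once and then just fetching on later runs; this sketch only builds the commands, and the URL and backup directory are invented examples.

```python
import os

# Sketch of a repository-mirroring backup: clone each remote once with
# --mirror, then on subsequent runs only fetch, so the transfer stays
# incremental. The repository list is a made-up example.
REPOS = [
    "git://github.com/example/project.git",
]

def mirror_command(url, basedir="/var/backups/repos"):
    name = os.path.basename(url)          # e.g. "project.git"
    target = os.path.join(basedir, name)
    if os.path.isdir(target):
        # Already mirrored: just update it, pruning deleted refs.
        return ["git", "--git-dir", target, "fetch", "--prune"]
    return ["git", "clone", "--mirror", url, target]

for url in REPOS:
    print(mirror_command(url))
```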

Does anybody have any pointers on the matter? Maybe there’s already a smart backup script, similar to tante’s smart pruning script, that can take care of copying the messages via IMAP, for instance…

GMail backup, got any ideas?

I’ve been using GMail as my main email system for quite a few years now, and it works quite well; most of the time, at least. While there are some possible privacy concerns, I don’t really get most of them (really, any other company hosting my mail would raise similar concerns; I don’t have the time to manage my own server, and the end result is: if you don’t want anybody but me to read your mail, encrypt it!). Most of my problems with GMail are pretty technical.

For instance, for a while I struggled with cleaning up old cruft from the server: removing old mailing list messages, old promotional messages, or the announcements coming from services like Facebook, WordPress.com and similar. Thankfully, Jürgen wrote a script for me that takes care of the task. Now you might wonder why there is any need for a custom script to handle deletion of mail on GMail, given it uses the IMAP protocol… the answer is that even though it exposes an IMAP interface, the way it stores the messages makes it impossible to use the standard tools. The usual deletion scripts you may find for IMAP mailboxes set the deleted flag and then expunge the folder… but on GMail that’s just going to archive the messages; you have to move the messages to the Trash folder… whose path depends on the language you set in GMail’s interface.
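The “delete really means move to Trash” dance could look like this minimal sketch using imaplib-style calls; the `gmail_delete` helper is hypothetical, and the Trash folder name is a parameter precisely because, as noted, it depends on the account’s language.

```python
# Sketch of deleting mail for real on GMail's IMAP interface: setting
# \Deleted and expunging only archives a message, so the message is
# first copied into the Trash folder, then removed from the current
# label. The trash folder name varies with the interface language,
# hence the parameter instead of a constant.
def gmail_delete(conn, folder, msg_set, trash="[Gmail]/Trash"):
    conn.select(folder)
    conn.copy(msg_set, trash)                 # land a copy in Trash...
    conn.store(msg_set, "+FLAGS", r"(\Deleted)")
    conn.expunge()                            # ...and drop the label copy

# With a real connection this would be driven as:
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.gmail.com")
#   conn.login(user, password)
#   gmail_delete(conn, "lists/gentoo-dev", "1:200")
```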

Now the same applies to the problem of backup: while I trust GMail will be available for a long time, I don’t like the idea of having no control whatsoever over the backup of my mail. I mean a complete backup — a backup of the almost 3GB of mail that GMail is currently storing for me. Using standard IMAP backup software would probably require something like 6GB of storage, and of transfer as well! The problem is that all the content available under the “Labels” (which are most definitely not folders) is duplicated in the “All Mail” folder.

A properly GMail-aware tool would only fetch the content of the messages from the “All Mail” folder, and then just use the message IDs to file the messages under the correct labels. Even better software would allow me to convert the backed-up mess into something that can be served properly by an email client or an IMAP server, using Maildir structures.

It goes without saying that it should work incrementally: if I run it daily, I don’t want it to fetch 3GB of data every time; I can bear with that the first time, but that’s about it. And it should also rate-limit itself to avoid triggering GMail’s lockdown for possible abuse!
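The core of such a tool — fetch each body exactly once from “All Mail”, keyed by Message-ID, and store labels as mere references — can be sketched as a pure planning function; `plan_fetch` and all the data here are hypothetical names invented for the example.

```python
# Sketch of the deduplicating, incremental backup logic: bodies come
# only from "All Mail", labels are recorded as lists of Message-IDs,
# and `already_stored` models the local archive from previous runs, so
# each incremental run fetches only what is new.
def plan_fetch(all_mail_ids, label_map, already_stored):
    to_fetch = [mid for mid in all_mail_ids if mid not in already_stored]
    # Labels never trigger a second download; they only record IDs.
    index = {label: list(mids) for label, mids in label_map.items()}
    return to_fetch, index

all_mail = ["<a@x>", "<b@x>", "<c@x>"]
labels = {"lists": ["<a@x>", "<b@x>"], "work": ["<c@x>"]}
have = {"<a@x>"}
print(plan_fetch(all_mail, labels, have)[0])  # -> ['<b@x>', '<c@x>']
```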

As far as I can see, there is no software to do that, and I most definitely have no time to work on it… does anybody feel like writing it, or like finding me the software I’m looking for? No, in this case I most definitely don’t intend to use proprietary software, no matter how handy it is: it’s going to handle very sensitive information, like my GMail password, and that’s not something I’d hand over to software whose sources I can’t look at.

Why natural language interfaces suck

While I’m a fervent proponent of native language support in all kinds of software — which includes not only being able to display and make use of native characters (like the ò character in my surname) but also user interface translation and adaptation to the user’s language — I have a huge beef with what I’ll call “natural language interfaces” in this post.

The most widely known natural language interface is the formula language used by spreadsheet software, like OpenOffice Calc and Microsoft Excel. Since both applications are designed to be used by accountants for the most part, they try not to require any kind of generic programming skill of them. Which seems to still include “no knowledge of English”, even though nowadays I’d expect all of them to have it at their fingertips anyway.

At any rate, the language used for the formulas is not independent of the user’s language: both function names and data formats change depending on the selected language. So not only does the SUM() function become SOMMA() in Italian, the decimal separator character also changes from . to , with the obvious problems tied to that (if they are not obvious to you: the comma is still the parameter separator as well!). I’m sincerely not sure whether internally the two spreadsheets save a generic ID of the function or the name in the local language; I sincerely hope the former, but either way the thing is already quite brain-damaged to me.

But you don’t have to go down the drain to programming languages to find places where natural language interfaces suck. One other example is something so widespread one would probably not think of it: GMail, or Google Mail (and it will become obvious in a moment why I give both names). I guess this also counts as a further example of Google’s mediocrity, but I’m not stressing that here; it’s one (somewhat smaller) fault in a product that is, otherwise, great, especially for Google.

Now, you might not know – I didn’t either till a few months back – that GMail is not called GMail in Germany; Jürgen explained this to me when he wrote gmaillabelpurger (one heck of a magic tool for me; it has already saved me so much time, especially load time for IMAP access): because of trademark issues they had to fall back to calling it “Google Mail” there, thus creating one further domain (even though users are mapped 1:1 on both, which makes most of the point moot, I guess). When the user has registered in Germany, it’s not only the web interface that changes, but also the IMAP folder hierarchy: the [Gmail] prefix in the service folders’ names changes to [Google Mail].

This would only have mattered for the small error I got when I first tried Jürgen’s script (as he wrote it with the German interface in mind), if not for another issue. Using GMail with the default English language selects the “American” variant. And that variant also affects the dates shown in the web interface; and since I don’t usually like dealing with stupid date formats (don’t try to say that mm/dd/yyyy is not stupid!), the other day, when I needed to look up a timeline for work mail messages, I switched the interface to “English UK”, which solved the problem for me at the time.

Fast forward a couple of days, and I notice that the script is not behaving as it should, as messages are not being deleted; a quick look showed me the problem: GMail’s IMAP interface is affected by the language settings in the web interface! What it comes down to is that the old Trash folder gets renamed to Bin; d’uh! And even worse, setting the UK variant of the language causes some quite large confusion with the trademarked names: the web interface still reports GMail, but on the other hand, [Google Mail] is used in the IMAP interface. And that’s with me still connecting from an Italian IP address.
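This language dependency is exactly what the Special-Use folder attributes fix: instead of hard-coding “Trash” or “Bin”, you ask the server which folder carries the \Trash attribute (standardized in RFC 6154; GMail also exposed it through its XLIST extension). A minimal sketch, where `find_trash` and the sample LIST responses are invented to mimic GMail’s shape:

```python
import re

# Sketch: scan LIST/XLIST responses for the folder flagged \Trash, so
# the script works no matter what language renamed the folder to.
def find_trash(list_lines):
    for line in list_lines:
        if rb"\Trash" in line:
            # LIST response shape: (flags) "delimiter" "mailbox name"
            m = re.search(rb'"([^"]+)"\s*$', line)
            if m:
                return m.group(1).decode()
    return None

sample = [
    b'(\\HasNoChildren) "/" "INBOX"',
    b'(\\HasNoChildren \\Trash) "/" "[Google Mail]/Bin"',
]
print(find_trash(sample))  # -> [Google Mail]/Bin
```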

Now, thanks to Jürgen the script works again and thus my problem is solved. But it really should show that writing interfaces that depend on the language of the user isn’t really an excessively smart move.

I also start to wonder how soon I’ll get used to moving my mail to the bin, rather than trashing it.

Google and software mediocrity

I haven’t commented very much, if at all, on most of the new Google projects, which include Chrome, Chromium and Chrome OS; today, since I’m waiting on a few long-running tasks to complete, I’d like to spend my two eurocents on them.

You can already guess from the title of this post that I’m really sceptical about Google entering the operating system market; the reason is that I haven’t really seen anything in Google’s strategy that would lead us to expect a very good product from them in this area. While Google is certainly good at providing search services, and GMail is also my email provider of choice, there are quite a few shortcomings that I see in their software, and they don’t make me count on Chrome OS being any better than Windows XP.

First, let’s just say that Google Chrome is not the first software that Google has released for the desktop; there have been quite a few other projects before, like for instance Google Talk. Since I have a personal beef with this one, I’d like to go on a bit about it. When Google launched their own instant messaging service for the masses, through GMail and a desktop client, called Google Talk and based on the XMPP protocol, there was quite some talk around, because, while using the same protocol we know as Jabber, it didn’t connect to the server-to-server Jabber network that allows users of different Jabber servers to communicate; with time this S2S support was added, and now a GTalk user can talk with any Jabber user, so as a service it’s really not bad at all, and you can use any Jabber client to connect to GTalk.

The Windows client, though, seems to be pretty much abandoned; I haven’t seen updates in a while (although I might not have noticed in the past month or two), and it lacks quite a few features, like merging of multiple usernames into a single contact and things like that. Now, at the same time as releasing the Windows client, or about the same time, Google released specifications for their extensions that allow audio (and video?) chat over an XMPP-negotiated connection, and a library (libjingle) for other clients to implement this protocol.

The library, unfortunately, ended up having lots of shortcomings, and most projects decided to import and modify it; then it was forked, at least once but I think even twice, cut down and up and mangled so much that it probably doesn’t look anything like the original one from Google. And yet, the number of clients that do support GTalk’s audio/video extension is… I have no idea. Empathy does support it, if I recall correctly, but last time I tried, it didn’t really work that well. As far as I know, libpurple, which is used by both Pidgin and Adium, and which would cover clients for all the major operating systems (free or not), does not seem to support them.

Now, why I consider GTalk mediocre is not limited to the software that Google provides; it’s a matter of how they played their cards. It seems to me that instead of trying to push themselves as the service provider, they wanted to push themselves as a software provider as well, and the result is that besides Empathy (which is far from a usable client in my opinion), there is no software that seems to implement their service properly. They could have implemented their extensions in libpurple, or paid to have them implemented, and that would have given them an edge; they could have worked with Apple (considering they are working closely with them already) so that iChat could work with GTalk’s audio and video extensions (instead, iChat AV from Leopard uses a different protocol that only works between Macs), and so on.

What about Google Chrome? Well, when it was announced and released I was stuck in hospital, so I missed most of the hype of the first days; when I finally went to test it, almost a month later, I was surprised at how pointless it seemed to me. Why? Because for what I can see it does not render text as well as Firefox or Safari on Windows; it’s probably faster than them, but then again most people don’t care (at least in Italy, Internet connections are so slow you don’t notice), and there is one important problem: the Google bias of the browser.

I think lots of people criticised the way Microsoft originally treated Internet Explorer and their Internet services before, to the point that now Microsoft allows you to set Google as the provider for search in the default install. Well, I don’t see Chrome as anything much different: it’s a browser that is tailored to suit Google’s services, and of course its development will suit that too. Will it ever get an advertising-blocking feature, like the ones available for Firefox, Konqueror and Safari? Probably not, because Google takes a good share of its revenue out of Internet-based advertising. Will it ever get a Delicious extension? Probably not, because that’s a Yahoo! service nowadays, and Google has its own alternative.

Now, I don’t want to downplay the important technical innovations of Google Chrome, even when they are very basic, like the idea of splitting the tabs by process; and indeed I think I have read that Mozilla is now working on implementing a similar feature for the next major Firefox release; this is what we actually get out of the project, not Chromium itself.

Then there is Android; I don’t think I can really comment on this, but at least from what I can see, there is not really much going on with Android: nobody has asked me yet whether I develop for Android, while I got a few requests for Symbian and iPhone development in the past year or so. Android phones do not seem to appeal to non-technical people, and the technical people, at least in Italy, are unlikely to pay the price you have to pay to get the Android-based HTC phones with Vodafone and TIM.

In contrast with Nokia, Google fragmented the software area even more. While Google already provided mobile-optimised services on the web, and some Java-based software to access their services with J2ME-compatible phones, they also started providing applications for Nokia’s Symbian-based phones. Unfortunately this software does not shine, with the exception of Google Maps, which works pretty well and integrates itself with Nokia pretty decently; in particular, the “main” Google application for Nokia crashed my E75 twice!, so I ended up removing it and living without it (the YouTube application sort of works; the GMail application also “sort of” works, but with the new IMAP client it is really pointless to me). So we have mediocre software from Google for Nokia phones, and probably no good reason for Google to improve on it.

But there are also things that haven’t been implemented by Google at all: for instance, there is no GTalk client for Nokia phones, nor a web-based version for mobile phones, which would have been a killer feature! Instead, Nokia implemented its own Nokia Chat, which has now become Contacts for Ovi, which also uses XMPP, and also has S2S, but which does not allow you to use GTalk accounts, requiring you to have two different users: one for computers and one for the mobile phone. And similarly, with Google Sync for Nokia phones only partially working — in particular with no support for syncing with Google Calendar, and with a tremendous loss of detail when syncing contacts — Google loses to Nokia’s Ovi sync support as well.

Now, I’m not a market analyst and I really like to stay away from marketing, but I really don’t see Google as a major player in software development. I’d really have preferred they started improving the integration of their services with Free Software like Evolution (whose Google Calendar integration sucks way too much, and whose IMAP usage of GMail causes two copies of each sent message to be stored on the server, as well as creating a number of folders/labels that shouldn’t be there at all!), rather than coming out with a new “operating system”.

There are more details I’m sceptical about, like hardware support (which I’ll leave to Matthew Garrett to explain, since he knows the matter better) and software support, but for those I’ll wait to see what they actually deliver.

Productivity improvement weekend

This weekend I’m going to try my best to improve my own productivity. Why do I say that? Well, there are quite a few reasons. The first is that I spent the last week working full time on feng, rewriting the older code to replace it with simpler, better tested and especially well documented code. This is not an easy task, especially because you often end up rewriting other parts to play nicely with the new ones; indeed, to replace bufferpool, Luca and I rewrote almost the entire networking code.

Then there is the fact that I finally got a quote for replacing the logic board of my MacBook Pro that broke a couple of weeks ago: €1K! That’s almost as much as a new laptop; sure, not the same class, but still. In the meantime I bought an iMac; I needed access to QuickTime, even more than I knew before, because we currently don’t have a proper RTSP client; MPlayer does not support seeking, FFplay is broken by a few problems, and VLC also does not behave in a very standards-compliant way. QuickTime is, instead, quite well mannered. But this means I have spent money to go on with the job, which is, well, not exactly the nicest thing you can do when you need to pay off some older debts too.

So it means I have to work more; not only do I have to continue my work on lscube full time, but I’m going to have to take more jobs on the side; I have been asked about a few projects already, but most seem to require me to learn new frameworks or even new programming languages, which means they require quite a big effort. I need the money, so I’ll probably pick them up, but it’s far from optimal. I’ve also put on nearly-permanent hold the idea of writing an autotools guide, either as an open book or a real book; the former has shown no interest among readers of my blog, the latter no interest among publishers. I start to feel like an endangered species when it comes to autotools, alas.

But since at least for lscube I need access to the FFmpeg mailing list, and I need access to the PulseAudio mailing list for another project, and so on and so forth, I need to solve one problem I already wrote about: purging GMail labels of older messages. I really need this solved, but I’m still not totally in luck. Thanks to identi.ca, I was able to get the name of a script that is designed to solve the same problem: imap-purge. Unfortunately there is a problem with one GMail quirk: deleting a message from a “folder” (actually a GMail label) does not delete the message from the server, it only detaches the label from that message; to delete a message from the server you’ve got to move it to the Trash folder (and either empty it or wait 30 days so that it gets deleted). I tried modifying imap-purge to do that, but my Perl is nearly non-existent and I couldn’t even grok the documentation of Mail-IMAPClient regarding the move function.

So this weekend either I find someone to patch imap-purge for me, or I’ll have to write my own script based on its ideas, in Ruby or something like that. A waste of time on one side, but it should allow me to save time further on.

I also need to get synergy up to speed in Gentoo; there have been a few bugs opened regarding crashes and other problems, and requests for startup scripts and SVN snapshots; I’ll do my best to work on that so that I can actually use a single keyboard and mouse pair between Yamato and the iMac (which I called, with a little pun, USS Merrimac — okay, I’m a geek). Last time I tried this, I had some problems with synergy deciding to map/unmap keys to compensate for the keyboard differences between X11 and OS X; I hope I can get this solved this time, because one thing I hate is having different key layouts between the two.

I also have to find a decent way to have my documents available on both OS X and Linux at the same time, either by rsyncing them in the background or by sharing them over NFS. It’s easier if I have them available everywhere at once.

The tinderbox is currently not running, because I wouldn’t have time to review the build logs; in the past eight days I turned on the PlayStation 3 exactly twice, once earlier today to try relaxing with Street Fighter IV (I wasn’t able to), and the other time just to try one thing about UPnP and HD content. I was barely able to watch last week’s Bill Maher episode, and not much more. I seriously lack that precious resource that is time. And this is after I have shown the thing called “real life” almost entirely out of the door.

I sincerely feel absolutely energy-deprived; I guess it’s also because I didn’t have my after-lunch coffee, but there are currently two salesmen boring my mother with some vacuum cleaner downstairs and I’d rather not go meet them. Sigh. I wish life were easy, at least once a year.

Questing for the guide

I was playing some Oblivion while on the phone with a friend, when something related to my recent idea of an autotools guide came to my mind. The idea came from mixing Oblivion with something that Jürgen said this evening.

In the game you can acquire the most important magical items in four ways: you can find them around (rarely), you can build them yourself (by hunting creatures’ souls), you can pay for them with gold, or you can get them during quests. The last are usually the most powerful, but not always. At any rate, the “gold” option is rarely the one used, because gold is a somewhat scarce resource. You might start to wonder what this has to do with the autotools guide that I made public yesterday, but you might also have already seen where I’m going.

Since I’m the first one to know that money, especially lately, is a scarce resource, and that I’m the kind of person who’s glad to repay a favour with an effort whose market value is three or four times whatever money I could afford, it would be reasonable for me to provide a way of “payment” through technical skills and effort.

So here is my alternative proposal: if you can get me a piece of code that I failed to find and don’t have time to write, releasing it under a FOSS license (GPLv2+ is strongly suggested; compatibility with the GPL is very important anyway), and maintaining it until it’s almost “perfect”, I’ll exchange that for a comparable effort in extending the guide.

I’ll post these “quests” from time to time on the blog, so you can see them and judge whether you think you can complete them; I’ll have to find a way to index them, though — for now it’s just a proposal, so I don’t think I need to do this right away. But I can drop two ideas here, if somebody has time and is willing to work on them; both of them relate to IMAP and e-mail messages, so you’ve been warned. I’m also quite picky when it comes to requirements.

The first is what Jürgen was looking at earlier: I need a way to delete the old messages from some GMail labels every day. The idea is that I’d like to use GMail for my mailing list needs (so I have my messages always with me and so on), but since keeping the whole archive is both pointless (there is gmane, Google Groups, and the relative archives) and expensive (in terms of space used in the GMail IMAP account and of bandwidth needed to sync “All Mail” via UMTS), I’d like to just always keep the last three weeks of e-mail messages. What I need, though, is something slightly more elaborate than just deleting the old messages. It has to be a script that I can run in a cron job locally and that connects to the IMAP server. It has to delete the messages completely from GMail, which means dropping them in the Trash folder (just deleting them is not enough; that only removes the label) and emptying that too; it also has to be configurable, on a per-label basis, for how long to keep the messages (I would empty the label with the release notifications every week rather than every three weeks), and hopefully it should be able to keep unread messages longer and consider flagged messages as protected. I don’t care much about the implementation language, but I’d frown upon “exotic” things like OCaml, Smalltalk and similar, since they would require me to install their environment. Perl, Python and Ruby are all fine, and Java is too, since the thing would run just once a day and it’s not much of a slowdown to start the JVM for that. No X connection, though.
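The retention policy part of that quest could be kept apart from the IMAP plumbing, which makes it easy to test; this is only a sketch of the rules spelled out above, and every name and default here (per-label table, three-week default, one extra week for unread) is illustrative.

```python
from datetime import datetime, timedelta

# Sketch of the purge policy: per-label retention, unread messages kept
# a bit longer, flagged messages never purged. Values are examples.
RETENTION = {"release-notes": timedelta(weeks=1)}
DEFAULT = timedelta(weeks=3)
UNREAD_BONUS = timedelta(weeks=1)

def should_purge(label, date, unread, flagged, now):
    if flagged:
        return False                      # flagged messages are protected
    keep = RETENTION.get(label, DEFAULT)
    if unread:
        keep += UNREAD_BONUS              # keep unread messages longer
    return now - date > keep

now = datetime(2009, 3, 1)
old = datetime(2009, 1, 1)
print(should_purge("lists", old, False, False, now))  # -> True
print(should_purge("lists", old, False, True, now))   # -> False
```

A real script would run this over each label’s messages and hand the purgeable ones to the move-to-Trash-and-expunge step.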

The second is slightly simpler and could be coupled with the one before: I send my database backups from the server to my GMail e-mail address, encrypted with GPG, compressed with bzip2, and then split into message-sized chunks. I need a way to download all the messages and reassemble the backups, once a week, and store them on a flash card, using tar directly on it as if it were a tape (no need for a filesystem should reduce the erase count). The email messages have the number of the chunk, the series of the backup (typo or bugzilla) and the date of the backup all encoded in the subject. More points if it can do something like Apple’s Time Machine: keep one backup a day for a week, one a week for a month (or two), and then one a month for up to two years.
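The reassembly half of that quest is mostly subject parsing and sorting; here is a minimal sketch, where the subject format and the `reassemble` helper are my own inventions — the real subject encoding would dictate the actual regex.

```python
import re

# Sketch of reassembling chunked e-mailed backups: the chunk number,
# series and date are encoded in the subject, so parsing the subjects,
# sorting by chunk number and concatenating the payloads rebuilds the
# archive. The subject format here is invented for illustration.
SUBJECT_RE = re.compile(r"^backup (?P<series>\w+) (?P<date>\S+) chunk (?P<n>\d+)$")

def reassemble(messages):
    """messages: list of (subject, payload-bytes) pairs."""
    parsed = []
    for subject, payload in messages:
        m = SUBJECT_RE.match(subject)
        if m:
            parsed.append((int(m.group("n")), payload))
    return b"".join(payload for _, payload in sorted(parsed))

msgs = [
    ("backup typo 2009-03-01 chunk 2", b"world"),
    ("backup typo 2009-03-01 chunk 1", b"hello "),
]
print(reassemble(msgs))  # -> b'hello world'
```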

So if somebody has the skill to complete these tasks and would be interested in seeing the guide expanded, well, just go for it!