This Time Self-Hosted

GMail backup, got any ideas?

I’ve been using GMail as my main email system for quite a few years now, and it works quite well; most of the time, at least. While there are some possible privacy concerns, I don’t really get most of them (really, any other company hosting my mail would raise similar concerns; I don’t have the time to manage my own server, and in the end, if you don’t want anybody but me to read your mail, encrypt it!). Most of my problems with GMail are purely technical.

For instance, for a while I struggled with cleaning up old cruft from the server: removing old mailing list messages, old promotional messages, or the announcements coming from services like Facebook, WordPress.com and similar. Thankfully, Jürgen wrote a script for me that takes care of the task. Now you might wonder why there is a need for a custom script to handle deletion of mail on GMail, given that it speaks the IMAP protocol… the answer is that even though it exposes an IMAP interface, the way it stores the messages makes it impossible to use the standard tools. The usual deletion scripts you may find for IMAP mailboxes set the deleted flag and then expunge the folder… but on GMail that just archives the messages; you have to move the messages to the Trash folder… whose path depends on the language you set in GMail’s interface.
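The language-dependent Trash path could be handled along these lines. This is only a sketch, not Jürgen's script: it assumes the server marks the Trash mailbox with a `\Trash` attribute in its `LIST` replies (the special-use convention), and the helper names `find_trash_folder` and `move_to_trash` are made up for illustration.

```python
import re

# One IMAP LIST reply line looks like: (flags) "delimiter" name
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) (?:"[^"]*"|NIL) (?P<name>.+)')


def find_trash_folder(list_lines):
    """Return the name of the mailbox carrying the \\Trash attribute, or None.

    `list_lines` is the list of byte strings that imaplib's conn.list()
    returns as its second element.
    """
    for line in list_lines:
        match = LIST_LINE.match(line)
        if not match:
            continue
        if b"\\Trash" in match.group("flags").split():
            # Mailbox names are ASCII on the wire (modified UTF-7).
            return match.group("name").decode("ascii").strip('"')
    return None


def move_to_trash(conn, message_set, trash):
    """Copy messages into the Trash mailbox, then drop them from the
    currently selected mailbox. `conn` is an imaplib.IMAP4_SSL object
    with a mailbox already selected. Assumption: copying into Trash is
    what makes GMail treat the message as deleted."""
    conn.copy(message_set, '"%s"' % trash)
    conn.store(message_set, "+FLAGS", "\\Deleted")
    conn.expunge()
```

With a live connection, `find_trash_folder(conn.list()[1])` would give the name to pass as `trash`, whatever the interface language is.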

Now the same applies to the problem of backup: while I trust GMail will be available for a long time, I don’t like the idea of having no control whatsoever over the backup of my mail. I mean complete backup. Backup of the almost 3GB of mail that GMail is currently storing for me. Using standard IMAP backup software would probably require something like 6GB of storage, and of transfer as well! The problem is that all the content available in the “Labels” (which are most definitely not folders) is duplicated in the “All Mail” folder.

A tool properly designed for GMail would only fetch the content of the messages from the “All Mail” folder, and then just use the message IDs to file the messages under the correct labels. Even better software would allow me to convert the backed-up mess into something that can be served properly by an email client or an IMAP server, using Maildir structures.
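As a rough sketch of that idea (not an existing tool), the bodies could be stored once in a Maildir and the labels kept as a separate index keyed by Message-ID. The function names and the shape of `label_members` (label name mapped to the Message-IDs seen in that label, for example from a headers-only fetch) are assumptions made for this example.

```python
import email
import mailbox


def backup_all_mail(maildir_path, raw_messages):
    """Store every message once in a Maildir.

    Returns a dict mapping Message-ID to the Maildir key of the stored
    copy. A message whose Message-ID was already stored is skipped;
    messages without a Message-ID are stored but not indexed.
    """
    box = mailbox.Maildir(maildir_path, create=True)
    keys = {}
    for raw in raw_messages:
        msg_id = email.message_from_bytes(raw)["Message-ID"]
        if msg_id is not None:
            msg_id = str(msg_id).strip()
            if msg_id in keys:
                continue
        key = box.add(raw)
        if msg_id is not None:
            keys[msg_id] = key
    box.close()
    return keys


def index_labels(keys, label_members):
    """Map each label to the Maildir keys of its messages, ignoring
    Message-IDs that are not in the backup."""
    return {
        label: [keys[msg_id] for msg_id in msg_ids if msg_id in keys]
        for label, msg_ids in label_members.items()
    }
```

The index only costs a few bytes per message, so a mail filed under five labels is still downloaded and stored a single time. Turning the index into per-label Maildir folders (for instance with hard links) is left out here.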

It goes without saying that it should work incrementally: if I run it daily I don’t want it to fetch 3GB of data every time; I can put up with that the first time, but that’s about it. And it should also rate-limit itself to avoid hitting the GMail lockdown for possible abuse!
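Both requirements can be sketched with standard IMAP notions: remember the highest UID already fetched (together with the mailbox's UIDVALIDITY) and only ask for newer UIDs, pausing between requests. The state layout, the `Throttle` class and the interval are illustrative assumptions; what GMail actually tolerates is not something this sketch knows.

```python
import time


def uid_range_to_fetch(state, uidvalidity):
    """Return the UID range still missing from the backup.

    `state` is a dict saved between runs, e.g. as JSON, holding
    "uidvalidity" and "last_uid". If the server's UIDVALIDITY changed,
    old UIDs are meaningless and everything is fetched again.
    """
    if state.get("uidvalidity") != uidvalidity:
        return "1:*"
    # Note: "n:*" always matches the newest message, even when n is past
    # the end, so callers should still skip UIDs <= last_uid.
    return "%d:*" % (state.get("last_uid", 0) + 1)


class Throttle:
    """Ensure at least `min_interval` seconds between two calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A backup loop would call `wait()` before each fetch and update `last_uid` in the saved state after each message is safely written to disk, so an interrupted run resumes where it stopped.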

As far as I can see, there is no software that does that, and I most definitely have no time to work on it… does anybody feel like doing so, or know of the software I’m looking for? No, in this case I most definitely don’t intend to use proprietary software, no matter how handy it is: it’s going to handle very sensitive information, like my GMail password, and that’s not something I’d hand over to software whose sources I can’t look at.

Comments 7
  1. Some of the gmaillabelpurge code should still work in that context, lemme look into that.

  2. @carn offlineimap works as an incremental backup solution, but won’t rid you of duplicates. Kmail has an “offline imap” solution that allows you to avoid syncing certain folders (such as the “All Mail” folder), but you will still have duplicates if you have mails with more than one label. I have configured kmail to have two profiles for the same gmail account: an IMAP profile where I do my regular mail stuff (reading and writing mails), and an OFFLINE IMAP profile that I use for backup, which only keeps track of my important folders (no mailing lists, newsletters etc.)

  3. isync (the program of which is called mbsync) can exclude certain folders as well. It’s not perfect, but it works well enough for me, though then again I don’t use gmail either. As for handling message deletion: you could probably flag the mails to be deleted in some other way and then run imapfilter to move them to the trash?
