Backing up cloud data? Help request.

I’m very fond of backups, after the long series of issues I had before I started doing incremental backups. I still have some backup DVDs around, some of which are almost unreadable, and at least one that is compressed with xar, in a format that is no longer supported, especially on 64-bit.

Right now, my backups are all managed through rsnapshot, with a few custom scripts on top to make sure that if a host is not online, the previous backup is maintained. This works almost perfectly, if you exclude the problems with restored files and the fact that a rename causes files to be stored twice, as rsnapshot does not really apply any data de-duplication (and fdupes and similar programs tend to be a bit too slow to use on 922GB of data).
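To make the de-duplication idea concrete, here is a minimal sketch of what such a pass could look like: group files by size, hash only the files whose size collides, and replace identical copies with hard links. The function names are hypothetical, it assumes everything lives on one filesystem, and it is not something rsnapshot provides. It reads files much like fdupes does, so it may well be just as slow on a large tree. Note also that hard-linked files share ownership, permissions and timestamps.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root: Path) -> int:
    """Replace identical files under root with hard links.

    Returns the number of files that were replaced by a link.
    """
    # Group regular files by size first: a file with a unique size
    # cannot have a duplicate, so it never needs to be read.
    by_size = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_size[path.stat().st_size].append(path)

    linked = 0
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        first_seen = {}
        for path in paths:
            original = first_seen.setdefault(file_digest(path), path)
            if original is path or original.samefile(path):
                continue  # first copy, or already hard-linked to it
            # Link under a temporary name, then rename over the duplicate,
            # so the duplicate is never missing if the link step fails.
            tmp = path.with_name(path.name + ".dedup-tmp")
            os.link(original, tmp)
            os.replace(tmp, path)
            linked += 1
    return linked
```

Calling `hardlink_duplicates(Path("/path/to/snapshots"))` (a placeholder path) would then collapse a renamed file and its earlier copy into a single inode.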

But there is one problem that rsnapshot does not really solve: backup of cloud data!

Don’t get me wrong: I do back up the (three) remote servers just fine, but this does not cover the data that lives in remote, “cloud” storage, such as the GitHub, Gitorious and BitBucket repositories, or delicious bookmarks, GMail messages, and so forth.

Cloning the bare repositories and backing those up is relatively trivial: it’s a simple script to write. The problem starts with the less “programmatic” services, such as the bookmarks and messages noted above. This is especially true of GMail: copying the whole 3GB of data from the server each time is unlikely to work well, so it has to be done properly.
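For the repositories, the script could look something like the sketch below. It assumes plain `https://` clone URLs (scp-style `git@host:path` URLs would need different parsing), and the function names and directory layout are made up for illustration. The first run does a `git clone --mirror`; later runs only fetch what changed with `git remote update --prune`, so the resulting directory can be handed to the normal backup.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def mirror_dir(url: str, root: Path) -> Path:
    """Map a repository URL to a local bare-mirror directory under root."""
    parsed = urlparse(url)
    # https://example.com/user/project.git -> root/example.com/user/project.git
    return root / parsed.netloc / parsed.path.lstrip("/")


def mirror_command(url: str, root: Path) -> list[str]:
    """Return the git command that creates or refreshes the mirror of url."""
    dest = mirror_dir(url, root)
    if dest.exists():
        # Existing mirror: fetch only what changed since the last run.
        return ["git", f"--git-dir={dest}", "remote", "update", "--prune"]
    # First run: a mirror clone is bare and copies every ref.
    return ["git", "clone", "--mirror", url, str(dest)]


def backup_repositories(urls: list[str], root: Path) -> None:
    """Create or update a mirror for every URL, stopping on the first failure."""
    for url in urls:
        subprocess.run(mirror_command(url, root), check=True)
```

Keeping the command construction separate from the `subprocess` call makes it easy to check what would be run before pointing the script at real repositories.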

Does anybody have any pointers on the matter? Maybe there’s already a smart backup script, similar to tante’s smart pruning script, that can take care of copying the messages via IMAP, for instance…
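To illustrate what “done properly” might mean for IMAP, here is a rough sketch using Python’s standard `imaplib` and `mailbox` modules: remember the highest message UID already saved, ask the server only for newer UIDs, and append those to a local maildir. The function names are hypothetical, the caller is assumed to have connected and logged in already, and a real tool would also need to check UIDVALIDITY and quote folder names that contain spaces.

```python
import imaplib
import mailbox


def uids_after(search_data: list, last_uid: int) -> list[int]:
    """Parse a UID SEARCH reply and keep only UIDs newer than last_uid."""
    raw = b" ".join(item for item in search_data if item)
    uids = [int(token) for token in raw.split()]
    # A range like "N:*" always matches the newest message, even when its
    # UID is below N, so the reply has to be filtered again on our side.
    return sorted(uid for uid in uids if uid > last_uid)


def fetch_new_messages(conn: imaplib.IMAP4, folder: str,
                       last_uid: int, maildir_path: str) -> int:
    """Copy messages newer than last_uid into a maildir.

    Returns the highest UID that was saved, to be stored for the next run.
    """
    conn.select(folder, readonly=True)
    typ, data = conn.uid("SEARCH", None, f"UID {last_uid + 1}:*")
    box = mailbox.Maildir(maildir_path, create=True)
    for uid in uids_after(data, last_uid):
        # BODY.PEEK[] downloads the full message without marking it as read.
        typ, parts = conn.uid("FETCH", str(uid), "(BODY.PEEK[])")
        box.add(parts[0][1])
        last_uid = uid
    return last_uid
```

After the first full download, each run transfers only the new messages, and the maildir can then be picked up by the regular backup.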

5 thoughts on “Backing up cloud data? Help request.”

  1. I forward my Google Apps email account (Gmail) to a box I own which ends up getting backed up. This has been a terrific solution for making sure my email is backed up. This won’t work well for your existing mail, but going forward it should satisfy your needs. I have everything going into a maildir-format so I can use the incremental-rsync-backups-using-hardlink trick quite nicely.


  2. About gmail: Sync your messages to maildir periodically using e.g. fetchmail, then apply your ordinary backup tools to this maildir folder. Hardlinks can save space if rsnapshot supports that.


  3. I just use isync (binary is called mbsync) to synchronize IMAP server contents with my local backup. It uses IMAP features to only transfer changes and works with the 35000 email mailbox I have (though it is only 300 MB).


  4. offlineimap is probably what you’re looking for; it lacks backup functionality, but it quite efficiently synchronizes imap<>imap or imap<>maildir storage (bidirectionally, as an option). As for backup: I moved 2G of my Gmail garbage under Git (something like 600M+ packed).


