I’m very fond of backups, after the long series of issues I’ve had before I started doing incremental backups.. I still have got some backup DVDs around, some of which are almost unreadable, and at least one that is compressed with the
xar archive in a format that is no longer supported, especially on 64-bit.
Right now, my backups are all managed through
rsnapshot, with a bit of custom scripts over it to make sure that if an host is not online, the previous backup is maintained. This works almost perfectly, if you exclude the problems with restored files and the fact that a rename causes files to double, as rsnapshot does not really apply any data de-duplication (and the
fdupes program and the like tend to be .. a bit too slow to use on 922GB of data).
But there is one problem that rsnapshot does not really solve: backup of cloud data!
Don’t get me wrong: I do backup the (three) remote servers just fine, but this does not cover the data that is present in remote, “cloud” storage, such as the GitHub, Gitorious and BitBucket repositories, or delicious bookmarks, GMail messages, and so on so forth.
Cloning the bare repositories and backing those up is relatively trivial: it’s a simple script to write. The problem starts with the less “programmatic” services, such as the noted bookmarks and messages. Especially with GMail as copying the whole 3GB of data each time from the server is unlikely to work fine, it has to be done properly.
Has anybody any pointer on the matter? Maybe there’s already a smart backup script, similar to tante’s smart pruning script that can take care of copying the messages via IMAP, for instance…