Tarsnap and backup strategies

After a quite traumatic experience last November with a customer’s service running on one of the virtual servers I manage, I made sure to have a very thorough backup of all my systems. Unfortunately, it turned out to be a bit too thorough, so let me explore with you what went wrong.

First of all, the software I use to run the backup is tarsnap — you might have heard of it or not, but it’s basically a very smart service: an open-source client, based upon libarchive, and a server system that stores the content (de-duplicated, compressed and encrypted with a very flexible key system). The author is a FreeBSD developer, and he charges an insanely small amount of money.

But the most important thing to know when you use tarsnap is that you always just create a new archive: it doesn’t really matter what you changed, just get everything together, and it will automatically de-duplicate the content that didn’t change, so why bother? My first, dumb method of backups, which is still running as of this time, is to simply dump a copy of the databases every two hours (one server runs PostgreSQL, the other MySQL — I no longer run MongoDB but I’m starting to wonder about it, honestly), and then use tarsnap to generate an archive of the whole /etc, /var and a few more places where important stuff is. The archive is named after the date and time of the snapshot. And I haven’t deleted any snapshot since I started, for most servers.
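For reference, here is a minimal sketch of what such a naïve two-hourly job might look like; the database name, key file path and directory list are placeholders, not my actual setup.

    #!/usr/bin/env python3
    # Minimal sketch of the "dumb" two-hourly backup job described above.
    # The database name, key file and paths are hypothetical placeholders.
    import subprocess
    from datetime import datetime

    stamp = datetime.now().strftime("%Y%m%d-%H%M")

    # Dump the database to a file that the archive will pick up.
    with open("/var/backups/db-dump.sql", "wb") as dump:
        subprocess.check_call(["pg_dump", "mydatabase"], stdout=dump)

    # Create a new tarsnap archive named after date and time; tarsnap
    # de-duplicates against the data already stored on its servers.
    subprocess.check_call([
        "tarsnap", "--keyfile", "/root/tarsnap.key",
        "-c", "-f", "snapshot-" + stamp,
        "/etc", "/var/backups", "/var/lib",
    ])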

It was a mistake.

When I went to recover the data out of earhart (the host that still hosts this blog, a customer’s app, and a couple more sites, like the assets for the blog and even Autotools Mythbuster — though all the static content, as it’s managed by git, is now also mirrored and served active-active from another server called pasteur), the time it took to extract the backup was unsustainable. The reason was obvious once I thought about it: since it had been de-duplicating for almost a year, it had to scan hundreds if not thousands of archives to gather all the small bits and pieces.

I still haven’t replaced this backup system, which is very bad for me, especially since it takes a long time to delete the older archives even after extracting them. On the other hand it’s also largely a matter of tradeoffs in expenses, as going through all the older archives to remove the old crap drained my credits with tarsnap quickly. Since the data is de-duplicated and encrypted, the archives’ data needs to be downloaded and decrypted before it can be deleted.

My next step is going to be to set it up so that the script runs in different modes and keeps a rotating set of archives: 24 over 48 hours (every two hours), 14 over 14 days (daily), and 8 over two months (weekly). The problem is actually doing the rotation properly with a script, but I’ll probably publish a Puppet module to take care of that, since that’s the easiest way for me to make sure it executes as intended.
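To give an idea of the selection logic I have in mind, here is a rough sketch; it assumes the snapshot-YYYYMMDD-HHMM naming from above, picks the early-morning run as the daily archive and Sunday as the weekly one, and hands everything else to tarsnap -d. The actual module will have to be smarter than this.

    # Sketch of the rotation: keep two-hourly archives for 48 hours, dailies
    # for 14 days, weeklies for two months, and delete everything else.
    # Assumes the snapshot-YYYYMMDD-HHMM naming scheme used above.
    import subprocess
    from datetime import datetime, timedelta

    KEYFILE = "/root/tarsnap.key"
    now = datetime.now()

    archives = subprocess.check_output(
        ["tarsnap", "--keyfile", KEYFILE, "--list-archives"]).decode().split()

    def keep(when):
        age = now - when
        if age <= timedelta(hours=48):
            return True                                   # two-hourly window
        if age <= timedelta(days=14):
            return when.hour < 2                          # one archive per day
        if age <= timedelta(days=60):
            return when.hour < 2 and when.weekday() == 6  # one per week (Sunday)
        return False

    for name in archives:
        try:
            when = datetime.strptime(name, "snapshot-%Y%m%d-%H%M")
        except ValueError:
            continue                                      # not one of our snapshots
        if not keep(when):
            subprocess.check_call(
                ["tarsnap", "--keyfile", KEYFILE, "-d", "-f", name])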

The essence of this post is basically to warn you all that, no matter how cheap it is to keep around the whole set of backups since the start of time, it’s still a good idea to rotate them… especially for content that does not change that often! Think about it whenever you set up any kind of backup strategy…

Backing up cloud data? Help request.

I’m very fond of backups, after the long series of issues I had before I started doing incremental backups… I still have some backup DVDs around, some of which are almost unreadable, and at least one that is compressed with xar in a format that is no longer supported, especially on 64-bit.

Right now, my backups are all managed through rsnapshot, with a few custom scripts on top of it to make sure that if a host is not online, the previous backup is kept. This works almost perfectly, if you exclude the problems with restored files and the fact that a rename causes files to be stored twice, as rsnapshot does not really apply any data de-duplication (and the fdupes program and the like tend to be… a bit too slow to use on 922GB of data).
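To give an idea of what I mean by those wrapper scripts, here is a minimal sketch, with hypothetical host names and a per-host configuration layout that is not necessarily how my setup looks: if a host doesn’t answer a ping, its rsnapshot run is simply skipped, so the previous snapshot stays in place.

    # Sketch of a wrapper that only runs rsnapshot for hosts that answer a
    # ping, so an offline machine keeps its previous backup instead of
    # getting an empty one. Host names and config paths are hypothetical.
    import subprocess

    HOSTS = ["earhart", "pasteur", "router.local"]

    for host in HOSTS:
        # One probe with a two-second timeout.
        alive = subprocess.call(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0
        if not alive:
            print("%s is offline, keeping the previous snapshot" % host)
            continue
        subprocess.check_call(
            ["rsnapshot", "-c", "/etc/rsnapshot.d/%s.conf" % host, "daily"])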

But there is one problem that rsnapshot does not really solve: backup of cloud data!

Don’t get me wrong: I do back up the (three) remote servers just fine, but this does not cover the data that lives in remote, “cloud” storage, such as the GitHub, Gitorious and BitBucket repositories, or delicious bookmarks, GMail messages, and so on and so forth.

Cloning the bare repositories and backing those up is relatively trivial: it’s a simple script to write. The problem starts with the less “programmatic” services, such as the aforementioned bookmarks and messages, and especially with GMail: copying the whole 3GB of data from the server each time is unlikely to work out, so it has to be done properly.
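For the repositories, the kind of script I mean is roughly this; the repository list and the destination directory are obviously placeholders:

    # Sketch of the trivial "mirror the bare repositories" backup.
    # The repository list and the destination path are hypothetical.
    import os
    import subprocess

    REPOS = [
        "git://github.com/example/project.git",
        "git://gitorious.org/example/project.git",
    ]
    DEST = "/var/backups/repos"

    for url in REPOS:
        target = os.path.join(DEST, os.path.basename(url))
        if os.path.isdir(target):
            # An existing mirror clone just needs to be refreshed.
            subprocess.check_call(
                ["git", "--git-dir", target, "remote", "update", "--prune"])
        else:
            subprocess.check_call(["git", "clone", "--mirror", url, target])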

Does anybody have any pointers on the matter? Maybe there’s already a smart backup script, similar to tante’s smart pruning script, that can take care of copying the messages via IMAP, for instance…

Tightening security

I’m not sure why it is that I started being so paranoid about security; I’ve been changing quite a few things in my workflow lately, and even though I’ve always kept my systems and network decently secure, now I’m going one step further.

Besides working on getting Kerberos-strengthened NFS to work (and trying to get libvirt going, if it weren’t for gtk-vnc and the rest of that mess), I’m now considering something to strengthen the security of this laptop. Given what I’ve seen with pam_mount, it would also make sense to get that improved, fixed, and maybe even integrated with pambase, as usual for me.

But besides not actually having an idea of how to configure that yet, it also made me think about the use case for it. Let’s say I actually encrypt the whole partition (I know there are a few options that don’t require an entirely encrypted partition, but since last I checked they all require patched kernels, I’d like to stay well away from those); it gets mounted when I log in (on GDM), and up to that point it’s okay, but what happens when I close the laptop and it suspends? It wouldn’t get unmounted, because all the processes are still there. And if somebody can get a new login on it, well, you’re screwed, because the other sessions can see the mounted partition as well.

One option I can think of is an old friend of mine: pam_namespace. This module allows “splitting” the mount namespace of user login sessions at PAM login; placed before pam_mount, it would make the partition appear mounted only to the processes descending from the process calling the module. What this can actually achieve is that even if you have the root password, and create a new session with those credentials… the partition will not appear to be mounted at all. Cool, but pam_namespace breaks a bunch of things such as HAL. It was almost exactly one year ago that I wrote about that.

Another option is simply to log out before suspending the laptop; this should also fix the graphics card reset problems: shut down X before suspending, reopen it with a new login afterwards. It takes a bit longer to reopen everything of course, but that’s not the main problem — it wouldn’t be a problem at all if software actually restarted as intended, i.e. if Gnome actually restored the session, Chromium tabs included.

Unfortunately, I actually have one good reason to think that there is some trouble with this idea. With one of the first incarnations of the tinderbox I found out that logging out left some stray processes behind; and that was just a console-only chroot, run as root, without any desktop software running. I’m quite sure that at least the GnuPG and SSH agents are kept running at the end of the session. Such stray processes would still make it impossible to unmount the partition.

Finally, the last remaining solution is to turn off the whole system, but as you probably already know, a cold start takes quite some time to work out properly.

What options are there for these situations? Does anybody have suggestions? I wouldn’t mind even just using an encrypted directory tree, mounted via FUSE and encrypted with GnuPG (and thus with my FSFe Fellowship smartcard).

A similar problem, lower-priority but maybe of even more long-term importance, is encrypting my backup data; in this case, I cannot be there to input the password over and over again, so I have to find a different solution. One thing I actually thought of is to make (sane) use of the “software protection hardware keys” that I remember from computer magazines of the late ’90s. There is actually a manufacturer of those not far from where I live, in the same city as my sister; I wouldn’t mind buying a sample USB key from them, and if they give out the details for communicating with it, implementing an open source library to talk to it and seeing whether I can use it as the encryption key for a whole partition.

At any rate, any suggestion that saves me from reinventing, or at least re-documenting, the wheel is, as usual, very welcome.

GMail backup, got any ideas?

I’ve been using GMail as my main email system for quite a few years now, and it works quite well; most of the time, at least. While there are some possible privacy concerns, I don’t really get most of them (really, any other company hosting my mail would raise similar concerns; I don’t have the time to manage my own server, and the end result is: if you don’t want anybody but me to read your mail, encrypt it!). Most of my problems with GMail are pretty technical.

For instance, for a while I struggled with cleaning up old cruft from the server: removing old mailing list messages, old promotional messages, or the announcements coming from services like Facebook, WordPress.com and similar. Thankfully, Jürgen wrote a script for me that takes care of the task. Now you might wonder why there is any need for a custom script to handle deletion of mail on GMail, given that it speaks the IMAP protocol… the answer is that even though it exposes an IMAP interface, the way the messages are stored makes it impossible to use the standard tools. The usual deletion scripts you may find for IMAP mailboxes set the deleted flag and then expunge the folder… but on GMail that just archives the messages; you have to move them to the Trash folder… whose path depends on the language you set in GMail’s interface.
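To show what the difference boils down to, here is a minimal sketch of the move-to-Trash approach with Python’s imaplib; the label, credentials, Trash path and the three-week cutoff are all assumptions, and the Trash path in particular depends on the account’s language.

    # Sketch of expiring old messages from a GMail label: flagging them
    # \Deleted only archives them, so they are copied to Trash and expunged.
    # Label, credentials, Trash path and cutoff are assumptions.
    import imaplib
    from datetime import date, timedelta

    TRASH = "[Gmail]/Trash"
    cutoff = (date.today() - timedelta(weeks=3)).strftime("%d-%b-%Y")

    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.login("user@gmail.com", "password")
    imap.select("lists/gentoo-dev")

    # Everything older than the cutoff...
    typ, data = imap.search(None, "BEFORE", cutoff)
    msgs = data[0].decode().replace(" ", ",")
    if msgs:
        # ...gets copied to Trash (which is what actually deletes it on
        # GMail) and then expunged from the label.
        imap.copy(msgs, TRASH)
        imap.store(msgs, "+FLAGS", r"\Deleted")
        imap.expunge()

    imap.logout()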

Now the same applies to the problem of backup: while I trust GMail will be available for a long time, I don’t like the idea of having no control whatsoever over the backup of my mail. I mean a complete backup: a backup of the almost 3GB of mail that GMail is currently storing for me. Using standard IMAP backup software would probably require something like 6GB of storage, and as much transfer! The problem is that all the content available under the “Labels” (which are most definitely not folders) is duplicated in the “All Mail” folder.

A tool properly designed around GMail would fetch the content of the messages only from the “All Mail” folder, and then just use the message IDs to file the messages under the correct labels. An even better one would allow me to convert the backed-up mess into something that can be served properly by an email client or an IMAP server, using a Maildir structure.

It goes without saying that it should work incrementally: if I run it daily, I don’t want it to fetch 3GB of data every time; I can bear with that the first time, but that’s about it. And it should also rate-limit itself to avoid hitting GMail’s lockdown for suspected abuse!
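To make the idea a bit more concrete, here is a rough sketch of the incremental part only: it assumes the English “[Gmail]/All Mail” folder name, a small UID checkpoint file and crude one-second throttling, and it dumps raw messages to a directory, leaving the label-filing and Maildir conversion out entirely.

    # Rough sketch of an incremental GMail backup: fetch only messages from
    # "All Mail" with a UID above the last one seen, sleep between fetches,
    # and store each message as a raw file. Folder name, checkpoint file
    # and output directory are assumptions.
    import imaplib
    import os
    import time

    ALL_MAIL = '"[Gmail]/All Mail"'
    STATE = os.path.expanduser("~/.gmail-backup.uid")
    OUTDIR = os.path.expanduser("~/backups/gmail")

    os.makedirs(OUTDIR, exist_ok=True)
    last_uid = int(open(STATE).read()) if os.path.exists(STATE) else 0

    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.login("user@gmail.com", "password")
    imap.select(ALL_MAIL, readonly=True)

    typ, data = imap.uid("SEARCH", None, "UID %d:*" % (last_uid + 1))
    for uid in data[0].split():
        if int(uid) <= last_uid:
            continue                    # the range can include the last seen UID
        typ, msg = imap.uid("FETCH", uid, "(RFC822)")
        with open(os.path.join(OUTDIR, "%d.eml" % int(uid)), "wb") as out:
            out.write(msg[0][1])
        last_uid = int(uid)
        time.sleep(1)                   # crude rate limiting

    with open(STATE, "w") as state:
        state.write(str(last_uid))
    imap.logout()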

As far as I can see, there is no software that does that, and I most definitely have no time to work on it… does anybody feel like writing it, or finding me the software I’m looking for? And no, in this case I most definitely don’t intend to use proprietary software, no matter how handy it is: it’s going to handle very sensitive information, like my GMail password, and that’s not something I’d hand over to software whose sources I can’t look at.

Health, accounting and backups

For those who said, regarding last week’s post, that I have anger management issues, I’d like to point out that what I had was actually a nervous breakdown, not strictly (though partly) related to Gentoo.

Since work, personal life, Gentoo and (the last straw) taxes all collided this week, I ended up having to take a break from a lot of stuff; this included putting all kinds of work on hold for the week, and actually spending most of my time making sure I have proper accounting, both for my freelancing activity and for home expenses (this is getting particularly important because I’m almost living alone – even if I technically am not – and thus I have to make sure that everything fits into the budget). Thankfully, GnuCash provides almost all the features I need. I ended up entering all the accounting information I had available, dating back to January 1st 2009 (my credit card company’s customer service site hasn’t worked for the past two weeks — since it’s a subsidiary of my own bank, I was able to get the most recent statements through them, but not the full archive of statements since the cards were issued, which is a problem for me), and trying to get some data out of it.

Unfortunately, it seems that while GnuCash already provides a number of reports, it does not have the kind of reports I need, such as “How much money did the invoices from 2009 amount to?” (which is important for me to make sure I don’t go over the limit I’m given), or “How much money did I waste on credit card interest?”… I’ll have to check out the documentation and learn whether I can make some customised reports that produce the kind of data I need. And maybe there’s a way to set the payment terms I have with one of my clients (30 days from the end of the month the invoice was issued in… which means that if I issue the invoice tomorrow, I’ll be paid on May 1st).

On a different note, picking up from Klausman’s post, I decided to also fix up my backup system, which was previously based on single snapshots of the system on external disks and USB sticks, and moved to a single rsnapshot setup that backs everything up onto one external disk: the local system, the router, the iMac, the two remote servers, and so on. This worked out fine when I tried the previous eSATA controller again, but unfortunately it failed once more (d’oh!), so I fell back to FireWire 400, but that’s way too slow for rsnapshot to do a full backup hourly. I’m thus trying to find a new setup for the external disk. I’m unsure whether to look for a FireWire 800 card or a new eSATA controller. I’m not sure about Linux’s support for the former, though; I know that FireWire used to be not too well maintained, so I’m afraid it might just fall back to FireWire 400 speeds, which would be pointless. I’m not sure about eSATA because I’m afraid it might not be the controller’s fault but rather a problem with the (three different kinds of) disks or with the cables; and if the problem is in the controller, I’m worried about the chip on it: the one I have here is a JMicron-based controller, but with a memory chip that is not flashable with the JMicron-provided ROM (and I think there might be a fix in there for my problem) nor with flashrom as it is now.

So if you have an idea to suggest about this, I’d be happy to hear it; right now I’ve only found one possibly interesting (price/features) card from Alternate (business-to-business), the “HighPoint RocketRAID 1742”, which is PCI-based (I have a free PCI slot right now, and in case of need I can move it to a different box that has no PCI-E) and costs around €100. I’m not sure about driver support for it, though, so if somebody has experience with it, please let me know. Interestingly enough, my two main suppliers in Italy don’t seem to carry any eSATA card, and of course high-grade, dependable controllers aren’t found at the nearest Saturn or MediaMarkt (actually, MediaWorld here, but it’s the very same thing).

Anyway, after this post I’m finally back to work on my job.

Stash your cache away

While I’m now spending a week away from home (I’m at my sister’s family place, while she’s at the beach), I’ll still be working, writing blog posts, and maybe taking care of some smaller issues in Gentoo. I’m just a bit hindered, because while I type on the keyboard I often click something away with the trackpad; I didn’t think of bringing a standalone keyboard. I guess if somebody wanted to send an Apple Bluetooth keyboard my way, I wouldn’t say no.

While finally setting up a weekly backup of my /home directory yesterday, I noticed quite a few issues with the way software makes use of it. The first thing, of course, was to find the right software to do the job; I opted for a simple rsync job in cron: after all, I don’t care much about having multiple incremental backups à la Time Machine, a single weekly copy of my basic data is good enough.

The second problem was that, some time ago, I found that a 4GB USB flash drive was enough if I wanted to copy my home directory, but when I looked at it yesterday, I found it to be well over 5GB. How did that happen? Some baobab later, I found the problems. On one side, my medical records (over 500 pages), scanned with a high-grade all-in-one laser printer (no, not by me at home), are too big. They might have been scanned as colour documents (they are photocopies, so that’s not really right) or at a huge resolution; I have to check, since having over half a gig of just printed records is a bit too much for me (I also have another full CD of CT scan data).

The other problem is that a lot of software misuses my home directory by writing cache and temporary files to it rather than to the proper locations. Let me explain: if you need to create a temporary file, or a socket to communicate between different pieces of software on the same host, rather than writing it to my home, you should probably use TMPDIR (as a lot of software, fortunately, does). The same goes for cache data, and yes, I’m referring to you, Evolution and Firefox, but also to Adobe Flash, Sun JDK and IcedTea.

Indeed, the FreeDesktop specifications already provide an XDG_CACHE_HOME variable that can be used to change the place where cache data is saved, defaulting to ~/.cache, and on my system set to /var/cache/users/flame. This way, all the (greedy) cache systems would be able to write as much data as they want, without either wasting space on the backup flash drive, or forcing me to write it to two disks (/var/cache is on a sort-of throwaway disk).
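Honouring this from an application is a one-liner; a minimal sketch of the lookup, as I understand the basedir specification, is:

    # Minimal sketch of resolving a cache directory per the XDG Base
    # Directory specification: use XDG_CACHE_HOME if set, fall back to
    # ~/.cache, and keep per-application data in its own subdirectory.
    import os

    def cache_dir(appname):
        base = os.environ.get("XDG_CACHE_HOME") or os.path.expanduser("~/.cache")
        path = os.path.join(base, appname)
        os.makedirs(path, exist_ok=True)
        return path

    # e.g. cache_dir("myapp") -> /var/cache/users/flame/myapp on my system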

For now I’ve resolved it by making some symlinks, hoping they stay stable, and creating a ~/.backup-ignore file, akin to .gitignore, with the paths to the stuff that I don’t want backed up. The only problem I really have is with Evolution, because that one has so many subdirectories that I can’t really understand what I should back up and what not.
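For the record, the weekly cron job itself boils down to a single rsync call wired to that exclusion file; a sketch, with a placeholder destination:

    # Sketch of the weekly /home backup: plain rsync honouring the
    # ~/.backup-ignore exclusion list. The destination is a placeholder.
    import os
    import subprocess

    HOME = os.path.expanduser("~") + "/"
    DEST = "/media/backup-flash/home/"

    subprocess.check_call([
        "rsync", "-a", "--delete",
        "--exclude-from", os.path.expanduser("~/.backup-ignore"),
        HOME, DEST,
    ])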

Oh, and there are a few more problems: the first is that a lot of software over the past two years has migrated from the top level of the home directory to ~/.config, but the old files were kept around (Nautilus is an example), and a few directories contained very, very old and dusty session data that was never cleaned up properly.

Providing too many configuration options to say where stuff lives can definitely lead to problems, but using the right environment variable to decide where things should go, and where they should be looked up, can definitely solve a lot of them!

Questing for the guide

I was playing some Oblivion while on the phone with a friend, when something came to mind, related to my recent idea of an autotools guide. The idea came from mixing Oblivion with something that Jürgen was saying this evening.

In the game you can acquire the most important magical items in four ways: you can find them around (rarely), you can build them yourself (by hunting creatures’ souls), you can pay for them with gold, or you can get them during quests. The latter are usually the most powerful, but that’s not always true. At any rate, the “gold” option is rarely used, because gold is a somewhat scarce resource. You might start to wonder what this has to do with the autotools guide that I made public yesterday, but you might also have already seen where I’m going.

Since I’m the first to know that money, especially lately, is a scarce resource, and since I’m myself the kind of person who’s glad to put in an effort with a market value three or four times whatever money I could afford in order to repay a favour, it seemed reasonable for me to provide a way of “payment” through technical skills and effort.

So here is my alternative proposal: if you can get me a piece of code that I failed to find and don’t have time to write, releasing it under a FOSS license (GPLv2+ is strongly suggested; compatibility with the GPL is very important anyway) and maintaining it until it’s almost “perfect”, I’ll exchange that for a comparable effort in extending the guide.

I’ll post these “quests” from time to time on the blog, so you can see them and decide whether you think you can complete them; I’ll have to find a way to index them, though; for now it’s just a proposal, so I don’t think I need to do that right away. But I can drop two ideas here, if somebody has time and is willing to work on them; both of them relate to IMAP and e-mail messages, so you’ve been warned. I’m also quite picky when it comes to requirements.

The first is what Jürgen was looking at earlier: I need a way to delete the old messages from some GMail labels every day. The idea is that I’d like to use GMail for my mailing list needs (so I have my messages always with me, and so on), but since keeping the whole archive is both pointless (there is gmane, Google Groups, and the respective archives) and expensive (in terms of space used in the GMail IMAP account and of bandwidth needed to sync “All Mail” via UMTS), I’d like to always keep just the last three weeks of e-mail messages. What I need, though, is something slightly more elaborate than just deleting the old messages. It has to be a script that I can run locally in a cron job, and that connects to the IMAP server. It has to delete the messages completely from GMail, which means dropping them into the Trash folder (just deleting them is not enough, that only removes the label), and emptying that too; it also has to be configurable on a per-label basis as to how long to keep the messages (I would empty the label with the release notifications every week rather than every three weeks), and hopefully be able to keep unread messages longer and consider flagged messages as protected. I don’t care much about the implementation language, but I’d frown upon anything “exotic” like OCaml, Smalltalk and similar, since it would require me to install their environment. Perl, Python and Ruby are all fine, and Java is too, since the thing would run just once a day and it’s not much of a slowdown to start the JVM for that. No X connection, though.

The second is slightly simpler and could be coupled with the one before: I send my database backups from the server to my GMail address, encrypted with GPG, compressed with bzip2, and then split into message-sized chunks. I need a way to download all the messages and reassemble the backups once a week, and store them on a flash card, using tar directly on the device as if it were a tape (not needing a filesystem should reduce the erase count). The email messages have the chunk number, the backup series (typo or bugzilla) and the backup date all encoded in the subject. Extra points if it can do something like Apple’s Time Machine: keep a backup per day for a week, per week for a month (or two), and then one per month for up to two years.
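Just to sketch the reassembly half of it, assuming subjects of the hypothetical form “backup <series> <date> <n>/<total>” and a single attachment per message (credentials and paths are placeholders too), the core could look like this:

    # Sketch of reassembling a split backup fetched over IMAP. It assumes
    # subjects like "backup typo 2009-04-02 3/12" and a single attachment
    # per message; credentials and paths are placeholders.
    import email
    import imaplib
    import re

    imap = imaplib.IMAP4_SSL("imap.gmail.com")
    imap.login("user@gmail.com", "password")
    imap.select("backups", readonly=True)

    typ, data = imap.search(None, 'SUBJECT "backup typo 2009-04-02"')
    chunks = {}
    for num in data[0].split():
        typ, msg_data = imap.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        index = int(re.search(r"(\d+)/\d+", msg["Subject"]).group(1))
        for part in msg.walk():
            if part.get_filename():             # the attached chunk
                chunks[index] = part.get_payload(decode=True)
    imap.logout()

    # Concatenate the chunks in order; the result would then be decrypted
    # with GPG and written out to the flash card with tar.
    with open("/tmp/backup-typo-2009-04-02.tar.bz2.gpg", "wb") as out:
        for index in sorted(chunks):
            out.write(chunks[index])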

So if somebody has the skill to complete these tasks and would be interested in seeing the guide expanded, well, just go for it!

My take on compression algorithms

(Photo: Biancospino – hawthorn)

I just read Klausman’s entry comparing compression algorithms, and while I’m no expert at all in the field of compression algorithms, I wanted to talk a bit about it myself, from a power user’s point of view.

Tobias’s benchmarks are quite interesting, although quite similar in nature to many others you can find out there comparing lzma to gzip and bzip2. One thing I found nice of him to make explicit is that lzma is good when you decompress more often than you compress. This is something a lot of people tend to skip over, with some quite catastrophic (in my view) results.

Keeping this in mind, you can see that lzma is not really good when you compress as many times as (or more than) you decompress. When that happens is the central point here. You certainly expect a backup system to compress a lot more than it decompresses, as you want to take daily (or more frequent) backups, but the hope is never to need to restore one of them. For Gentoo users, another place where they compress more than they decompress is man pages and documentation. They are compressed every time you merge something, but you don’t tend to read all the man pages and all the documentation every day. I’m sure most users don’t ever read most of the documentation that is compressed and installed. Additionally, lzma does not seem to perform as well on smaller files, so I don’t think it’s worth the extra time needed to compress the data.

One thing that Tobias’s benchmark has in common with the other lzma benchmarks I’ve seen is that it doesn’t take memory usage much into consideration. Alas, valgrind removed the massif graph that gave you the exact memory footprint of a process; it would have been quite interesting to see those graphs. I’d expect lzma to use a lot more memory than bzip2 to be so quick at decompression. This would make it particularly bad on older systems and in embedded use cases, where one might be interested in saving flash (or disk) space.

As for GNU’s choice of no longer providing bzip2 files, and just providing gzip- or lzma-compressed tarballs, I’m afraid the choice has been political as much as technical, if not more. Both zlib (for gzip) and bzip2 (with its libbz2) have very permissive licenses, and that makes them ideal even for proprietary software, or for free software with, in turn, permissive licenses like the BSD license. lzma-utils is still free software, but with a more restrictive license, LGPL-2.1.

While the LGPL still allows proprietary software to link the library dynamically, it is more restrictive, and will likely turn away some proprietary software developers. I suppose this is what the GNU project wants anyway, but I still find it a political choice, not a technical one. It also has an effect on users, as one has to either use the bigger gzip version or install lzma-utils as well to be able to prepare a GNU-like environment on a proprietary system, like for instance Solaris.

I’m sincerely not convinced by lzma myself. It takes way too much time during compression for me to find it useful for backups, which are my main compression task, and I’m uncertain about its memory use. The fact that bsdtar doesn’t yet support it directly is also a bit of a turn-off for me, as I’ve grown used to not needing three processes to extract a tarball. Doug’s concerns about the on-disk format also make it unlikely for me to start using it.

Sincerely, I’m more concerned with the age of tar itself. While there are ways to add to tar things it wasn’t originally designed for, the fact that to change an archive you have to fully decompress it and then re-compress it makes it pretty much impossible to use as a desktop compression method the way the rar, zip and ace (and, somewhat, 7z: as far as I can see you cannot remove a file from an archive) formats are used on Windows. I’ve always found it strange that the only widespread archive format supporting Unix information (permissions, symlinks and so on) is the one that was designed for magnetic tapes and is thus sequential by nature…

Well, being sequential probably makes it more interesting for backing up to a flash card (and I should be doing that, by the way), but I don’t see it as very useful for compressing a bunch of random files… Okay, one of the most common use cases for desktop compression has been compressing Microsoft Office’s hugely bloated files, and both OpenOffice and, as far as I know, newer Office versions use zip files to put their XML into, but I can still think of a couple of things I could use a desktop compression tool for from time to time…

Logins and GnuPG

This post will not touch on Gentoo at all in its topic, but these considerations do stem from a Gentoo-related problem. The problem relates to our infrastructure system (not that there are problems with Infra; it’s rather a problem of us developers), but let me explain.

We usually log in to the various infrastructure boxes, like dev.gentoo.org (webspace and more) and the CVS and Git servers, using SSH keypairs, without a password. This is basically a generic method to keep boxes secure while still allowing external access to them. But we also have a password set in LDAP that is used for mail and to set LDAP data. One of the things I find most useful as a non-recruiter developer is being able to look up the IM addresses of the devs who made them available to other devs. I use Jabber a lot, especially since it allows me to avoid IRC.

As we rarely use that password, you can expect that a good number of us forget it quite easily. I’ve already asked twice in three years for it to be reset (in my defense, it wasn’t even set the first time). Now, to get it reset we have to ask someone, like Robin, who has to do the work by hand. I’ve been wondering how this could be safely automated. We have SSH and PGP keypairs; they could just as well do the job.

This in turn made me wonder how safe some services’ logins are. I often forget passwords, so it happens that I ask for a replacement, it arrives in my mail, and then I trash the mail for safety. But what if the mail were encrypted with GnuPG? Then I’d need my keypair to decrypt it, and I could trust leaving it on the server. You could also use it to fight phishing: make the outgoing service mail GPG-signed.

I tried something like that before in PHP, but it wasn’t really simple because you either had to leave the secret key without a passphrase, or you had to hardcode the passphrase inside the source (or configuration) files, which is not a good idea.

Sincerely, I wonder if there is any software out there that uses GnuPG in a non-interactive way, besides simple scripts. Of the latter I have an example handy: the whole database of this blog, as well as that of the xine Bugzilla, is dumped every night, then compressed and encrypted with my GPG public key; the result is then sent directly to my email address, where I store it (I actually have to write another script to fetch one backup every week and write it off to a CF card, using tar directly on the device, without any filesystem; that should limit the erasing, since tar was designed for magnetic tapes, and the limitations are almost the same between the two).
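The dump script itself is nothing fancy; a minimal sketch of the pipeline (database name, key ID and addresses are placeholders, the chunk-splitting is left out, and only the public key is used, so no passphrase has to live on the server) would be:

    # Sketch of the nightly "dump, compress, encrypt, mail" pipeline.
    # Database name, GPG key ID and addresses are placeholders; splitting
    # into message-sized chunks is left out for brevity.
    import smtplib
    import subprocess
    from datetime import date
    from email.mime.application import MIMEApplication

    dump = subprocess.check_output(["mysqldump", "typo"])
    compressed = subprocess.run(
        ["bzip2", "-9"], input=dump, stdout=subprocess.PIPE, check=True).stdout
    encrypted = subprocess.run(
        ["gpg", "--batch", "--encrypt", "--recipient", "0xDEADBEEF"],
        input=compressed, stdout=subprocess.PIPE, check=True).stdout

    msg = MIMEApplication(encrypted)
    msg["Subject"] = "backup typo %s" % date.today().isoformat()
    msg["From"] = "backup@example.com"
    msg["To"] = "me@example.com"

    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)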

It would be quite nice if we could easily keep all the sensitive information encrypted on the mail server. Unfortunately, using GMail through the web interface ruins the whole idea. Luckily, they do offer IMAP and POP3, which make using GnuPG quite a bit friendlier.

Another webcomic… and nostalgia

Not nostalgia for Gentoo this time. Today, while reading my daily Irregular Webcomic strip, I saw the link to The Noob… I followed it mostly because I like to check out new webcomics from time to time; it’s something that relaxes me.

Well, I have since read all of it… I loved it, as I’m an ex-Ultima OnLine junkie. Yes, I kept this horrid secret inside me… well, not much of a secret, as you can easily see that I was for a while a member of the NoX-Wizard development team (where I met Luxor and Kheru… hi guys, I know you still read my blog ;) ), and it’s easy to connect my current nick with the LordDGP that used to be the forum admin for Dragons.it and Dragons’ Land… and from that to GM Unicorn is an easy step.

I have to say that Ultima OnLine helped me a lot; Dragons’ Land was a family to me when I was a young nerd with no friends, Dragons.it was where I first worked as a sysadmin (and a BOFH from time to time), and it allowed me to start looking at Apache, MySQL, PHP and other stuff which is now daily bread to me.

I’ve still got quite a few good friends that I met during those years (which were, for whoever’s curious, about seven years ago), and I have a lot of good memories… and it made me nostalgic, especially knowing that if it weren’t for Dragons’ Land, I probably wouldn’t be here… in many ways.

When I started this blog entry, I was hoping to find some old fragment of memory in my backups, but unfortunately the most likely backup set carrying the data I need is compressed with xar (1.2), and the current version (1.4) refuses to extract it. I’ll have to see if the code for 1.2 is still around and whether I can extract the data with that. Serves me right for trying an experimental new format rather than going with the old but gold tar.

This makes me wonder: do you guys usually try to consolidate backups? Taking very old backup CDs, copying all their content minus stuff like old programs you don’t need anymore, and then throwing away the old backups?