Archiving

One of the requests at the past VDD when I shown some VLC developers my Ruby-Elf toolsuite was for it to access archive files directly. This has been in my wishlist as well for a while, so I decided to start working on it. To be precise, I started writing the parser (and actually wrote almost all of it!) on the Eurostar that was bringing me to London from Paris.

Now you have to understand that while the Wikipedia page is a quite good source of documentation for the format itself, it’s not exactly complete (it doesn’t really have to be). But it’s mostly to the point: there are currently two main variants for ar files: the GNU and the BSD variants. And the only real difference between the two is in the way long filenames are handled. And with long filenames I mean filenames that are long at least 16 characters.

The GNU format handles this with a single index file that provides the names for all the files, and provide you with an offset instead of the proper name in the header data, whereas the BSD format provides you with a length, and prepend the filename to the actual file data. Which of the two options should be the best is well up for debate.

I already knew of the difference, so I did code in support for both variants, but of course while on the train I only had access to the GNU version of ar which is present in Binutils, so I only wrote a test for that. Now that I’m back at the office (temporarily in Los Angeles, as it seems like I’ll be moving to London soon enough), I have a Mac to my side and I decided to prepare the files for testing with its ar(1) which is supposedly BSD.

I say supposedly because something strange happened! The long filename code is hit by two of my testcases: one is using an actual object file which happens to have a name longer than 16 characters, the other is an explicit long (very long) filename — 86 characters! But what happens is that Apple’s version of ar writes the filename as 88 characters, padding it with two final null bytes. At first I thought I got something wrong in the format, but if I use bsdtar on Linux, which provide among other formats support for the bsd ar format, it writes down properly 86 bytes without any kind of null termination.

More interestingly, the other archive, where the filename is just 20 characters long, is written the exact same way by both libarchive and Apple’s ar!

Exit mobile version