Encoding iPod-compatible audiobooks with Free Software

Since in the last few days I’ve been able to rest also thanks to the new earphones I’ve finally been able to think again of multimedia as well as Gentoo. But just to preserve my sanity, and to make sure I do something I can reuse to rest even better, I decided to look into something new, and something that I would like to solve if I could. Generating iPod-compatible audibook files from the BBC Radio CDs I got.

The audiobook that you buy from the iTunes Store are usually downloaded in multiple files, one per CD of the original CD release, sometimes with chapter markings to be able to skip around. Unfortunately they also are DRM’d so analysing them is quite a bit of a mess, and I didn’t go to much extent to identify how that is achieved. The reason why I’d like to find, or document, the audibook format is a two-fold interoperability idea. The first part is being able to play iPod-compatible audiobooks with Free Software with the same chapter marking system working, and the other is (to me more concerning to be honest) being able to rip a CD and create a file with chapter markings that would work on the iPod properly. As it is, my Audiobooks section on the iPod is messed up because, for instance, each piece of The Hitchhiker’s Guide To The Galaxy, which is a different track on CD, gets a single file, which is thus a single entry in the Audiobooks series. To deal with that I had to create playlists for the various phases, and play them from there. Slightly suboptimal, although it works.

Now, the idea would be to be able to rip a CD (or part of a CD) in a single M4B file, audiobook-style, and add chapter markings with the tracks’ names to make the thing playable and browsable properly. Doing so with just Free Software would be the idea. Being able to have a single file for multiple CDs would also be of help. The reason why I’m willing to spend time on this rather than just using the playlists is that it seems to me like the battery of the iPod gets consumed much sooner when using multiple files, probably because it has to seek around to find them, while a single file would be loaded incrementally without spending too much time.

In this post I really don’t have much in term of ideas about implementation; I know the first thing I have to do is to find a EAC -style ripper for Linux, based on either standard cdparanoia or libcdio’s version. For those who didn’t understand my last sentence, if I recall correctly, EAC can also produce a single lossless audio file, and a CUE file where the track names are timecoded, instead of splitting the CD in multiple files per track. Starting from such a file would be optimal, since we’d just need to encode it in AAC to have the audio track of the audiobook file.

What I need to find is how the chapter information is encoded in the final file. This wouldn’t be too difficult, since the MP4 format has quite a few implementations and I already have worked on it before. The problem is that, being DRM’d, analysing the Audiobooks themselves is not the best idea. Luckily, I remembered that there is one BBC podcast that provides an MP4 file with chapter markings: Best of Chris Moyles Enhanced which most likely use the same feature. Unfortunately, the mp4dump utility provided by mpeg4ip fails to dump that file, which means that either the file is corrupt (and how does iTunes play that?) or the utility is not perfect (much more likely).

So this brings me back to something I was thinking about before, the fact that we have no GPL-compatible MP4-specific library to handle parsing and writing of MP4 files. The reason for this is most likely the fact that the standards don’t come cheap, and that most Free Software activists in the multimedia area tend to think that Xiph is always the answer (I disagree), while the pragmatic side of the multimedia area would just use Matroska (which I admit is probably my second best choice, if it was supported by actual devices). And again, please don’t tell me about Sandisk players and other flash-based stuff. I don’t want flash-based stuff! I have more than 50GB of stuff on my iPod!

Back to our discussion, I’m going to need to find or write some tool to inspect MP4 files, I don’t want to fix mpeg4ip because of MPL license it’s released under, and I also think the whole thing is quite overengineered. Unfortunately this does not really help me much since I don’t have the full specs of the format handy, and I’ll have to do a lot of guessing to get it to work. On the other hand, this should be quite an interesting project, for as soon as I have time. If you have pointers or are interested in this idea, feel free to chime in.