Free Idea: Free Software stack for audiobooks

This post is part of a series of free ideas that I’m posting on my blog in the hope that someone with more time can implement them. It’s effectively a very sketchy proposal that comes with no design attached, but if you have time you would like to spend learning something new and no idea what to do with it, it may be a good fit for you.

This is clearly not a new idea, as I posted about something very similar over eight years ago. At the time I was looking for a way of encoding audiobooks ripped from audio CDs in a format that was compatible with the iPod Classic. Since then, Apple appears to have done their best to make the audiobook experience on iOS the worst possible, to the point that I don’t really use my iPod Touch as my primary audiobook player any more.

As an aside to the free idea, which can probably give a bit more context for you all, let me describe the problems I have with the current approach to audiobooks by Apple. A few iOS major versions ago, they decided to move the audiobooks handling from the Music app to the iBooks app; this would be reasonable, given that they are books, and it was always a bit strange to have them in a separate application, but it also meant you lost the ability to build playlists with them.

Playlists with audiobooks are great, because they allow you to “stitch” together multiple books of the same series, so that you can play them for hours on end, for instance if you need them to sleep. I used to have a playlist for the Hitchhiker’s Guide to the Galaxy radio series and one for the books, one for Dresden Files, and one for the News Quiz, including both the collected editions on CD by the BBC, my own “audiobooks” built out of the podcasts, and the more recent podcast episodes that I have not collected into audiobook files yet.

So what is the idea? There are two components that, as far as I can see, are currently heavily lacking in the FLOSS world. The first is a way to generate audiobook files, which is what I complained about eight years ago. Indeed, if you look even at a random sample on Project Gutenberg, the audiobook is actually a ton of files (47!), each with a chapter in it. A proper audiobook file would be a single file, with chapter markers and per-chapter metadata (chapter title and, in that case, the performer).
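To make the “single file with chapter markers” part a bit more concrete, here is a minimal sketch that builds a chapter list in the plain-text metadata format FFmpeg can read (`;FFMETADATA1` with `[CHAPTER]` sections). The function name, the tuple layout and the durations are made up for illustration, and whether a given container or player keeps the per-chapter `artist` tag is something I have not checked.

```python
from typing import List, Tuple


def _escape(value: str) -> str:
    """Backslash-escape the characters the ffmetadata format treats as special."""
    out = []
    for ch in value:
        if ch in "=;#\\\n":
            out.append("\\")
        out.append(ch)
    return "".join(out)


def build_ffmetadata(book_title: str, chapters: List[Tuple[str, str, int]]) -> str:
    """Return ffmetadata text with one [CHAPTER] section per chapter.

    Each chapter is (title, performer, duration_ms). This tuple layout is
    an assumption for the sketch, not an existing tool's input format.
    Start and end times are accumulated so chapters sit back to back.
    """
    lines = [";FFMETADATA1", "title=" + _escape(book_title)]
    start = 0
    for title, performer, duration_ms in chapters:
        end = start + duration_ms
        lines += [
            "",
            "[CHAPTER]",
            "TIMEBASE=1/1000",  # START/END below are in milliseconds
            "START=%d" % start,
            "END=%d" % end,
            "title=" + _escape(title),
            "artist=" + _escape(performer),
        ]
        start = end
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_ffmetadata("Example Book", [("Chapter 1", "Reader A", 600000),
                                             ("Chapter 2", "Reader B", 450000)]))
```

The resulting text could then be handed to FFmpeg as a second input when muxing the concatenated audio (its `-map_metadata` and `-map_chapters` options are the relevant ones), but the exact invocation depends on the codec and container you pick.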

It’s more than just a matter of having a single file to move around. While of course hardware improvements have made a number of these points moot, the original reason to have a single big file over multiple small files was to avoid having to seek to a different point on the disk in between chapters. It also allows the decoder to keep going between chapters, as there is no “end of stream” but rather just a marker that at a given point in time some different metadata applies. Again, as I said, this is no longer as relevant as it used to be, but it’s also not entirely gone.

The other component that is currently lacking is a good playback solution. While VLC can obviously play those files right now, and if I’m not mistaken it also extracts the per-chapter metadata correctly, it lacks two features that make enjoying audiobooks possible. The first is possibly complicated, and relates to the ability to store bookmarks and the current playback position. While VLC supposedly supports resuming from the last playback position, I have heard it’s still sometimes unreliable (I have no idea how it’s implemented), plus it does not support just bookmarking a given time in a file/book. Bookmarking is particularly important when listening to non-novel audiobooks, as you may want to go back to a passage afterwards, to re-listen to advice or to note a reference to further details.
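As a rough illustration of the state such a player would need to keep, here is a minimal sketch of a per-book store for the resume position and a list of bookmarks, saved as JSON. Everything here (the class name, the file layout, using a plain string as book identifier) is hypothetical; it is not how VLC or any other player does it.

```python
import json
from pathlib import Path


class BookmarkStore:
    """Keeps the last position and named bookmarks per book in one JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previous state if the file already exists.
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def _entry(self, book_id):
        return self.data.setdefault(book_id, {"position": 0.0, "bookmarks": []})

    def save_position(self, book_id, seconds):
        self._entry(book_id)["position"] = float(seconds)
        self._flush()

    def add_bookmark(self, book_id, seconds, note=""):
        self._entry(book_id)["bookmarks"].append({"time": float(seconds), "note": note})
        self._flush()

    def position(self, book_id):
        return self.data.get(book_id, {}).get("position", 0.0)

    def bookmarks(self, book_id):
        return list(self.data.get(book_id, {}).get("bookmarks", []))

    def _flush(self):
        # Write to a temporary file first, then rename it over the real one,
        # so an interrupted write does not leave a truncated state file.
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps(self.data, indent=2))
        tmp.replace(self.path)
```

A player would call `save_position()` periodically and on pause, and `add_bookmark()` from a UI button; the interesting design question is how to identify a book reliably when the file gets moved or re-encoded, which this sketch does not try to answer.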

The other feature is mostly a UI matter, chiefly in the mobile UI (at least the Android one): the ability to skip backward and forward in the file. You have probably seen this in other players, including Netflix’s own app, which let you skip back 30 seconds; in audiobooks it’s also useful to skip forward 30 seconds, particularly when considering the bookmarks above.
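The logic behind the skip buttons is tiny; the work is all in the UI. A minimal sketch, with a made-up function name and the 30-second step as a constant:

```python
SKIP_SECONDS = 30.0  # step used by the hypothetical back/forward buttons


def skip(position: float, delta: float, duration: float) -> float:
    """Return the new playback position, clamped to the bounds of the file."""
    return max(0.0, min(duration, position + delta))


if __name__ == "__main__":
    print(skip(50.0, -SKIP_SECONDS, 3600.0))  # skip back from 0:50
    print(skip(50.0, SKIP_SECONDS, 3600.0))   # skip forward from 0:50
```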

As usual for Free Ideas I have no time to work on this myself. I can give the idea details out, and depending on things I may be able to contribute to a bounty on it, but otherwise, no code I can share about this yet.


This past weekend I had the honor of hosting the VideoLAN Dev Days 2014 in Dublin, in the headquarters of my employer. This is the first time I have organized a conference (or rather helped organize it; Audrey and our staff did most of the heavy lifting), and I made a number of mistakes, but I think I can learn from them and do better the next time I try something like this.


Organizing an event in Dublin has some interesting and not-obvious drawbacks, one of which is the need for a proper visa for people who reside in Europe but are not EEA citizens, thanks to the fact that Ireland is not part of Schengen. I was expecting at least UK residents not to need any scrutiny, but Derek proved me wrong as he had to get an (easy) visa at entrance.

Getting just shy of a hundred people into a city like Dublin, which is by far not a metropolis like Paris or London, is an interesting exercise. Yes, we had the space for the conference itself, but finding hotels and restaurants for that number of people became tricky. A very positive shout-out is due to Yamamori Sushi, which hosted the whole of us without a fixed menu and without a hitch.

As usual, meeting in person with the people you work with in open source is a perfect way to improve collaboration: knowing how people behave face to face makes it easier to understand their behaviour online, which is especially useful if their attitudes can be a bit grating online. And given that many people, including me, are known as proponents of Troll-Driven Development (or Rant-Driven Development, given that people like Anon, redditors and 4channers have given an even worse connotation to Troll), it’s really a requirement if you are really interested in being part of the community.

This time around, I was even able to stop myself from gathering too much swag! I decided not to pick up a hoodie, and leave it to people who would actually use it, although I did pick up a Gandi VLC shirt. I hope I’ll be able to do that at LISA as I’m bound there too, and last year I came back with way too many shirts and other swag.

Afterthoughts on the VideoLan Dev Days and FOMS 2012

I’ve spent the first weekend of September in Paris, as j-b organized the yearly VideoLan Dev Days un-conference. I’m happy to have been there, because it was definitely great to be around all the friends and colleagues, and work on libav and discuss what we also need to work on.

What hasn’t been entirely that great, unsurprisingly, has been being around the Google people, both at VDD and at FOMS. The main problem with them is that they are never there to think that they can learn from the others, or that they can be wrong; the feeling that I and others got is that they came with all the answers and we all have to accept them and their options. This is obviously just a generalization: Andrew and Dale have been very pleasant to speak to, outside of the infamous talk which boiled down to “We can’t, or don’t want to, speak about this, but let us tell you again how nice we are for considering using your software and saying we won’t”.

Honestly, I’m surprised that Chrome works at all with the kind of attitude they have — I guess the answer is that the non-media people are saner or, simply, they know what they are doing, instead of just thinking they know what they are doing.

At any rate, at least one good thing is coming out of this: also thanks to Hanno, who pointed me to harvester, we’ll soon have a Planet Multimedia to aggregate blog feeds for people working on open multimedia projects, no matter what the project!

More posts on the various topics might come up depending on how much time I have for blogging over the next few weeks.

RTSP clients’ special hell

This week, in Orvieto, Italy, there was OOoCon 2009 and the lscube team (also known as “the rest of the feng developers besides me”) was there to handle the live audio/video streaming.

During the preparations, Luca called me one morning, complaining that the new RTSP parser in feng (which I wrote almost single-handedly) refused to play nice with the VLC version shipped with Ubuntu 9.04. The problem was tracked down to the parser for the Range header, in particular the parsing of the normal play time (NPT) value: following the RFC, I’m expecting a decimal value with a dot (.) as the separator, but VLC is sending a comma (,), which my parser refuses.

Given that Luca actually woke me up while I was in bed, it was a strange presence of mind that let me ask him which language (locale) the system was set to: Italian. Telling him to try using the C locale was enough to get VLC to comply with the protocol. The problem here is that the separators for decimal places and thousands are locale-dependent characters, while most programming languages obviously limit themselves to supporting the dot, and a lot of software likewise uses the dot no matter what the locale is (for instance, right now I have Transmission open and the download/upload stats use the dot, even though my system is configured in Italian). Funny that this problem came up during an OpenOffice event, given that it’s definitely one of the best-known pieces of software that actually relies on (and sometimes messes up) that difference.

To be precise, though, the problem here is not with VLC by itself: the problem is with the live555 library (badly named media-plugins/live in Gentoo), which provides the generic RTSP code for VLC (and MPlayer). If you ever wrote software that dealt with float to string conversion you probably know that the standard printf()-like interface sticks to the dot only as long as the program never calls setlocale(); once it does, the output follows the locale’s decimal separator. live555 is a C++ library, though, so it may well go through string streams instead.

At any rate, the bug was known and already fixed in live555; the fixed version is what Gentoo already has, and what the contributed bundled libraries of VLC have (for the Windows and OS X builds), so those three VLC instances are just fine. The problem is still present in both the Debian and Ubuntu versions of the package, though, which are quite outdated (as xtophe confirmed). Since the RFC does not have any conflicting use of the comma in that particular place, and given how widespread the broken package is (Ubuntu 9.10 also has the same problem), we decided to work around it inside the feng parser, accepting the comma-separated decimal value as well.
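The workaround boils down to treating the comma as a second decimal separator when reading the NPT value. feng itself is written in C and I am not reproducing its parser here; what follows is only a Python sketch of the idea, with made-up function names, covering the plain-seconds and hh:mm:ss forms and leaving out `now` and any extra Range parameters.

```python
import re
from typing import Optional, Tuple

# Seconds with an optional fraction; "." is what the RFC asks for,
# "," is tolerated for clients that format numbers in the user's locale.
_SECONDS = re.compile(r"^([0-9]+)(?:[.,]([0-9]*))?$")


def parse_npt_time(text: str) -> float:
    """Parse 'SS[.fff]' or 'HH:MM:SS[.fff]' into seconds."""
    parts = text.strip().split(":")
    if len(parts) not in (1, 3):
        raise ValueError("bad npt time: %r" % text)
    match = _SECONDS.match(parts[-1])
    if match is None:
        raise ValueError("bad npt seconds: %r" % text)
    seconds = float(match.group(1) + "." + (match.group(2) or "0"))
    if len(parts) == 3:
        if not (parts[0].isdigit() and parts[1].isdigit()):
            raise ValueError("bad npt time: %r" % text)
        seconds += int(parts[0]) * 3600 + int(parts[1]) * 60
    return seconds


def parse_npt_range(value: str) -> Tuple[Optional[float], Optional[float]]:
    """Parse a Range header value such as 'npt=12.5-' into (start, end)."""
    unit, sep, spec = value.strip().partition("=")
    if sep != "=" or unit.strip().lower() != "npt":
        raise ValueError("not an npt range: %r" % value)
    start, dash, end = spec.partition("-")
    start, end = start.strip(), end.strip()
    if dash != "-" or (not start and not end):
        raise ValueError("bad npt range: %r" % value)
    return (parse_npt_time(start) if start else None,
            parse_npt_time(end) if end else None)


if __name__ == "__main__":
    print(parse_npt_range("npt=12.5-"))   # what the RFC describes
    print(parse_npt_range("npt=12,5-"))   # what the broken clients send
```

Accepting the comma is safe only because nothing else in that position uses it; a stricter server could log a warning whenever the tolerant branch is taken.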

From this situation, I also ended up comparing the various RTSP clients that we are trying to work with, and the results are quite mixed, which is somewhat worrisome to me:

  • latest VLC builds for proprietary operating systems work fine (Windows and OS X);
  • VLC as compiled in Gentoo also works fine, thanks Alexis!
  • VLC as packaged for Debian (and Ubuntu) uses a very old live555 library; the problem described here is now worked around, but I’m pretty sure it’s not the only one that we’re going to hit in the future, so it’s not a good thing that the Debian live555 packaging is so old;
  • VLC as packaged in Fedora fails in many different ways: it goes in a loop for about 15 minutes saying that it cannot identify the host’s IP address, then it finally seems to be able to get a clue, so it’s able to request the connection but… it starts dropping frames, saying that it cannot decode and stuff like that (I’m connected over gigabit LAN);
  • Apple’s QuickTime X is somewhat strange; on Merrimac, since I used it to test the HTTP tunnel implementation it now only tries connecting to feng via HTTP rather than using RTSP; this works fine with the branch that implements it but fails badly in master obviously (and it doesn’t look like QuickTime gets the hint of changing to RTSP protocol); on the other hand it works fine on the laptop (that has never used the tunnel in the first place), where it uses RTSP properly;
  • again Apple’s QuickTime, this time on Windows, seems to be working fine.

I’m probably going to have to check the VLC/live packaging of other distributions to see how many workarounds for broken stuff we might have to look out for. Which means more and more virtual machines; I’ll probably have to get one more hard drive at this pace (or I could probably replace one 320G drive with a 500G drive that I still have at home…). And I should try totem as well.

Definitely, RTSP clients are a hell of a thing to test.

Server Software Design

When I started to help Luca with feng I hadn’t worked on server software for many years; while server software was what I started working on when I started seriously working on collaborative free software projects, it really was much lower profile, yet I see that some of the things I did there seem to reflect here.

One of these is my (bad) habit of rewriting lots of parts together when I see that something does not really comply with the specifications; in this case, while trying to make the parser more robust, I found that the whole thing worked by sheer luck and started rewriting it again. The nice part is that in two days I wrote (again) the whole RTSP parser (before, I had only rewritten the actual parser but not the logic that reads the data), with far fewer lines of code, with per-client worker threads, with SCTP support and so on.

The problem now is, though, that we haven’t been able to release a stable 2.0 release yet, and since 2.0 was supposedly still using bufferpool, we’re probably just going to skip it and go with 2.1; on the other hand, the worker threads are still experimental and should not be in 2.1 either, but rather in 2.2 (I need the new parser to implement the proxy-passthrough used by QuickTime and QTSS, that encapsulates RTSP and RTP over HTTP). So the worker threads will be postponed and looked at in the future; on the other hand, the new parser was also designed to solve a series of security concerns I had with the previous parser code.

Even more complex, the message parser I wrote as my first rewrite of the request parsing and handling was split off in a separate library (related to this earlier idea I had), but the new parser is much more tied to feng’s code and cannot be easily shared directly, so the library has to go away. But does it make any sense to have a 2.1 release using a new library that the 2.2 release will blow away? Nope, so there I go merging it back.

And since I’m at that point, and merging is hard… I’m basically working on making sure that all the non-feature, non-rewrite changes that we got are merged in for 2.1, so that the walk toward 2.2 is a cakewalk. Not an easy thing, but I should be able to pull it off without an overly complex effort.

The other problem is implementing IPC between two components of the lscube stack, flux and feng, to provide live streaming support. Now this is trickier because we’re actually mixing different programming languages: flux is written in C++ while feng is written in good old C. The idea for now is to use POSIX message queues to send data from one to the other, but here comes the problem: earlier the IPC mechanism was provided by the (pretty broken) bufferpool library, and thus lived outside of both feng and flux; now that library is no longer present, which means we’ll have to find another way to deal with it. I’d probably look into installing some development headers and libraries for feng.

I also have to design a stress-test suite to make sure that there are no buffer overflows in the old parser, which is not an easy thing to do actually; thinking outside of the box, which is what is needed for security testing, is not exactly what I do to get a standard-compliant server.

Oh yeah, and there is the problem of standards. While rewriting the seeking support for feng with Luca, we discovered that either live555 or VLC is pretty stupid when it comes to seeking, and sends multiple PLAY requests instead of a series of PAUSE/PLAY pairs, which in RTSP actually has a very different meaning, as multiple PLAY requests create an edit list.
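For reference, this is roughly what a seek looks like on the wire when done as a PAUSE/PLAY pair, as I read RFC 2326 (a PLAY that arrives while another PLAY is still active gets queued behind it, which is where the edit list comes from). The helper names, URL and session ID below are invented for the sketch; this is not code from feng, live555 or VLC.

```python
from typing import Iterable, List


def rtsp_request(method: str, url: str, cseq: int, session: str,
                 extra_headers: Iterable[str] = ()) -> str:
    """Format a bare RTSP/1.0 request with CSeq and Session headers."""
    lines = ["%s %s RTSP/1.0" % (method, url),
             "CSeq: %d" % cseq,
             "Session: %s" % session]
    lines.extend(extra_headers)
    # An empty line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


def seek_requests(url: str, session: str, cseq: int, npt_seconds: float) -> List[str]:
    """Return the PAUSE + PLAY pair a client would send to seek."""
    pause = rtsp_request("PAUSE", url, cseq, session)
    # %.3f always formats with a dot, whatever the process locale is.
    play = rtsp_request("PLAY", url, cseq + 1, session,
                        ["Range: npt=%.3f-" % npt_seconds])
    return [pause, play]


if __name__ == "__main__":
    for request in seek_requests("rtsp://example.com/stream", "12345678", 5, 42.5):
        print(request)
```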

And all that I described up to now only relates to RTSP/1.0, not even to extensions (besides the QT/QTSS one I noted before), nor to RTSP/2.0. Seems like we still have a long road ahead before even completing the basic features, or being standard compliant (did I say that we also don’t send a Connection: Close header when we’re going to close the RTSP connection?). Alas.

On the other hand I have to say that the experience is proving very interesting, and I guess that once the server part is decently ready and we can start focusing on the client part, I might be able to get xine into a state where the RTSP code not only stops turning into a security mess every other year, but also actually works with more than just Real’s own servers.

The big harddisk D’OH!

Seems like my luck with harddisks lately is bad; serves me right for buying Maxtor, I suppose. Today while building GCC on Klothos (the Ultra5), I hit some (many) bad blocks in the middle of it, which is kinda bad… I remembered smartmontools reporting something about this before, but I remembered just 4 blocks being unreadable; now it’s more like 40, which means the disk is degrading.

Unfortunately, all the spares I’m left with at home are the two SATA drives I didn’t put in Farragut, but while I do have a SATA PCI controller, and the Ultra5 has space for it, I’m told it won’t boot from that… so I’m stuck again waiting for some third-hand parts to start the box up again, unless I use another box for netbooting.

Sigh, I was also going to test and keyword my usual stuff (zile, zsh, sudo, unieject I did already but for instance I didn’t try nopaste yet), waiting for the nullmodem cable to start trying to get the alignment bug in the kernel. Now I’ll have to wait for a while for that too.

I think I’m pretty depressed: I spend most of my days lately waiting for more specs and requirements from my employers. I’m trying to find something that’s both useful to others and relaxing for me, but that keeps getting harder, especially considering I still receive hatemail for the XMMS removal (but nobody seems to consider the idea of doing anything constructive; even the upstream developers seem to be just a bunch of wankers, reading their comments… too bad they didn’t try to take over GTK+1 yet), and that I’m tired of hearing comments like “you shouldn’t be that harsh about it” from fellow developers (who also never volunteered to maintain the thing before). Things have been growing boring and bothering me lately, but I suppose that’s just a reflection of the whole XMMS thing, and that it will be better next week after the thing is eradicated from the tree altogether.

On the other hand, VLC 0.8.6_beta2 (upstream’s -test2a) is in portage, this time keyworded ~x86-fbsd too. The nsplugin thing has still to be reviewed, as the way it’s handled changed upstream, and my patch does not make much sense anymore, but I don’t feel like tampering with that at the moment.

I also finally added kdnssd-avahi to the tree, but that will require a re-keywording of kdelibs that will take some time (I’m currently building the stuff on Prakesh for ~x86-fbsd keywording, if it works), so don’t expect its effect to apply that soon.

Sigh, I suppose I could use a little break, and a fresh start next week, maybe by changing something in my room… or I can try to change style for the windows and the wallpapers.. it’s two months I have NASA photos as wallpaper on the two screens.

I’ve joined the Gentoo/FreeBSD/SPARC64 team

Or I’m just about to join it. Yesterday the Ultra 5 I ordered on eBay arrived.. took me a while to clean up the polystyrene, but after that it was simple to test it out.

The box has a sexy look, I have to admit that much, but as I expected its disk is tiny (8.6 GB), so I replaced it with the one from Odissey (sorry, this ended up removing my installation of Syllable, but I’ll try to get a new disk for that box sooner or later). For one thing and another, yesterday I didn’t have time to continue and start installing the box, so I started today, although I had a couple of problems. First, the NVRAM had the CD-ROM device set on IDE Secondary Master, while for me it was Primary Slave. Then the original CD reader (an LG CRD-8322b, an interesting model for me because it was my first spare part: I bought the same CD reader for my Pentium 133 when the original one broke) didn’t support CD-RW, and I mostly use CD-RWs for LiveCDs, install CDs and so on, so I also took the DVD reader from Odissey.

Now I’m building the kernel, but before I can actually start working on the kernel bug, I need a nullmodem cable. Unfortunately I never had one, and the shops around here don’t have them anymore, and of course none of my serial cables are the kind you can open and solder as you need. I’ll probably go to the nearest components shop to buy the cable and the connectors, and then solder it myself; the do-it-yourself approach will also amuse me, as I like doing stuff with the soldering iron.

I’m impressed with this Ultra5 box, really. The chassis is sleek and easy to fit even in small spaces, the support for serial consoles is something I miss on the other boxes, OpenBoot is an interesting tool too, surely better than the classic PC BIOS (I’m curious to look at EFI to be honest), and even to replace the internal parts, it’s easier than with some old IBM box I used to tinker with :/ The Sun keyboard is pretty impressive too, even if it’s old, it has good typing, better than my previous keyboard, and the presence of the extra keys to the left is something that just now keyboard producers learnt from.

Considering the coolness of this box, naming it after a Klingon ship (because it’s something I bought second hand, it’s not a Federation ship ;) ) seems to be the only choice, so it’s now Klothos.

I’m not really sure which kind of memory I should put in that box though, or I would have tried adding something to it, even after the bad experience with Odissey (I’m now using the burnt memory module as a bookmark for The Firm – thanks Tony!).

Oh yeah, thanks to the VideoLan guys, I’m now preparing the ebuild for VLC 0.8.6-test2a, they are always nice guys when it comes to release :)

Cannot sleep…

… well, that’s not news: everybody who followed my blog in the past months knows that I suffer from insomnia from time to time. Tonight I wasn’t expecting it, tho.

So, as I’m here, I want to at least get a bit of updates on what I’m doing and why :P

First of all, last night new minor versions of libtorrent and rtorrent were released; you can already find 0.10 and 0.6 in portage, and they seem to work well too. After this release, I also asked for stable on the versions, and deleted all the other lower versions but the current stable, so that I can reduce the number of ebuilds in CVS and at the same time provide suitable stable ebuilds (rtorrent and libtorrent have proved completely stable for me up to now).

Today I also revbumped (again, I know) vlc to fix a problem with the newly-readded nsplugin, now it should work fine. I also took the time with that to change one “little not so little” thing: FFmpeg now is mandatory, no more ffmpeg useflag, as VLC is kinda pointless without that. Also the next time I’ll be bumping xine-lib I’ll make external FFmpeg mandatory, as it’s proving stable and I can make sure the version in portage is always working with the xine-lib version we have.

You might also see that I’m trying to fix dependencies on virtual/x11 to ‘<virtual/x11-7’, as suggested by Jakub, to try to avoid problems when users have virtual/x11-7 merged (luckily, I don’t ;).

On the Gentoo/FreeBSD front, I gave up on dbus for now; I’ll work on that when I have more time. I also gave up on lua because the current versions are evil and they have been in mask for soooo long that I don’t think it’s worth spending time on that. I keyworded Firefox 1.5 and Seamonkey tho (and is it just me who finds the monolithic Seamonkey more performant than Firefox, which was created to be faster than the old monolithic Mozilla? If I’m not the only one, I can start to say Mozilla people screwed up big time with that).

The ruby-hunspell bindings are working, but I haven’t written any documentation about them yet so they are still lingering around in GIT rather than being released. I hope to do so soon. I should also update the website, and that’s a bit annoying from my point of view.

I’m thinking of updating the stage for Gentoo/FreeBSD too, as there are quite a few things that changed, starting from GCC and Binutils versions, that are now respectively 4.1.1 and 2.17. I should really do that.
But I’m having trouble with the crosscompiler right now and this ain’t good at all, so I have to resolve that first (so that building the stage won’t take two days).

Last thing I’ll say is that next month I’ll probably take two weeks off, for basically the first time since I joined Gentoo last year! Not a big deal, no holidays, nothing really relaxing: I’ll just go to stay at my sister’s place for two weeks while her husband is out of town for his job, but I’ll be unavailable for the two weeks as I’m not going to have a net connection nor anything else. I’ll be reading my books, and trying to apply the Qigong exercises I read about, which are not easy to actually do daily as long as I’m at home.

And there it is…

Okay, I’ve committed vlc 0.8.5-r4 in Portage, with the restored nsplugin useflag and a new seamonkey useflag. By default it links against Firefox, and it’s done with it.
If you enable the seamonkey useflag, it links against Seamonkey, of course. The patch I’ve prepared does not apply to the current SVN code, but I do have one that applies; I just need to clean it up a bit and then send it upstream (I already showed it to xtophe on #videolan).

Now, just to let people know, begins the hard part: I’m not a Mozilla user, I use Konqueror daily, I use Camino on the iBook rarely, but mostly I use Safari there too (when I’m on OSX, that is, else I use Konqueror), and sometimes I use Opera if I need 32-bit support.

This means that I won’t be able to help if there are problems with the plugin itself; those will probably be marked as UPSTREAM, either for VLC or for Mozilla, or I will ask someone else to co-maintain it. So if any dev would like to help out with VLC, I’m all ready to hand it over, even entirely if needed ;)

And now to proceed with my Radeon adventure, this box seems to have now –12°C when compared to the other day, and I’d be ready to bet the card change is responsible. Not a bad thing at all!

Okay, let’s see what else I have on my TO DO list for tonight. Sigh.

Working on getting VLC ready…

… for Firefox 1.5, that is.

So, I have to thank Patrick (bonsaikitten) a lot again for allowing me to use his new server to work on this, as I didn’t want to merge Firefox and Seamonkey on my box :) I’m currently working in a screen session with emacs and the internal terminal; it’s a bit of a mess, and I can’t use colours, but it’s fine for what I have to do.

The first thing I want to say, is that it’s not an easy fix. The current code that checks for Mozilla support in VLC is quite complex, as it can be done either automatically, if mozilla-config command is found, or by providing a manual path to the Mozilla SDK. Unfortunately gecko-sdk is now obsolete, and thus it’s quite a problem.

What I had to do is to add a third option, which uses pkg-config to identify the paths and the commands. Unfortunately this is not as easy to do as it is to say, especially since I’m not a Mozilla expert.
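The real check lives in VLC’s autoconf scripts, so take the following only as a sketch of the logic in Python: ask pkg-config for each candidate package in turn and keep the first one it knows about. The package names (`firefox-plugin`, `seamonkey-plugin`) are assumptions on my part about what the .pc files are called, and the function names are made up.

```python
import shlex
import subprocess
from typing import Callable, Dict, List, Optional, Sequence


def _run(cmd: Sequence[str]) -> Optional[str]:
    """Run a command and return its stdout, or None if it fails or is missing."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout


def find_plugin_flags(candidates: Sequence[str],
                      run: Callable[[Sequence[str]], Optional[str]] = _run
                      ) -> Optional[Dict[str, object]]:
    """Return compile and link flags for the first package pkg-config knows."""
    for package in candidates:
        cflags = run(["pkg-config", "--cflags", package])
        libs = run(["pkg-config", "--libs", package])
        if cflags is not None and libs is not None:
            return {"package": package,
                    "cflags": shlex.split(cflags),
                    "libs": shlex.split(libs)}
    return None


if __name__ == "__main__":
    # Hypothetical package names; adjust to whatever .pc files are installed.
    print(find_plugin_flags(["firefox-plugin", "seamonkey-plugin"]))
```

The `run` parameter is there only so the lookup can be exercised without pkg-config installed; in a build system this whole thing would be a few lines of m4 instead.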

Apart from that, the time required for autoretooling (don’t ask me, but VLC does not like to just run autoconf) and then to rebuild half of its code to tell me if the plugin can be built isn’t that short, even when using the remote box, so it does take much of my time away.

Now, while working on this I ended up talking with Darren Salt of gxine, so it’s likely I’ll be working on that too, to make it work with Firefox 1.5 and Seamonkey, which at the end of the day makes this conversion quite a big deal for me, and I’m not even a Mozilla user ;)

On the other hand, another little glitch with the new video card and Radeon driver: if there is one application using an Xv port (like Kaffeine closed and in the tray icon), I cannot open it with something else, be it another xine or tvtime. I think I’ll investigate that further in the next days.

Oh, while I was writing this entry VLC finally finished building, and yes, I do have the plugin linking against Firefox 1.5, yai! Next step is trying to get it to build against seamonkey.