Not happy about audio players

I’m really not happy with the state of audio players in Linux; not at all. And this is not for missing support of mp3 or AAC files, that is not a problem in Gentoo since we don’t really end up in the fight over patents, as a source distribution, but rather for the fact that each player I’ve tried lately has one or more problems that make it very unfriendly to me.

Starting with mpd, that can be a mess to deal with on systems where PulseAudio is used, because of the privilege separation (that I find quite silly to be honest — I understand if you run it on just a server that it’ll have to run under a particular user, but if I want to run it on a desktop, why cannot it run as my user?), continuing with Rhythmbox (that does not work when shuffling songs) and finishing with Banshee, cmus and mocp (no, a file with m4a extension does not have to be AAC-encoded!).

I’m not really sure what’s up with Rhythmbox: if I set it on the “Music” entry, set shuffle and play, at the end of a song it notifies me of the next one but does not start playing it, and I have to explicitly request a new one or pause/play it (which is quite a problem given I use music to avoid burning out).

What really upsets me, though, are the three players that don’t play my ALAC (Apple Lossless Audio Codec) files. Oh yes, now you can blame me for using some patent-ridden codec (although I don’t think ALAC has any patent on it, since it’s pretty similar to FLAC; there are no encoders for it on Linux yet, that’s true, but it’s the only lossless format that I can play on my iPod, and I like my music lossless). The point is, GStreamer can play the files fine, although with a catch: the mp4 demuxer that can open m4a files is in the gst-faac package, while the ALAC decoder comes with gst-ffmpeg. Rhythmbox can also play them correctly; Banshee, on the other hand, reports them as not valid because it expects m4a files to only carry AAC streams. Similarly, cmus claims to support mp4 files but only supports m4a/AAC; it really lacks an ALAC decoder, so I cannot blame it too much (it’s also the one I liked best). Finally, mocp upset me as much as Banshee: it uses FFmpeg for decoding, which means it can decode ALAC files… but it also expects the file to contain an AAC stream.

Now I can understand that for the average Joe user (especially one with a Windows background) the extension of a file declares its content, but I’d expect developers of audio players to have a clue about multimedia file formats. Although I should have known better, given that for about two years the shared mime database has reported wrong mime types.
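To make the point concrete, here is a minimal sketch of asking the file itself what it carries instead of trusting the extension. It is not taken from any of the players above; the function names are mine, and it assumes a plain MP4 layout where the sample description (stsd) sits under moov/trak/mdia/minf/stbl and each of its entries is a box whose type is the codec code (mp4a for AAC, alac for Apple Lossless).

```python
import struct

# Container boxes that must be descended into to reach the sample description.
CONTAINERS = (b"moov", b"trak", b"mdia", b"minf", b"stbl")


def iter_boxes(data, start, end):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit size follows the type field.
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing range.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed box: stop rather than guess
        yield kind, pos + header, pos + size
        pos += size


def sample_formats(data, start=0, end=None):
    """Return the sample entry codes (e.g. b'mp4a', b'alac') found in stsd boxes."""
    end = len(data) if end is None else end
    found = []
    for kind, lo, hi in iter_boxes(data, start, end):
        if kind in CONTAINERS:
            found += sample_formats(data, lo, hi)
        elif kind == b"stsd":
            # stsd payload: 4 bytes version/flags, 4 bytes entry count, then
            # the entries, each one a box named after its format.
            for entry, _, _ in iter_boxes(data, lo + 8, hi):
                found.append(entry)
    return found
```

Calling sample_formats(open(path, "rb").read()) on an .m4a file would then tell AAC from ALAC before any decoder is picked. Reading the whole file in memory is fine for a sketch, less so for a real player.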

The mad muxer

I expressed my preference for the MP4 format some time ago; that was meant mostly as a preference over the Ogg format from Xiph, over most of the custom audio container formats like FLAC and WavPack (to a point; WV’s format could be much worse), and of course over the AVI format. Although I do find quite a few advantages of MP4 over Matroska, I don’t despise that format at all.

The main problem I see with Matroska, actually, is that implementations are not widely available; I cannot play a Matroska video on my cellphone (which, in case someone is wondering, is a Nokia E71 right now, not an iPhone, and probably never will be; I have my limits!), the PlayStation3 or the PSP. Up to last month, I couldn’t play it on my AppleTV either, which was quite a problem, especially considering that a lot of anime fansubbers use that format.

Now, since I was actually quite bored by the fact that, even though I was transcoding them, a lot of subtitles didn’t appear properly, I decided to try out XBMC. It was quite a pleasing experience actually: the Linux-based patch stick works like a charm, without going too far in the way of infringing Apple’s license as far as I can see, and the installation is quite quick. Unfortunately the latest software update on AppleTV (2.3.1) seems to have broken XBMC, which no longer starts in fullscreen but rather in a Quartz window in the upper half of the screen.

So I also ditched XBMC (for now) for a derivative, Boxee, which, while being partially proprietary, seems to be a little more fine-tuned for AppleTV; it’s still more free software than the original operating system. Both XBMC and Boxee solve my subtitles problem, since they both have a video calibration setup that lets me tell the software how much overscan the LCD panel is doing, and which pixel size ratio it has. Quite cool, Apple should really have done that too.

Also, I started using MediaTomb to serve my content without having to copy it on the AppleTV itself; this is working fine since I’m using ethernet-over-powerline adapters to connect the office with my bedroom, and thus there is enough bandwidth to stream it over. Unfortunately, here starts the first problem: while I was able somehow to get XBMC to play AVI files with external .srt subtitles, it fails on Boxee. Since the whole thing is bothersome anyway, I wanted to try an alternative: remux the content without re-encoding it, in Matroska Video files, with the subtitles embedded in them as a third track.

This seems to work fine from an AVI file, but fails badly when the source is an MP4 file: the resulting file seems corrupted with MPlayer and crashes Totem/GStreamer (that’s no news, my dmesg output easily fills with Totem’s thumbnailer’s crashes when I open my video library). Also, I have not yet been able to properly set the language of the tracks, which would help me make the jap-sub-eng setup automatic on XBMC. If somebody knows how to do that properly, I’d be glad to hear it.
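For the record, here is a sketch of what I would try next for the track languages. It assumes the remux goes through mkvmerge from MKVToolNix, whose --language option takes a track ID and a language code and applies to the input file that follows it. The helper name, the file names and the default track IDs are all made up for illustration; the real IDs have to be read from mkvmerge --identify, and I have not verified that XBMC honours the result.

```python
def mkvmerge_args(output, video, subtitles,
                  audio_lang="jpn", sub_lang="eng",
                  audio_track=1, sub_track=0):
    """Build an mkvmerge command line that tags audio and subtitle languages.

    The track IDs are assumptions (audio as track 1 of the video file,
    subtitles as track 0 of the .srt file); check them with
    `mkvmerge --identify` before running anything.
    """
    return [
        "mkvmerge", "-o", output,
        # Options placed before an input file apply to that file only.
        "--language", f"{audio_track}:{audio_lang}", video,
        "--language", f"{sub_track}:{sub_lang}", subtitles,
    ]


# Hypothetical file names, just to show the shape of the command.
print(" ".join(mkvmerge_args("episode.mkv", "episode.avi", "episode.srt")))
```

The list can be handed to subprocess.run() as it is. This does nothing about the corruption I get with MP4 sources, which is a separate problem.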

Anyway, there goes another remuxing of the video library…

Encoding iPod-compatible audiobooks with Free Software

Since in the last few days I’ve been able to rest, also thanks to the new earphones, I’ve finally been able to think again about multimedia as well as Gentoo. But just to preserve my sanity, and to make sure I do something I can reuse to rest even better, I decided to look into something new, and something that I would like to solve if I could: generating iPod-compatible audiobook files from the BBC Radio CDs I got.

The audiobooks that you buy from the iTunes Store are usually downloaded as multiple files, one per CD of the original CD release, sometimes with chapter markings to be able to skip around. Unfortunately they are also DRM’d, so analysing them is quite a bit of a mess, and I didn’t go to great lengths to identify how that is achieved. The reason why I’d like to find, or document, the audiobook format is a two-fold interoperability idea. The first part is being able to play iPod-compatible audiobooks with Free Software, with the same chapter marking system working; the other (to me more pressing, to be honest) is being able to rip a CD and create a file with chapter markings that would work properly on the iPod. As it is, my Audiobooks section on the iPod is messed up because, for instance, each piece of The Hitchhiker’s Guide To The Galaxy, which is a different track on CD, gets a single file, which is thus a single entry in the Audiobooks series. To deal with that I had to create playlists for the various phases, and play them from there. Slightly suboptimal, although it works.

Now, the idea would be to be able to rip a CD (or part of a CD) in a single M4B file, audiobook-style, and add chapter markings with the tracks’ names to make the thing playable and browsable properly. Doing so with just Free Software would be the idea. Being able to have a single file for multiple CDs would also be of help. The reason why I’m willing to spend time on this rather than just using the playlists is that it seems to me like the battery of the iPod gets consumed much sooner when using multiple files, probably because it has to seek around to find them, while a single file would be loaded incrementally without spending too much time.

In this post I really don’t have much in terms of ideas about implementation; I know the first thing I have to do is to find an EAC-style ripper for Linux, based on either standard cdparanoia or libcdio’s version. For those who didn’t understand my last sentence: if I recall correctly, EAC can also produce a single lossless audio file, plus a CUE file where the track names are timecoded, instead of splitting the CD into one file per track. Starting from such a file would be optimal, since we’d just need to encode it in AAC to have the audio track of the audiobook file.
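Getting the chapter list out of such a CUE file is the easy half of the job. A minimal sketch, assuming the usual CUE layout where each TRACK has a TITLE and an INDEX 01 line with an MM:SS:FF timestamp (FF being CD frames, 75 to the second); the function name is mine, and how the chapters then get written into the MP4 file is exactly the part I still don’t know.

```python
import re

FRAMES_PER_SECOND = 75  # CUE sheets count time in CD frames, 75 per second

INDEX_RE = re.compile(r"INDEX 01 (\d+):(\d{2}):(\d{2})$")


def parse_cue(text):
    """Return a list of (title, start_seconds), one per TRACK in a CUE sheet."""
    chapters = []
    title = None
    in_track = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("TRACK "):
            # A new track starts; forget the previous (or album) title.
            in_track = True
            title = None
        elif in_track and line.startswith("TITLE "):
            title = line[len("TITLE "):].strip().strip('"')
        elif in_track:
            match = INDEX_RE.match(line)
            if match:
                minutes, seconds, frames = map(int, match.groups())
                start = minutes * 60 + seconds + frames / FRAMES_PER_SECOND
                chapters.append((title, start))
    return chapters
```

Only INDEX 01 is used, since that is where the track proper begins; INDEX 00 marks the pregap and would put the chapter mark a couple of seconds early.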

What I need to find is how the chapter information is encoded in the final file. This shouldn’t be too difficult, since the MP4 format has quite a few implementations and I have already worked on it before. The problem is that, the Audiobooks being DRM’d, analysing them is not the best idea. Luckily, I remembered that there is one BBC podcast that provides an MP4 file with chapter markings, Best of Chris Moyles Enhanced, which most likely uses the same feature. Unfortunately, the mp4dump utility provided by mpeg4ip fails to dump that file, which means that either the file is corrupt (and how does iTunes play that?) or the utility is not perfect (much more likely).

So this brings me back to something I was thinking about before, the fact that we have no GPL-compatible MP4-specific library to handle parsing and writing of MP4 files. The reason for this is most likely the fact that the standards don’t come cheap, and that most Free Software activists in the multimedia area tend to think that Xiph is always the answer (I disagree), while the pragmatic side of the multimedia area would just use Matroska (which I admit is probably my second best choice, if it was supported by actual devices). And again, please don’t tell me about Sandisk players and other flash-based stuff. I don’t want flash-based stuff! I have more than 50GB of stuff on my iPod!

Back to our discussion: I’m going to need to find or write some tool to inspect MP4 files. I don’t want to fix mpeg4ip because of the MPL license it’s released under, and I also think the whole thing is quite overengineered. Unfortunately this does not really help me much, since I don’t have the full specs of the format handy, and I’ll have to do a lot of guessing to get it to work. On the other hand, this should be quite an interesting project for as soon as I have time. If you have pointers or are interested in this idea, feel free to chime in.
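The inspection tool does not need to be big to be useful as a starting point. Here is a rough sketch of a box lister that seeks through a file instead of loading it; the list of container boxes to descend into is my own guess at a useful minimum and certainly incomplete, and it knows nothing about where chapter data might live.

```python
import io
import struct

# Boxes known to contain only other boxes; an incomplete, assumed list.
CONTAINERS = {"moov", "trak", "mdia", "minf", "stbl", "edts"}


def walk(stream, end, depth=0):
    """Yield (depth, type, size) for each box from the current position to `end`."""
    while stream.tell() + 8 <= end:
        start = stream.tell()
        size, kind = struct.unpack(">I4s", stream.read(8))
        header = 8
        if size == 1:
            # A 64-bit size follows the type field.
            raw = stream.read(8)
            if len(raw) < 8:
                return
            size = struct.unpack(">Q", raw)[0]
            header = 16
        elif size == 0:
            # The box runs to the end of its parent.
            size = end - start
        if size < header or start + size > end:
            return  # malformed or truncated: stop instead of guessing
        name = kind.decode("latin-1")
        yield depth, name, size
        if name in CONTAINERS:
            yield from walk(stream, start + size, depth + 1)
        stream.seek(start + size)


def dump(path):
    """Print an indented tree of the boxes found in the file at `path`."""
    with open(path, "rb") as stream:
        end = stream.seek(0, io.SEEK_END)
        stream.seek(0)
        for depth, name, size in walk(stream, end):
            print("  " * depth + f"{name} ({size} bytes)")
```

Unknown boxes are simply listed and skipped, which is the whole point: the first thing to learn from the podcast file is which box types it contains that a plain audio file does not.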

My stake about iTunes, iPod, Apple TV and the like

I’ve been asked a few times why do I ever use an Apple TV to watch stuff on my TV, and why I’m using an iPod and buying songs from the iTunes store. Maybe I should try to write down my opinion on the matter, which is actually quite pragmatic, I think.

I like stuff that works. Even though the AppleTV requires some fiddling on my part, once it has the videos in, it works. And I can be assured that if I get in my bed, I can watch something, or listen to something, without further issues. Of course it has to get the stuff right first.

The iPod lasts almost a week without charging, and I listen to it almost every night. It plays my music in formats that I can deal with on Linux just fine: the very common AAC and the ALAC format (Apple Lossless Audio Codec). FFmpeg plays ALAC; xine and mpd use FFmpeg. And it does so with a container that I don’t dislike. Sure, it could really use some more software to deal with it on Linux, like an easy way to get the album art out of it (mpd does not seem to get that), and some better tagging too; I guess I could just buy the PDFs of the standard and try to implement some library to deal with it, or extend libavformat to do that.

I have most of my music collection ripped off the original CDs I have here. I used to have it in FLAC (even though I find its container a bit flakey), then I moved to wavpack which had a series of advantages but still used a custom container format. A few months ago I moved everything to ALAC instead, having a single copy of everything and having it in a container format that is a standard (even though a bit of a hard one).

As for the iTunes (Music) Store, I’m really happy that Apple is improving it and removing the DRM, even if it means that some songs will cost more than they do now. True, you cannot use it from Linux because it only works with iTunes, but the music in the “Plus” format, without DRM, works just fine under Linux, which is basically the only thing I care about. I’d sincerely be glad to buy TV series on there if they came without DRM and in the usual compatible format; unfortunately this does not seem to be the case yet. I bite the bullet with audiobooks, mostly because they are at an affordable price even though they are locked in. This is mostly a pragmatic choice.

Sure, I’d love it if it had a web-based interface that didn’t require me to use iTunes to buy the songs, but it works just fine for me as it is now, since the one alternative that everybody suggested when I was looking for one was Amazon’s MP3 Store, and that does not work where I live (Italy), while the iTunes Store does. What I totally don’t agree with is the people who cry privacy breach because of the watermarking of the music files bought from the iTunes Store. Sure, there is my name and my ID in the file that I downloaded, but why should I care? The file is only supposed to be used on my systems, isn’t it? I can play it on any device I own, as long as it can understand the format, and I can re-encode it into a different format for devices that can’t. It’s not supposed to be published, I’m sure, and the only case where having those data in the file is a problem is usually music piracy. Which, by the way, is not much hindered, since it’s not too difficult to just get the data out. DRM bad. Watermarking no.

On the other hand, I really cannot get on the Xiph train with Ogg, Theora and Vorbis. Sure they are open formats and all that but the fact they aren’t really working on higher end devices makes them just vendor lock-ins just as bad as DRM is, in my opinion. Since even the patent-freeness of those formats is not entirely clear yet (beside the fact that nobody challenged it for now), I don’t see the point in having my music stored in a format that my devices can’t play just for the sake of it. But, I guess, I live somewhere in the world where this is still sane enough to be dealt with.

All in all, I’d be very glad if Apple extended even more the coverage of Japanese music, since paying customs for it is pretty bad and I cannot find most of the artists I’m interested in here in Italy otherwise.

And before I’m misunderstood: I’m not trying to just do advertising for Apple. I’m just saying that, pragmatically, I don’t count them out just because they sell proprietary software; besides, as you can probably tell from other posts in my blog, I tend to use or learn from their open source pieces too. I just grow tired of people saying that one should stay away from the iTunes Store because of DRM (which is going away) or watermarking (which is a good thing in my opinion).

I bought a software license

I finally decided to convert my video library to Mpeg4 containers, H.264 video and AAC audio, rather than mixing and matching what I had before that. This is due to the fact that I hardly use Enterprise to watch video anymore. Not only because my office is tremendously hot during the summer, but more because I have a 32” TV set in my bedroom. Nicer to use.

Attached to that TV set there is an Apple TV (running unmodified software 2.0.2 at the moment) and a PS3. If you add to that all the other video-capable hardware I own, the only common denominator is H.264/AAC in an MP4 container (I also said before that I like the MP4 format more than AVI or Ogg). It might be because I do have a few Apple products (iPod and AppleTV), but Linux also handles this format pretty well, so I don’t feel bad about the choice. Besides, new content I get from YouTube (like videos from Beppe Grillo’s blog) is also in this format; you can get it with youtube-dl -b.

Unfortunately, as I discussed before with Joshua, and as I tried last year before the hospital already, converting video to this format with Linux is a bit of a mess. While mencoder has very good results for the audio/video stream conversions, producing a good MP4 container is a big issue. I tried fixing a few corner cases in FFmpeg before, but it’s a real mess to produce a file that QuickTime (thus iTunes, and thus the Apple TV) can accept.

After spending a couple more days on the issue I decided my time is worth more than what I’ve been doing, and finally gave in and bought a tool that I have been told does the job, VisualHub for OSX. It was less than €20, and that is usually what I’m paid by the hour for my boring jobs.

I got the software, tried it out, the result was nice. Video and audio quality on par with mencoder’s but a properly working MP4 container that QuickTime, iTunes, AppleTV, iPod and even more importantly xine can play nicely. But the log showed a reference to “libavutil”, which is FFmpeg. Did I just pay for Free Software?

I looked at the Bundle, it includes a COPYING.txt file which is, as you might have already suspected, the text of GPL version 2. Okay, so there is free software in here indeed. And I can see a lot of well-known command line utilities: lsdvd, mkisofs, and so on. One nice thing to see is, though, an FFmpeg SVN diff. A little hidden, but it’s there. Good.

The doubt then was whether they were hiding the stuff or whether it was shown and I just missed it. Plus, they have to provide the sources of everything, not just a diff of FFmpeg’s. And indeed, on the last page of the documentation provided there is a link to all the sources of the Free Software used. Which is actually quite a lot. They didn’t limit themselves to taking the software as it is, though; I see at least some patches to taglib that I’d very much like to take a look at later. I’m not sharing confidential registered-users-only information, by the way: the documentation is present in the downloadable package that acts as a demo too.

I thought about this a bit. They took a lot of Free Software, adapted it, wrote a frontend and sold licenses for it. Do I have a problem with this? My conclusion is that I don’t. While I would have preferred it if they had made it clearer on the webpage that they are selling a Free Software-based package, and if they had made the frontend Free Software too, I think they are not doing anything evil with this. They are playing by the rules, and they are providing working software.

They are not trying to exploit Free Software without giving anything back (the sources are there), and they did something more than just package Free Software together: they tested and prepared presets to use for encoding for various targets, including Apple TV, which is my main target. They are, to an extent, selling a service (their testing and choice of presets), and their license is also quite acceptable to me (it’s like a family license, usable on all the household’s computers as well as a work computer in an office, should there be one).

At the end of the day, I’m happy to have spent this money, as I suppose it’s also going to fund further development of the Free Software part of the software, although I would have been happier to chip in a bit more if it was fully Free Software.

And most importantly, it worked out of the tarball, solving a problem I had been having for more than a year now. Which means, for me, a lot less time spent trying to get the whole thing working. Of course if one day I can do everything with just FFmpeg I’ll be very happy, and I’ll dedicate myself a bit more to MP4 container support, both in writing and parsing, in the future, but at least now I can just feed it the stuff I need converted and dedicate my time and energy to more useful goals (for me, as in paid jobs, and for the users, with Gentoo).

A nice productive day

I might be sick, or just crazy, or both, but I still think I’m much more productive when I have a fever, or in the days around it. Yesterday I had a fever, and I was knocked out till late afternoon, but then I started feeling better, and I started producing.

First of all, rbot’s init script in my overlay has been updated: Subversion trunk will now create a file inside the bot’s directory without need for start-stop-daemon to create it itself, which means I can finally let rbot fork on its own rather than forcing it to background (which is not really a good thing to do, if it can be avoided). Thanks to tango and jbn for looking after my raw patch.

Then I decided to finish the work with apcupsd; again in my overlay you can find a new ebuild for 3.14.0 based on the one found on bugzilla, but with a new apccontrol file and a totally renewed init script. This script can be multiplexed, which means you can have a /etc/init.d/ link, which would then start apcupsd looking for the /etc/apcupsd/foo.conf configuration file. Thanks to the apcupsd authors for implementing this feature in 3.14!

I’ve also attached the two UPSes to Farragut instead of Enterprise, as the latter is not a server and might well be offline while Farragut is still up (for instance this is the case most of the times I’m out for the whole day, or if there is no one at home). apcupsd on FreeBSD works nicely, and doesn’t require any fiddling with configuration, neither kernel side (the ugen support is built by default) nor with permissions (the default is to run as root; this might change in the future, but as it is it’s fine for me. I’ll be working on better handling of permissions on device nodes for Gentoo/FreeBSD, but it’s not on my priority list at the moment). Here, too, they work perfectly fine.

I was also able to fix a bug in xine-lib, in the playback of mp4/mov files that use version 1 rather than version 0 of the media header atom, such as files generated by FFmpeg. The bug was already reported on SourceForge, but I wasn’t sure what it actually meant or where to find a sample file; when I reproduced the same condition by chance here, I decided to take a deeper look. Unfortunately MultimediaWiki doesn’t provide much information about that, but I asked Mike to give me an account there so I can try to write something useful; maybe next time someone else needs an mdhd atom description they won’t have to look at the sources of FFmpeg to see how it’s read and generated.
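For reference, this is how I understand the two layouts, written as a small Python sketch rather than as xine’s actual C code. The field widths are those of the ISO base media file format: version 0 stores the two timestamps and the duration as 32-bit values, version 1 widens them to 64 bits, and the timescale stays 32-bit in both. The function name and the dictionary keys are mine.

```python
import struct


def parse_mdhd(payload):
    """Parse the payload of an mdhd atom (the bytes after the 8-byte box header)."""
    version = payload[0]  # followed by 3 bytes of flags
    if version == 0:
        # 32-bit creation time, modification time, timescale, duration
        created, modified, timescale, duration = struct.unpack_from(">IIII", payload, 4)
    elif version == 1:
        # 64-bit creation and modification time, 32-bit timescale, 64-bit duration
        created, modified, timescale, duration = struct.unpack_from(">QQIQ", payload, 4)
    else:
        raise ValueError(f"unknown mdhd version {version}")
    return {
        "version": version,
        "created": created,
        "modified": modified,
        "timescale": timescale,  # time units per second
        "duration": duration,    # expressed in timescale units
    }
```

A reader that always assumes the version 0 layout ends up taking the timescale and duration from the wrong offsets in a version 1 atom, which is the kind of breakage the bug report described.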

Then tonight I wanted to resume my work on implementing audio conversions inside the audio output loop instead of doing it in every decoder; it’s hard work, as it will probably require rewriting a good deal of code, but it should be rewarding once it’s done. Right now there are a bunch more flag values for capabilities, so for instance I can say whether a driver supports integer or float 32-bit samples, 64-bit samples, and whether it can accept streams in a different endianness. This is important because there’s little point in doing the job of the output plugin, which might handle that transparently. For instance, a big-endian stream might be decoded on a little-endian machine, then sent through PulseAudio to a big-endian machine where it will be played: in this setup, xine’s endian reversal of the stream (from big to little endian) would be superfluous, as PulseAudio would accept the big-endian samples and send them to the other machine, which would not need to reverse them to play them.
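The idea is easier to show than to describe. This is only an illustration in Python, not the xine-lib code (which is C and has its own constants): the flag names are invented, and the rule is simply that the output loop swaps bytes only when the stream’s byte order differs from the host’s and the driver has not said it can take the other order.

```python
import sys
from enum import IntFlag


class Cap(IntFlag):
    """Hypothetical driver capability flags; the names are made up for this sketch."""
    INT16 = 1
    INT32 = 2
    FLOAT32 = 4
    FLOAT64 = 8
    FOREIGN_ENDIAN = 16  # the driver accepts samples in the non-native byte order


def needs_byteswap(stream_big_endian, caps, host_big_endian=(sys.byteorder == "big")):
    """True only when the output loop itself has to reverse the samples."""
    if stream_big_endian == host_big_endian:
        return False
    return not (caps & Cap.FOREIGN_ENDIAN)


def byteswap(samples, width):
    """Reverse the byte order of every `width`-byte sample in a bytes object."""
    if len(samples) % width:
        raise ValueError("buffer length is not a multiple of the sample width")
    out = bytearray(len(samples))
    for i in range(width):
        # Byte i of each output sample is byte (width - 1 - i) of the input sample.
        out[i::width] = samples[width - 1 - i::width]
    return bytes(out)
```

With a PulseAudio-like driver that sets the FOREIGN_ENDIAN flag, needs_byteswap() returns False for the big-endian stream of the example above, and the samples leave the machine untouched.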

Anyway, right now the code is quite fragile: there’s no conversion being done, most things are totally broken, and there are asserts on 1 == 0 marking the code that needs to be rewritten. But something works: I was able to remove a lot of code from the musepack decoder, as libmpcdec always produces 32-bit native-endian (or maybe little-endian, I’m not yet sure) floating point samples. Previously the decoder converted all the samples back to 16-bit format, and then gave them to the audio output loop to handle; now instead it sends them directly to the output, and as PulseAudio supports 32-bit float samples, they are not converted and play back fine.

Tomorrow I’ll try to work out a way to handle upsampling and downsampling of streams. The problem is that it’s not trivial to decide what to do: if a plugin supports 32-bit integer samples, but not 24-bit integer samples, it should probably upscale the 24-bit samples to 32-bit to avoid losing precision; if it doesn’t, it might upscale them to 32-bit float, or maybe downscale them to 16-bit integers. The same applies to channel mode: if the driver doesn’t support stereo output, should the stream be upgraded to 4.0 or downgraded to mono?
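One possible way to write that policy down, as a sketch and not as what the branch does today: keep, for each source format, a list of acceptable targets ordered from lossless to lossy, and pick the first one the driver supports. The format names and the preference order are my assumptions; the only hard facts are that widening 24-bit to 32-bit integers loses nothing and that a 32-bit float has enough mantissa to hold a 24-bit integer exactly.

```python
# Preference order per source format, best first (an assumed policy).
FALLBACKS = {
    "s16": ["s16", "s24", "s32", "f32"],
    "s24": ["s24", "s32", "f32", "s16"],  # s16 last: it throws away 8 bits
    "s32": ["s32", "f32", "s24", "s16"],
    "f32": ["f32", "s32", "s24", "s16"],
}


def pick_format(source, supported):
    """Return the best output format for `source` among `supported`, or None."""
    for candidate in FALLBACKS.get(source, [source]):
        if candidate in supported:
            return candidate
    return None


def s24_to_s32(sample):
    """Widen a signed 24-bit sample to 32 bits; lossless."""
    return sample << 8


def s24_to_s16(sample):
    """Narrow a signed 24-bit sample to 16 bits by dropping the low byte."""
    return sample >> 8


print(pick_format("s24", {"s16", "s32"}))  # picks the 32-bit integer format
```

The same table-driven approach could cover channel layouts, but there the order is a matter of taste (4.0 or mono for a stereo stream?) rather than of precision.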

For sure this time I’m very happy to be working on branches: leaving the code broken for weeks, maybe months, is not something you want to do on the main development branch. And I mean it, because with the changes I’m doing, not only will I be changing the ABI of the library itself (well, actually not much, just a couple of structures), but more importantly I’ll be changing the audio output plugins API, as I need to feed them a sample format rather than a bits-per-sample size.

Anyway, this is not going to be something easy to complete, but it will be a noticeable improvement for Amarok users once done, especially because I want to make sure that the capabilities for “mixer” volume and “PCM” volume are cleared up, probably by deprecating one of them, so that Amarok can be changed not to use xine’s software amplification (which also sucks and I also need to rewrite in good part in this branch) if the output plugin actually supports a per-stream volume (like PulseAudio).

Sponsoring, bribing, and comfort words are welcome, as xine-lib’s audio_out code is giving me the creeps.