The overengineering of ALSA userland

This is a bit of an interesting corner case of a rant. I did not write it down when I came up with it, many years ago, when I actively worked on multimedia software; I have only given it in person to a few people, because at the time it would have gained too much unwanted attention from random people, the same kind of people who might have threatened me for removing XMMS from Gentoo so many years ago. I have, though, spoken about this with at least one of the people working on PulseAudio at the time, and I have repeated it at the office a few times, when the topic came up.

For context you may want to read this rant from almost ten years ago by Mike Melanson, who was at the time working for Adobe on Flash Player for Linux. It’s a bit unfortunate that the drawings from the post are missing (but maybe Mike has a copy?) You can find the missing drawing on the Internet Archive as well, but the whole gist is that the Linux audio APIs were already bloody confusing at the time, and this was before PulseAudio came along to stay. So where are we right now?

Well, the good news is that for the most part things got simpler: aRTs and ESounD are now completely gone, eradicated in favour of PulseAudio, which is essentially the only currently used consumer sound daemon. Jack2 is still the standard for the pro-audio crowd, but even those people seem to have accepted that multimedia players are unlikely to care for it, and it should be limited to pro-audio software. On the kernel driver side, the actually fairly important out-of-kernel drivers are effectively gone, in favour of development happening as a separate branch of the Linux kernel itself (GIT was not a thing at the time, oh how things have changed!) and OSS is effectively gone. I don’t even know if it’s still available in the kernel, but the OSS4 fanboys have been quiet for long enough that I assume they gave up too.

ALSA itself hasn’t really changed much in all this time, either in the kernel or as userland. In the kernel, it got more complex for supporting things like jack sense, as HDA started supporting soft-switching between speaker and headphones output. In the userland, the plugins interface that was barely known before is now a requirement to properly use PulseAudio, both in Gentoo and in most other distributions. Which effectively makes my rant not only still relevant, but possibly more relevant. But before I go into details, I should take a step back and explain what the whole thing with userland and drivers is, with ALSA. I’ll try to simplify the history and the details, so if you know this very well you may notice I may skip some details, but nobody really cares that much about those.

The ALSA project was born back when Linux was in version 2.4 — and unlike today, that version was the version for a long time. Indeed up until version 3.0, a “minor” version would just be around forever; the migration from 2.4 to 2.6 was a massive amount of work and took distributions, developers and users alike a lot of coordination. In Linux 2.4, the audio drivers were based off the OSS interface, which essentially meant you had /dev/dspX and /dev/mixerX, and you were done — most of the time mixer0 matched a number of dspX devices, and most devices would have input and output capabilities, but that’s about all you knew. Access to the device was almost always exclusive to one process, except if the soundcard had multiple hardware mixer channels, in which case you could open the device multiple times. If you needed processes to share the device, your only option was to use a daemon such as the already named aRTs or ESounD. The ALSA project aimed to replace the OSS interface (that by then became a piece of proprietary software in its newer versions) with a new, improved interface in the following “minor” version (2.5, which stabilized as 2.6), as well as on the old one through additional kernel modules — the major drawback from my point of view, is that this new interface became Linux-specific, while OSS has been (and is) supported by most of the BSDs as well. But, sometimes you have to do this anyway.

The ALSA approach provides a much more complex device API, but mostly for good reason, because sound cards are (or were) a complex interface, and are not consistent among themselves at all. To make things simpler for application developers who previously only had to use open() and similar functions, ALSA provided a userland library, shipped in a package called alsa-lib, but more often known by its filename: libasound. While the interface of the library is not simple either, it does provide a bit of wrapping around the otherwise very low-level APIs. It also abstracts away some of the problems of figuring out which cards are present and which mixer refers to which. The project also provided a number of tools and utilities to configure the devices, query for information or play back raw sound, and even a wrapper for applications implementing OSS access only, in the form of a preloadable library catching accesses to /dev/dsp to convert them to ALSA API calls, not different from the similar utilities provided by arts, esd or PulseAudio.

In the original ALSA model, access to the device was still limited to one process per channel, but as soundcards with more than one hardware channel became quickly obsolete (particularly as soundcard kind-of standardized over AC’97, then HDA) the need for sharing access arose again, and since both arts and esd had their limits (and PulseAudio was far from ready), the dmix interface arrived — in this setup, the first process opening the device would actually have access, as well as set up a shared memory area for other processes to provide their audio, which then would be mixed together in userland, particularly in the process space of the first process opening the device. This had all sorts of problems, particularly when sharing across users, or when sharing with processes that only used sound for a limited amount of time.

What dmix actually used was the ability of ALSA to provide “virtual” devices, which can be configured for alsa-lib to see. Another feature that got more spotlight thanks to the shrinking featureset of soundcards, particularly with the HDA standard, is the ability to provide plugins to extend the functionality of alsa-lib; for a while the most important one was clearly the libsamplerate-based resampling plugin, which almost ten years ago was the only way to get non-crackling sound out of an HDA soundcard. These plugins included other features, such as a plugin providing a virtual device for encoding to Dolby AC3 so that you could use S/PDIF pass-through to a surround decoder. Nowadays, the really important plugin is the PulseAudio one, which allows any ALSA-compatible application to talk to PulseAudio, by configuring a default virtual device.

Okay, now that the history lesson is complete, let me try to write down what I think is a problem with our current, modern setup. I’ll exclude pro-audio workstations from my discussion, as these clearly have different requirements from “mainstream” and their users most likely would still argue (from a different point) that the current setup is overengineered. I’ll also exclude most embedded devices, including Android, since I don’t think PA ever won over the phone manufacturers outside of Nokia, although I would expect that a lot of them actually do rely on PulseAudio a bit and so the discussion would apply.

In a current Linux desktop, your multimedia applications end up falling into two main categories: those that implement PulseAudio support and those that implement ALSA support. They may use some wrapper library such as SDL, but at the end of the day, these are the two APIs that allow you to output sound on modern Linux. A few rare cases of (proprietary, probably) apps implementing OSS can be ignored, as they would then use either aoss or padsp to preload the right library and provide support for whichever stack you prefer. Whichever distribution you’re using, both classes of apps are extremely likely to be going out of your speakers through PulseAudio. If the app only supports ALSA, the distribution is likely providing a configuration file so that the default ALSA device is a virtual device pointing at the PulseAudio plugin.

When the app talks to PulseAudio directly, it’ll use its API through the client library, that then IPCs through its custom protocol to the PulseAudio Daemon, which will then use alsa-lib through its API, ignoring all the virtual devices configured, which in turn will talk with the kernel drivers through its device files. It’s a bit different for Bluetooth devices, but you get the gist. This at first sight should sound just fine.

If you look at an app that only supports ALSA interfaces, it’ll use the alsa-lib API to talk to the default device, which uses the PulseAudio client library to IPC to the PulseAudio daemon, and so on as above. In this case you have alsa-lib on both sides: the source application and the sink daemon. So what am I complaining about? Well, here is the thing: the parts of ALSA that the media application uses and the parts of ALSA that the PulseAudio daemon uses are almost entirely distinct: one only provides access to the virtual devices configured, and the other only gives access to the raw hardware. The fact that they share the API barely matters, in my opinion.
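To make the two paths easier to compare, here is a minimal sketch in Python that simply writes them down as lists of hops. The component names and role descriptions are labels I made up for this illustration, not identifiers from ALSA or PulseAudio, and the model ignores Bluetooth and anything else that does not go through alsa-lib.

```python
# Hypothetical model of the two playback paths; every name here is a label
# chosen for this sketch, not a real API identifier.
PULSE_NATIVE_APP = [
    ("application", "PulseAudio API"),
    ("libpulse", "client library, IPC to the daemon"),
    ("pulseaudio daemon", "mixing and routing"),
    ("alsa-lib", "raw hardware devices"),
    ("kernel driver", "device files"),
]

# An ALSA-only application goes through alsa-lib first, then joins the
# same path as a native client from libpulse onwards.
ALSA_ONLY_APP = [
    ("application", "ALSA API"),
    ("alsa-lib", "virtual default device (PulseAudio plugin)"),
] + PULSE_NATIVE_APP[1:]


def roles_of(path, component):
    """Return every role a component plays along a path, in hop order."""
    return [role for name, role in path if name == component]


if __name__ == "__main__":
    for role in roles_of(ALSA_ONLY_APP, "alsa-lib"):
        print("alsa-lib used for:", role)
```

In this toy model alsa-lib shows up twice on the ALSA-only path, once for the virtual device and once for the raw hardware, and the two roles have nothing in common.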

From my point of view, a better solution would be for libasound to be provided by PulseAudio directly, implementing a subset of the ALSA API that either shows the devices as the sinks configured in PulseAudio or, if PA wants to maintain the stream/sink abstraction itself, just a single device that is PulseAudio. No configuration files, no virtual devices, no plugins whatsoever, but if the application supports ALSA, it gets automatically promoted to PulseAudio. Then on the daemon side, PulseAudio can either fork alsa-lib, or have alsa-lib provide a simpler library that only provides access to the hardware devices, and removes support for configuration files and plugins (after all, PulseAudio already has its own module system). Last I heard, there actually is an embedded version of libasound that implements only the minimal amount of features needed to access a sound device through ALSA. This not only should reduce the amount of “code at play” (pardon the pun), but also reduce the chance that you can misconfigure ALSA to do the wrong thing.

Misconfiguring ALSA is probably the most common reason for your sound not working the way you expect on Linux: the configuration files and options, defaults and so on kept changing, and things are so different from ten years ago that you’re likely to find very bad, old advice out there. And it’s not always clear that you should not follow it. For instance, for the longest time Adobe Flash, thinking it was doing the right thing, would not actually abide by the default ALSA configuration, and would rather try to access the hardware device itself (mostly because of nasty bugs with dmix), which meant that PulseAudio wouldn’t be able to access it anymore itself. The quickly sketched architecture above would solve that problem, as the application would not actually be able to tell the difference between the hardware device and the PulseAudio virtual device: the former would just not be there!

And just to close up my ALSA rant, I would like to remind you all that alsa-lib still comes with its own LISP interpreter: the ALISP dialect was meant to provide even more configurability to the sound access interface, and most distributions, as far as I know, still have it enabled. Gentoo provides a (default-off) alisp USE flag, so you’re at least spared that part in most cases.

Update 2020-05-08: Adobe appears to have “archived” (or rather deleted) Mike’s blog from their site, some time in 2017. I’ve updated the link above to point at the Internet Archive (to which I recommend donating, if you can). While doing that I even managed to find a copy of the drawing that was missing when I originally wrote this post. Score!

License auditing your code

I’ve already said that I’m working on a new device, whose firmware is Gentoo-based, and it goes without saying that it’s partly closed-source software. That’s just the way it is: while probably most of the software within the device is Free Software, the business logic is behind closed doors. People are getting used to this, and I don’t think it’s entirely a bad thing. I mean, we’re giving back to Free Software in this context in many ways: Luca is working on libav and Aravis, I’m working on Linux drivers, and together we’re working on Gentoo, so the environment as a whole is gaining something.

Of course, when you are dealing with this kind of device, you have to take care of auditing the licenses of the software you’re bundling up in the firmware, which is what I’m doing today (well, yesterday for you who read me, I guess). It’s not my first time at this game, and as usual my starting point is a UML package diagram.

For those not used to UML, a package diagram is a decent way to identify who makes use of what; thanks to the way the UML is specified you can use two different “stereotypes”, called import and access which show you the execution/linking boundary quite clearly. By giving each project a package, and each library within that package a subpackage, you can easily see how things are connected.

So while working through it, with the two objectives of reducing the amount of software we had to install (I talked about that yesterday) and verifying that we don’t distribute our closed-source code linked to GPL libraries, I started noticing a few bad things. On one side, the license identification in Gentoo is shabby, but that’s nothing new; as I write this, I’m fixing a few ebuilds that report the wrong license information, for instance. On the other side, we have packages like PulseAudio that do not let you understand their licensing in a very clear way.

In the case of PulseAudio, the LICENSE file tells you this:

All PulseAudio source files are licensed under the GNU Lesser General Public License. (see file LGPL for details)

However, the server side has optional GPL dependencies. These include the libsamplerate and gdbm (core libraries), LIRC (lirc module), FFTW (equalizer module) and bluez (bluetooth proximity helper program) libraries, although others may also be included in the future. If PulseAudio is compiled with these optional components, this effectively downgrades the license of the server part to GPL (see the file GPL for details), exercising section 3 of the LGPL. In such circumstances, you should treat the client library (libpulse) of PulseAudio as being LGPL licensed and the server part (libpulsecore) as being GPL licensed. Since the PulseAudio daemon, tests, various utilities/helpers and the modules link to libpulsecore and/or the afore mentioned optional GPL dependencies they are of course also GPL licensed also in this scenario.

[…]

Is this clear to you? It should be: libpulse is the library you implement a PulseAudio client with, and libpulsecore used to be the convenience library only used by the server… but in PulseAudio’s history, this hasn’t been the case for quite a while, with the result that libpulse requires libpulsecore, and that means that if you link GDBM into PulseAudio’s core library … you now have GPL’d libpulse.

This is not the case for all the libraries it uses though: for instance BlueZ is not loaded into the core library so you still only have a PulseAudio daemon GPL’d and not the libraries, as intended.
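The reasoning above is really a reachability question on the link graph, so here is a rough sketch of it in Python. The graph, the component names and the license tags are my own simplification of what the LICENSE file and the paragraphs above describe (for instance, I attach BlueZ straight to the daemon, while it is really used by a helper program), so take it as an illustration of the audit, not as a statement about any specific PulseAudio build.

```python
# Simplified, hypothetical link graph: "X links to Y".
LINKS = {
    "closed-source-app": ["libpulse"],
    "libpulse": ["libpulsecore"],
    "libpulsecore": ["gdbm"],
    "pulseaudio-daemon": ["libpulsecore", "bluez"],
}

# License each component carries on its own (labels for this sketch only).
OWN_LICENSE = {
    "closed-source-app": "proprietary",
    "libpulse": "LGPL",
    "libpulsecore": "LGPL",
    "pulseaudio-daemon": "LGPL",
    "gdbm": "GPL",
    "bluez": "GPL",
}


def gpl_sources(target, links=LINKS, licenses=OWN_LICENSE):
    """Return the sorted GPL components reachable from target by linking."""
    seen, found, stack = set(), set(), [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if licenses.get(node) == "GPL":
            found.add(node)
        stack.extend(links.get(node, []))
    return sorted(found)


if __name__ == "__main__":
    for name in ("libpulse", "pulseaudio-daemon", "closed-source-app"):
        print(name, "pulls in GPL code from:", gpl_sources(name))
```

Swapping the dependency of libpulsecore for a non-GPL database in a copy of the graph, for example `dict(LINKS, libpulsecore=["simple-db"])`, leaves the client library without any GPL component in reach, while the daemon still reaches BlueZ.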

What’s the catch about this? Well, turns out that Nokia knew about this for a while, since they did contribute a “simple” database as an alternative to GDBM (GPL-2) and TDB (GPL-3), which is fine for most embedded usage, if not for desktops, and is exactly what I need here. Of course the ebuilds still force GDBM enabled. I’m fixing that as well.

I’m leaving the license specification for the other USE flags to be fixed later; it’s a matter of time constraints for now.

I guess that every time I do this I understand how difficult license auditing is, and why people don’t like having multi-license projects or even multiple licenses doing almost the same thing. Oh well.

Garbage-collecting sections is not for production

Some time ago I wrote about using --gc-sections to keep unused functions from creeping into the final code. Today instead I’d like to show how that can be quite a problem if it is used indiscriminately.

I’m still using the diagnostics of --gc-sections, at least for some projects, to identify stuff that is unused but still kept around. Today I noticed one bad thing with it and pulseaudio:

/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.2/../../../../x86_64-pc-linux-gnu/bin/ld: Removing unused section '.data.__padsp_disabled__' in file 'pulseaudio-main.o'

The __padsp_disabled__ symbol is declared in main.c to keep pulseaudio’s own access to OSS devices from being wrapped by the padsp script. When I first saw this, I thought the problem was a missing #ifdef directive: if I didn’t ask for the wrapper, it might still have declared the (unused) symbol. That was not the case.

Looking at the code, I found what the problem was: the symbol is (obviously) never used by pulseaudio itself; it is, rather, checked through dlsym() by the DSP wrapper library. For this reason, to the compiler and the linker the symbol looks pretty much unused, and when they are explicitly asked to drop unused sections, it goes away. Since the symbol is only looked up through the runtime linker, neither building nor executing pulseaudio will show any problem. And indeed, the only problem would appear when running pulseaudio as a child of padsp, and using the OSS output module (so not on most Linux systems).
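To show what such a lookup looks like from the outside, here is a small sketch using Python’s ctypes, which on POSIX systems wraps dlopen() and dlsym(). This is not the wrapper library’s code, just the same kind of by-name check; the function name symbol_visible is mine, and I’m assuming a Linux-like system where ctypes.CDLL(None) gives a handle to the running process.

```python
import ctypes


def symbol_visible(name):
    """Report whether the running process exposes a symbol called name.

    ctypes.CDLL(None) asks the runtime linker for a handle to the process
    itself; indexing the handle performs the by-name lookup and raises
    AttributeError when the symbol cannot be found. Indexing is used
    instead of attribute access because ctypes refuses attribute lookups
    for names that start and end with double underscores.
    """
    process = ctypes.CDLL(None)
    try:
        process[name]
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # Nothing at build time connects this string to the variable it names,
    # which is why the linker cannot know the section is still needed.
    print("__padsp_disabled__ visible:", symbol_visible("__padsp_disabled__"))
    print("malloc visible:", symbol_visible("malloc"))
```

The only link between the two sides is a string, so a pulseaudio binary whose section was garbage-collected behaves exactly like a process that never declared the symbol at all.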

This shows how just using -fdata-sections -ffunction-sections -Wl,--gc-sections is not safe at all and why you shouldn’t get excited about GCC and ld optimisations without understanding how they work in detail.

In particular, even I thought that it would be easier to work around than it actually seems to be: while GCC provides a used attribute that allows you to declare a variable (or a function) as used even though the compiler can’t tell that by itself (it’s often used together with inline hand-written ASM the compiler doesn’t check for), this does not propagate to the linker, so it won’t save the section from being discarded. The only solution I can think of is adding one instruction that sets the variable to itself, but that’s probably going to be optimised away. The alternative would be giving GCC a way to state explicitly that the section is used.

Fixing libtool

You might have noticed that Gentoo has been lacking libtool 2.2 support for quite a while; this has been quite a problem because libtool 2.2, among other things, shortens the time needed for the execution of ./configure, as it doesn’t unconditionally check for C++ and Fortran, and at the same time quite a few packages have been using the new libtool by now.

While a lot of packages using libtool were easily worked around by removing the libtool macro files and regenerating the autotools files using whatever libtool the system came with (so that they could be marked stable even with libtool 1.5), this only worked as long as upstream didn’t decide to update the calls to libtool as they should have been doing. And more importantly, the LTDL-based packages, like PulseAudio, almost always failed to support both libtool 1.5 and 2.2, so lots moved to simply supporting 2.2. PulseAudio, well, being one of them.

So to have new versions of PulseAudio stable, we needed the new libtool stable. Piece of cake, no? Well, no, beside the packages failing with libtool-2.2 there were a few test failures reported in the stable request bug, that went lingering for a while; as much as I like autotools, I’m not the maintainer in Gentoo for any of them, since I’m not in the base system team. But since people started complaining (somewhat rightfully) about PulseAudio 0.9.9 being ancient, and asked for a stable of a newer version, on Sunday I went to look at the test failures.

The results have been much simpler than I expected, especially considering that nobody else had looked at them before, with the exception of a few users playing with the vanilla USE flag:

  • one test failure was caused by the elibtoolize patches: we don’t want those applied to libtool itself!
  • one test failed when the system lacked the German locale (it tests localised messages, and uses the German locale to do that); this was already worked around upstream by skipping it whenever the German locale is unavailable, and the fix is backported to the Gentoo ebuild;
  • one more test fails because… the other failed; the test that ensures that libtool works even when the maximum size of parameters is very short simply re-executes the testsuite recursively; this gives the test two error-out cases: when libtool is broken with short max parameters, and when something else is broken. Not nice!
  • finally, there was another test failure that I couldn’t reproduce, but that seemed tied to --as-needed; after some headbanging, I noticed the way the test failed (indirect link), and indeed, this seems to be a problem with the older, stricter --as-needed behaviour; I already wrote about the softer --as-needed back in February, if you don’t know what I’m talking about.

As it turns out, getting the testsuite proper so that it can be trusted wasn’t especially cumbersome or painful; it only took half an afternoon of mine to fix most of the evident problems, and the last one is not a stable showstopper (as much as I like --as-needed, a failure in a testsuite because of it is not a stable showstopper, although I’d like to get the test fixed).

What can be noticed here is that once again we’re lacking people who actually go check the problems and fix them up. We need more people who can look at this kind of stuff, and we need more users to insightfully note the bugs; if it wasn’t for Pacho Ramos and Dustin Polke who, in the bug, looked up possible points of failure (icecream and the vanilla USE flag; the latter rang a bell in my head about elibtoolize), then libtool 2.2 would probably still be far, far away from Gentoo stable. Thanks guys!

So I went on vacation and…

And now I get to work 18 hours a day. I wanted to write a bit about my one week in London, but my time, since the start of the week, has been more than entirely spent on two main work projects. I even had to give up one project entirely because I didn’t have much time left after an urgent consultancy started this week.

I also have very limited time for Gentoo, as you might have noticed; I’m trying to do some of the standard maintenance operations on my packages and on Gentoo as a whole, but it really doesn’t go at the same speed I used to have. So please bear with me if the next weeks are very slow on my Gentoo side.

On the bright side, I’ve been able to update PulseAudio to the latest test version, so you might want to try that out. And I’ll be working on a few more notes here and there, including Autotools Mythbuster.

Gentoo/PulseAudio Summer 2009 Plans

Again, to avoid the bus factor problem, I’m going to write down here what my plans are for PulseAudio and Gentoo for this end of Summer 2009, mostly related to what will happen when I come back from my vacation in London, after mid August.

This actually also comes out of candrews asking for it, as I hadn’t really thought about writing this before that.

So the first thing to say is that I am following PulseAudio pretty well; or rather I’m following Lennart pretty well (he’s also the one who suggested that I rewrite udev’s build system to use non-recursive automake, something I’ll write more about another day), so I’m not sleeping while waiting.

Indeed, the 0.9.16 test releases are available in Gentoo already, although masked, and since recently they support udev hotplug (preferring it over HAL) and also pass all the checks already. A note on the tests is needed though: the mix-test lacks a few entries, in particular regarding 24-in-32-bit samples, and is for this reason disabled in the current ebuild (Lennart should be working on it); at the same time, the ebuild is running the tests specifically in the source directory, because the intltool checks fail, badly. In theory the problem should be fixed in the 0.41 series of intltool, but I am unsure whether that should be packaged by us or not.

In the next release, whether it’ll be another test release or the final release, there will also be a few differences in the handling of audio APIs. The OSS support will be restricted, masking the USE flag on Linux (leaving it enabled for FreeBSD obviously); this means that users wanting to use stuff like OSS4, which is not in Portage and if it’s up to me never will be, will have to go a slightly longer way to get it to work with PulseAudio. The reason for this is that Lennart really doesn’t want to support that, and I can agree with him. Now, if you know the package well, you’ll probably be wondering “what about the OSS-compatibility wrapper?” This is solved already: in GIT the OSS output and wrapper supports are split into two different options; the former will be tied to the oss USE flag, the latter will be left in “auto-mode”, which will create the padsp wrapper on all Gentoo Linux and FreeBSD systems. And this should fix your problem, Luca!

As for some of the new features, like for instance Rygel UPnP support, well, I’ll probably be working on them sometime in the future; I do want to get Rygel in Portage, especially if that will allow me to look at my vacation photos directly on my Sony Bravia networked TV.

PulseAudio and quirks

Seems like even my previous post about PulseAudio got one of the PA-bashers to think I’m a nuisance for their “cause”, whatever that is. For this reason I’d like to try to explain some of the quirks regarding PulseAudio, distributions and so on. Let’s call this a bit of a backstage analysis of what’s going on with Linux and audio, from somebody who has little vested interest in trying to spin things for PulseAudio.

The first problem to address relates to the comments that KDE people find PulseAudio a problem; I guess this has to be decomposed in a series of multiple problems: Lennart is a GTK/GNOME guy, so he obviously provided the original tools for GTK/GNOME. For a while I was interested in writing the equivalents for KDE (3) but I never had the time; now that I also moved to GNOME independently, I sincerely have no intention to write KDE tools for PA… but one has to wonder why nobody in KDE went out of his/her way to try doing this before. It’s not like it had to be part of KDE proper, it would have been okay to be an unofficial standalone application.

There is also another problem: most of the KDE guys who do see problems with PulseAudio are most likely using Phonon with xine-lib backend, configured to use the PulseAudio output plugin. Given I’m the one who wrote most of it originally, I can say that it sucks big time. Unfortunately I have had no time to work on that lately; I hope I might have that time in the future, but the two years I spent between hospitals seriously indebted me to the point I’m doing about 18 hours of work a day on average. For those who do want to use xine-lib with Pulse, I’d like to suggest the long route: set up the ALSA Pulse plugin, and then let xine just use ALSA.

There is of course another problem for KDE: while GNOME historically had no problem with forcing in dependencies that are Linux-specific or that work most of the time just on Linux (think about HAL adoption for instance), and relied on the actual vendors to do the eventual porting, KDE strives to work most of the time on multiple operating systems, including as of KDE 4 also Mac OS X and Windows. Now you might like this or not, but it’s their choice; and the problem is that while there is some kind of PulseAudio support for Windows, the OSX support at least is in pretty bad shape (also on my radar).

For what concerns distribution support, it is true that Lennart usually just cares about Fedora; you have to accept this as part of the deal given RedHat is – as far as I know at least, Lennart feel free to correct me if I’m wrong – the one vendor paying his bills. Now of course we’d all love to support all the distributions at the same time, but the only way that’s possible is if multiple maintainers do coordinate; I’ve been doing my best to pass all the patches upstream when I’ve added them to Gentoo, and I see Colin Guthrie from Mandriva doing the same. One thing I can “blame” Lennart for (and I told him this before, too!) is not creating a GIT branch with the cherry-picked patches he applies on the Fedora packaging for us to pick up… and the fact that he likes neither making releases nor giving others access to do so.

To be honest, there is little difference between this and what other projects do with distributions like Ubuntu when they are paid by Canonical. I think this is obvious, everybody looks at their little garden first. But this is not something that should concern us I guess. Gentoo has been quite out of the loop for what concerns PulseAudio, and I’m sorry, that was mostly my fault. I’m doing my best to let us update as soon as possible, but it’s not just that simple, as I already explained.

Then let me just say something about Lennart’s refusal to support system mode (which is available and advertised in Gentoo since PulseAudio entered the tree): I can’t blame him for that. First, his design for PulseAudio is based on providing something that works for the desktop use case. Something along the lines of Windows’s or OSX’s audio subsystems, neither of which provide anything akin to system mode. And indeed PulseAudio, by design, can handle the same situations, including multi-user setups with fast user switching. The fact that a system mode exists at all is due to the fact that I for one needed something like it on my setup, hacked it around for Gentoo, and then Lennart made my life easier implementing some extra bits on PulseAudio proper, but it was certainly not his idea.

What people complain about usually is the need for an X session (not strictly true, PulseAudio will start just fine in SSH — it would probably be possible to even fix it up so that it would tunnel audio just like you can tunnel X!), and the fact that audio does not continue to work when X exits (also not strictly true, if your audio player is running in screen it would be working just fine; it’s the fact that the media player crashes that makes your audio stop). Additionally people complain about the security problem of wanting to have all the processes to run under the same user, rather than allowing them to be on different users, like mpd.

Well, some complaints are valid, others are not: it is true that PulseAudio does not work in multi-seat-multi-user environments, at least not with a single audio device; it is unfortunate and I don’t know if it’ll ever work in that situation without a system mode. It is also true that running processes as different users for privilege separation does not work without system mode. But both these options are walking quite far away from the desktop design that PulseAudio is implementing; sure they are valid use cases, just like embedded systems (Palm Pre uses PulseAudio if you didn’t notice that before), but they are not what Lennart is interested in himself; at the same time I don’t think he’d be stopping anyone from improving the system mode support for those, as long as it wouldn’t require the desktop setup to make compromises.

Because the idea is, as usual in any software design, that you have to make compromises; Lennart wants the best experience for what concerns desktop systems, and his compromise is that system mode is not part of his plan, and it shouldn’t be hindering him. At the same time, while he does get upset when people ask for support about it, and he wrote why it’s not supported, he hasn’t removed it (yet; if I was him, at this point I could have just removed it out of spite!). So colouring him as the master of evil does not seem the very best idea, especially as that makes me picture him in the part of Warren in the Trio, from Buffy’s season six.

Oh and a final note: it should not be surprising that Lennart and Fedora don’t care about running mpd and other services as different users; there are probably quite a few reasons for this. I cannot speak for Fedora, given I’m not involved in it, but my suppositions are, first, that the ALSA dmix plugin is somewhat scary from a security point of view (for me too) because it uses shared memory between processes from different users to do the mixing, and second, that Fedora does a lot to use SELinux even on standard desktops. This is much tighter than separating privileges with different users since it forces the processes to behave as instructed. Unfortunately on Gentoo the SELinux support seems to have gone for good, at least to me.

How to improve releases quality: working on PulseAudio 0.9.16

Just when I said that I was resuming my work as a PulseAudio maintainer in Gentoo, Lennart released a 0.9.16-test1 tarball. This was my cue to enter the scene upstream: the first test at packaging this in Gentoo failed, for a series of different reasons, some of which are internal (we don’t have the latest version of udev available yet, I hope we will by the time PulseAudio 0.9.16 final is released), but most are due to upstream changes that didn’t take into consideration some corner cases that Gentoo, as usual, gets to deal with.

So you won’t see the test1 (rc1) ebuild in the tree at all, you’ll probably have to wait for test2, and even that will require some work. For now I’ve fixed all the build- and run-time issues I’ve seen in the released tarball and git repository; plus I’ve been able to get it to properly build fully on both (Gentoo/)FreeBSD and OpenSolaris (with Prefix). I haven’t been able to experiment with actually having it playing yet, but it’ll come there at one point.

Unfortunately there are still a few shady details that I or someone else has to take care of. For instance, the tests still fail consistently: last time I tried them I got two failures on Yamato, one related to IPv6 being enabled in the PulseAudio build but not in the kernel, resulting in the IP ACL test asserting out (I’ve now fixed it, by warning of the case and not counting it as a failure); the other is the mixing test, which fails for everybody because it doesn’t know anything about the 24-bit and 24-bit-in-32-bit sample types; this I extended to support 24-bit, but was unable to do anything about the 24-bit-in-32-bit because I couldn’t grok it properly.
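For readers who have not met the two sample layouts, here is a minimal sketch of how they differ. It assumes the layout PulseAudio's sample format documentation describes for the little-endian variants: packed 24-bit uses three bytes per sample, while 24-bit-in-32-bit uses four bytes with the payload in the low 24 bits. The function names are made up for illustration and are not PulseAudio API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 24-bit two's complement value held in the low bits of v. */
int32_t sign_extend_24(uint32_t v)
{
    v &= 0x00FFFFFFu;
    /* Bit 23 set means the sample is negative: subtract 2^24. */
    return (v & 0x00800000u) ? (int32_t)v - 0x01000000 : (int32_t)v;
}

/* Packed 24-bit little-endian: three bytes per sample, no padding. */
int32_t read_s24le_packed(const uint8_t *p)
{
    assert(p != NULL);
    return sign_extend_24((uint32_t)p[0]
                          | ((uint32_t)p[1] << 8)
                          | ((uint32_t)p[2] << 16));
}

/* 24-bit-in-32-bit little-endian: four bytes per sample. Only the low
 * three bytes carry the sample; the fourth byte is ignored here. */
int32_t read_s24_32le(const uint8_t *p)
{
    assert(p != NULL);
    return sign_extend_24((uint32_t)p[0]
                          | ((uint32_t)p[1] << 8)
                          | ((uint32_t)p[2] << 16));
}
```

The decoding is the same in both cases; what changes is the stride a mixing loop uses to walk the buffer, three bytes against four.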

On non-Linux operating systems (FreeBSD and OpenSolaris), I had to work on a few more issues, like implicit declarations (there still is one in OpenSolaris), shadowed names, and of course there is some slight porting to be done, which I am nowhere near finishing yet: the shm (Shared Memory) support in FreeBSD is imperfect, and I have not implemented the “get process name” function for either operating system.

Okay, I’m not able to provide a 100% port to all the operating systems out there, but I still think I can do a bit to help out by making sure that PulseAudio won’t need to be extensively patched by all the porters out there. And until Lennart actually gets around to merging my patches, you can find all of them at gitorious so you can test them.

Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015 and the service was shut down. This means you can’t access those patches anymore.

Planning for PulseAudio

Thanks to Betelgeuse I finally have audio again on Yamato (again, thanks! — on a different note, this actually made me find out that there absolutely is a bug in ALSA that causes mmap to kill PulseAudio both with the ICE1712 and the HDA drivers), so I’m resuming my duty as PulseAudio maintainer. This is the reason why PulseAudio jumped to version 0.9.15-r50 in ~arch. So what’s up with that?

My current plan with respect to PulseAudio is to try to get 0.9.15 into stable to replace the ancient 0.9.9. What has stopped PulseAudio from going stable up to this point has been exactly two dependencies: OpenRC and libtool 2.2. Originally, the idea was to keep PulseAudio only compatible with OpenRC and no longer with baselayout 1; OpenRC was supposed to go stable pretty soon and the baselayout 1 init script was so scarily incomplete that we simply preferred not to have to support it.

Unfortunately, there is still no date for OpenRC to go stable, if it’ll go at all in its current form. At the same time, Lennart has seriously warned against system wide mode (even though there are still valid use cases for which Gentoo often is used!) so keeping the new versions off from stable for a “minor” feature that is not even recommended to be used sounds like a bad plan.

For this reason I’ve now split the ebuild in two versions: one will keep the system mode support, with the system mode warnings, the init script and all the niceties, and the other won’t, and won’t depend on OpenRC at all; the latter is what is supposed to go stable and what stable users should locally unmask if they want PulseAudio.

Let me state again: if you want a newer PulseAudio and you’re on stable, explicitly request the -r1 version, not the -r50!

Unfortunately, while I should be able to ask for stable right away as far as time and bugs are concerned, there are a few dependencies, including libtool 2.2, which is not stable yet (and I think it should be: the tinderbox hasn’t found many libtool 2.2 bugs lately and quite a few packages started requiring it, rather than just a generic libtool that 1.5 is compatible with).

I still have no real plans for the realtime support; while Lennart released rtkit (does anybody find it concerning that Linux started having packages with names vaguely similar to those from Apple’s OS X?), it needs a patched kernel, which means I should probably be pestering our kernel team to get those patches included before we can actually provide it, even optionally.

This week I hope to be able to work on mpd too, so that the Gentoo packaging plays nice with PulseAudio (right now the fact that you have to run it with a different user forces you to use a systemwide instance).

There are flags and mlags…

In my previous post about control I stated that we want to know about problems generated by compiler flags on packages, and that filtering flags is not a fix, but rather a workaround. I’d like to expand the notion by providing a few more important insights about the matter.

The first thing to note is that there are different kinds of flags you can give the compiler; some are not supposed to be tweaked by users, others can be tweaked just fine. Deciding whether a flag should or should not be touched by the user is a very tricky matter because different people might have different ideas about them. Myself, I’d like to throw my two eurocents in to show the discretion I use.

The first point is a repeat of what I already expressed about silly flags that can be summed up in “if you’re just copying the flags from a forum post you’re doing it wrong”. If you really know what you’re doing it should be pretty easy for you to never have problems with flags, on the other hand if you just copy what others did, there is a huge chance you’re going to get burned by something one day or the one after that.

Compilers are huge, complex beasts, and being able to understand how they work is not something for the average user. Unfortunately to correctly assess the impact of a flag on the produced code, you do need to know a lot about the compiler. For this reason you often find some of the flags listed as “safe flags”, and briefly explained. Myself, I’m not going to do that, I’m just going to talk abstractly about them.

The first issue comes with understanding that there are “free” and “non-free” optimisations: some optimisations, like almost all the ones enabled at -O2, don’t impose any requirement on the code that the language it is written in does not already impose; actually sometimes they even loosen things up a bit. An example of this is the dead code elimination phase, which allows functions that are only called in branches that are never executed to remain undefined at the final linking stage (as used by FFmpeg’s libavcodec to deal with optional codecs’ registration).
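The registration trick can be sketched roughly as follows. The macro and function names are hypothetical, chosen to resemble the pattern rather than copied from libavcodec. Note that the C language does not promise the program will link: it works only because the compiler drops the branch guarded by a constant zero, which GCC does in practice.

```c
#include <assert.h>

/* Hypothetical configuration macros, as a configure script might emit. */
#define CONFIG_MP3_DECODER    1
#define CONFIG_VORBIS_DECODER 0   /* this build has no Vorbis decoder */

static int registered_codecs = 0;

void register_mp3(void) { registered_codecs++; }

/* Declared but never defined in this build. */
void register_vorbis(void);

int register_all(void)
{
    assert(registered_codecs == 0);   /* registration should run once */

    if (CONFIG_MP3_DECODER)
        register_mp3();

    /* The condition is the constant 0, so the call is dead code. Once it
     * is eliminated, no reference to register_vorbis reaches the linker. */
    if (CONFIG_VORBIS_DECODER)
        register_vorbis();

    return registered_codecs;
}
```

A compiler that kept the dead branch would make the link fail with an undefined reference, which is exactly why this counts as the optimiser loosening what is accepted.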

Before GCC 4.4, at least for x86, the -O2 level also didn’t enforce (at least not really) some specifications of the C language, like strict aliasing; this reduced the chances for optimisation but loosened up the type of code that was allowed to compile properly. More than an allowance from GCC, though, this was due to the fact that the compiler didn’t have much to exploit by enforcing aliasing on register-poor architectures like x86. With GCC 4.4, relying on this is no longer possible, though.
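As an illustration of the kind of code that strict aliasing breaks, consider reading the bits of a float. The sketch below shows the rule-abiding version and leaves the classic offending cast in a comment. It assumes a platform where float is a 32-bit IEEE 754 value, and the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The classic violation looks like this:
 *
 *     uint32_t bits = *(uint32_t *)&f;
 *
 * It reads a float object through an lvalue of an unrelated type, so a
 * compiler enforcing strict aliasing may assume the two never overlap. */

/* Well-defined alternative: copy the object representation. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn the memcpy into a plain register move, so the correct version costs nothing at -O2.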

Other flags, though, do restrict the type of code that is accepted as proper and compiled, and may cause bugs that are too subtle for the average upstream developer, who would then declare custom flags “unsupported”. Unfortunately this is not some extremely rare case, it’s actually the norm for many of the upstreams we deal with in Gentoo. These flags, with the most prominent example being -ffast-math, break assumptions in the code; for instance this flag may provide slightly different results in mathematical functions, which could lead to huge domino effects in code resolving complex formulae. On a similar note, but not the same note, the -mfpmath=sse flag allows the compiler to generate SSE (instead of i387) code for floating point operations; it’s considered “safer” because it breaks an assumption that is only valid on the x86 architecture (the non-standard 80-bit size of the temporary values), and only exploited by very targeted software rather than pure C code. Indeed this is what the x86-64 compiler does by default.
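To see why letting the compiler rearrange arithmetic is not harmless, here is a small sketch showing that floating point addition is not associative. The function names are invented for the example, and the expected values assume IEEE 754 doubles compiled without -ffast-math; with that flag the compiler is allowed to treat the two functions as interchangeable.

```c
#include <assert.h>

/* Same three operands, different grouping. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

void demo_non_associative(void)
{
    /* 1e20 and -1e20 cancel first, so the 1.0 survives. */
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);

    /* -1e20 + 1.0 rounds back to -1e20, so the 1.0 is lost. */
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```

Code that depends on a particular grouping, such as compensated summation, is exactly what ends up being declared “unsupported” with custom flags.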

There are then a few flags that only work when the code is designed to make use of them; this is the case of the -fvisibility flags family, which requires the code to properly declare the visibility of its functions to work properly. Similarly, the -fopenmp flag requires the code to be written using OpenMP, otherwise it won’t magically make your software faster by using parallel optimisation (there are, though, flags that do that, but as far as I know they are quite experimental for now). Enabling these flags should be left to the actual upstream build system and not to users.
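What “designed to make use of them” means for visibility can be sketched like this. The MYLIB_ names are hypothetical; the attribute syntax is the one GCC documents. A library annotated this way can be built with -fvisibility=hidden, while one that is not annotated would simply stop exporting its public functions.

```c
#include <assert.h>

/* Hypothetical export macros for a library called "mylib". */
#if defined(__GNUC__)
#  define MYLIB_API   __attribute__((visibility("default")))
#  define MYLIB_LOCAL __attribute__((visibility("hidden")))
#else
#  define MYLIB_API
#  define MYLIB_LOCAL
#endif

#define MYLIB_VOLUME_MAX 65536   /* made-up upper bound for the example */

/* Internal helper: usable inside the library, not exported from it. */
MYLIB_LOCAL int mylib_clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Public entry point: stays exported even under -fvisibility=hidden. */
MYLIB_API int mylib_scale_volume(int volume, int percent)
{
    return mylib_clamp(volume * percent / 100, 0, MYLIB_VOLUME_MAX);
}
```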

Some flags might interfere with hand-written ASM code; for instance -fno-omit-frame-pointer (needed to get some decent output from kernel-level profilers), which is actually an un-optimisation, keeps ebp reserved as the frame pointer, and when coupled with -fPIC or -fPIE, which on x86 already take ebx, it can leave hand-written code short of registers. I myself experienced problems with -ftree-vectorize in a single case (on x86-64 at least; on x86 I know it has created faulty code more than once, though whether this is a GCC bug or some broken assumption, I have no idea): with mplayer’s mp3lib and a hand-written piece of asm code that didn’t use local labels, the flag duplicated a code path and the pasted code from the asm() block tried to declare the same (global) label twice.

Finally, some flags, like -fno-exceptions and -fno-rtti for C++, can enable some pretty heavy optimisation, but should never be used except by upstream. Using them yourself will cause some hard to track down issues, like the ones that Ardour devs complained about, as you’re actually disabling some pretty big parts of the language, in a way that makes the resulting ABI pretty much incompatible between libraries.

And I almost forgot the most important thing to keep in mind: the code most optimised for execution speed is not always the fastest, which is why on the first models of x86-64 CPUs, the code produced by -Os sometimes performed better than the code produced by -O2. In this case, the relatively small L2 cache on the CPU could slow down the execution of the most aggressively optimised code because it was larger and couldn’t fit in the cache. The simplest example to understand this is to think about unrolled loops: a loop is inherently slower than the unrolled code: it needs an iterator that might not be needed otherwise, it requires jumping back up the instruction stream, it might require actually moving a cursor of some sort. On the other hand, especially for big loop bodies (with inline and static functions included of course), unrolling the loop might result in code that requires lots of cache fetches, while smaller loops that can be entirely kept in cache might not take that much time to jump back since the code is already there.
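For those who have never looked at what unrolling does, here is a hand-written sketch of the same transformation the compiler applies. The function names are invented for the example; both functions compute the same sum, the second simply trades code size for fewer loop-control instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled: one add, one counter increment and one jump per element. */
long sum_rolled(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by four: a quarter of the jumps, but a larger loop body
 * plus a tail loop for the leftover elements. */
long sum_unrolled4(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)   /* remaining 0 to 3 elements */
        s += v[i];
    return s;
}
```

With a body this small the unrolled version still fits in cache easily; the trade-off described above only bites when the body is large or the unroll factor is high.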

So what is the bottom line of this post? One could argue that the solution is to leave it to the compiler; but as Mike points out repeatedly, ICC at least is beating the crap out of GCC on newer chips (at least Intel chips; I also have some concerns about their use of tricks like the ones I mentioned above about unsafe floating point optimisations, but I don’t want to go there today). So the compiler might not really know much better.

My suggestion is to leave it to the experts; while I don’t like the idea of making it an explicit USE flag to use your own CFLAGS definition (I also want the control), we shouldn’t, usually, be overriding upstream-provided CFLAGS if they are good. Sometimes though they might require a bit of help; for instance in the case of xine I can remember the original CFLAGS definition being pretty much crazy, with all the possible optimisations forced on even when they don’t produce that good of a result on average at all. I guess it’s all a bet.

Gentoo maintainer note and help-call: it seems like either my PCI sound card fried or there is some nasty bug in the ALSA driver for it that I don’t really have the time to deal with (I’ll be updating my previous post about it since after a few more tries it turned out not to be related to the hardware outside of Yamato). This has already been a problem for the past two months or so, since kernel 2.6.29 didn’t work properly, and it is starting to be a big deal. My contributions to PulseAudio, especially on the Gentoo side, have been quite hectic because of it, and the package is in serious need of ordinary and extraordinary work.

I might just go out one day this week and fetch a new USB card, but to be honest I’d like to avoid that for now (I have already had enough hardware failures in the past months, and a few more hardware bits that I had to replace or buy for other reasons, as well as a scheduled acquisition of one or two eSATA disks to move around data that I no longer have space for). So I added one USB soundcard (suggested by Lennart as working fine under Linux) to my wishlist (thanks to the fact that Amazon now ships electronics components to Italy, whooo!) but I could just use some old Linux-supported card if somebody had one to give me; my only requirement is for it to support digital output (iec958, S/PDIF), it really doesn’t matter whether it uses coaxial or optical cable; I admit coaxial might be a bit nicer (so that the receiver can deal with both Yamato and Merrimac, with the latter only providing optical), but really either is fine.

Yes I know this sounds a lot like a shameless plug – it probably is – but I’ve got over 1300 bugs open in Bugzilla, and Yamato is crunching its hard drives to find the issues before they hit users, I guess you can let me have this plug, can’t you? Thanks.