A new XBMC box

A couple of months ago I was at LinuxTag in Berlin with the friends from VideoLAN and we shared a booth with the XBMC project. It was interesting to see the newest version of XBMC running, and I decided that it was time for me to get a new XBMC box: the last time I used XBMC was on my AppleTV, and while it was not strictly disappointing, it was not terrific either after a while.

At any rate, we spoke about what options are available nowadays to make a good XBMC setup, and while the RaspberryPi is all the rage, my previous experience with the platform made it a no-go. It also requires you to find somewhere to store your data (the USB support on the Pi is not good for many things) and you will most likely have to re-encode anime to the Right Format™ so that the RPi VideoCore can properly decode it: anything that can’t be hardware-accelerated will not play on such limited hardware.

The alternative has been the Intel NUC (Next Unit of Computing), which Intel sells in pre-configured “barebone” kits, some of which include wifi antennas, 2.5” disk bays, and a CIR (Consumer Infrared Receiver) that allows you to use a remote such as the one for the XBox 360 to control the unit. I decided to look into the options and I settled on the D54250WYKH, which has a Core i5 CPU, space for both a wireless card (I got the Intel 7260 802.11ac, which is dual-radio and supports the new 11ac protocol, even though my router is not 11ac yet) and an mSATA SSD (I got a Transcend 128GB one), as well as the 2.5” bay that allows me to use a good old spinning-rust hard drive to store the bulk of the data.

Be careful and don’t repeat my mistake! I originally ordered a very cool Western Digital Caviar Green 2TB HDD but while it is a 2.5” HDD, it does not fit properly in the provided cradle; the same problem used to happen with the first series of 1TB HDDs on PlayStation 3s. I decided to keep the HDD and bring it with me to Ireland, as I don’t otherwise have a 2TB HDD, instead I opted for a HGST 1.5TB HDD (no link for this one as I bought it at Fry’s the same day I picked up the rest, if nothing else because I had no will to wait, and also because I forgot I needed a keyboard).

While I could have just put OpenELEC on the device, I decided instead to install my trusted Gentoo: a Core i5 with 16GB of RAM and a good SSD is well within its ability to run it. And since I was finally setting something up that needs (for myself) to turn on very quickly, I decided to give systemd a go (especially as Robbins is now considered a co-maintainer for OpenRC, which drains all my will to keep using it). The effect has been stunning, but there are a few issues that need to be ironed out; for instance, as far as I can tell, there is no unit for rngd, which means that both my laptop (now converted to systemd) and the device have no entropy, even though they both have the rdrand instruction; I’ll try to fix this lack myself.

Another huge problem for me has been getting the audio to work; while I’ve been told by the XBMC people that the NUCs are perfectly well supported, I couldn’t for the life of me get the audio to work for days. In the end it was Alexander Patrakov who pointed me to intel_iommu=on,igfx_off as a kernel option to get it to work (kernel bug #67321 still unfixed). So if you have no HDMI output on your NUC, that’s what you have to do!

Speaking about XBMC and Gentoo, the latest version as of last week (which was not the latest upstream version, as a new one got released exactly while I was installing the box) seems to force you to install FFmpeg over libav – I honestly felt a bit sorry for the developers of XBMC at LinuxTag while they were trying to tell me how the multi-threaded h264 decoder from FFmpeg is great… Anton, who wrote it, is a libav developer! – but even after you do that, it seems like it does not link it in, preferring a bundled copy of it instead. Which also doesn’t seem to build support for multithreading (uh?). This is something that I’ll have to look into once I’m back in Dublin.

Other than that, there isn’t much to say; the one remaining big issue is to figure out how to properly have XBMC start up at boot without nasty autologin hacks on systemd. And of course finding a better way than using a transmission user to start the Transmission daemon, or at least find a better way to share the downloads with XBMC itself. Probably separating the XBMC and Transmission users is a good idea.

Expect more posts on what’s going on with my XBMC box in the future, and take this one as a reference about the NUC audio issue.

Looking for symbols? elfgrep to the rescue!

About three years after starting my work on Ruby-Elf I finally implemented one of the scripts I wanted to write for the longest time: elfgrep. As the name implies, it’s a tool with a grep-like interface to look up symbols defined and used in ELF files.

I have avoided writing it for a long time because scanelf (part of pax-utils) implements already a similar, but definitely not identical, feature through the -gs options. The main feature missing in scanelf is the ability to look for multiple symbols at once: it does allow you to specify multiple symbols, but then again it only prints the first one found, rather than all of them.

The other night, mru from FFmpeg pointed out to me another limitation of scanelf: it cannot be used to look for symbols depending on their version information (for GNU systems). So I finally decided to start writing my own. Thankfully, Ruby-Elf was designed to be easy to extend, if anything, so the original implementation to do the job it was aimed for only required 83 lines of Ruby code, including my license header.

Right now the implementation is a bit more complex, and so it has more lines of code, but it implements a number of switches analogous to those in grep itself, which makes it a very flexible tool to find both definitions and uses of symbols: you can either look for the library defining a given symbol or the objects making use of it; you can get the type of symbols (it has an output similar to nm(1)), or you can simply list the files that matched or that didn’t match. You can also count symbols, without having to go through wc -l, thanks to the -c option, and the list output is suitable to use with xargs -0 as well.

Most of the time, when analysing the output of a library, I end up having to do something like nm | grep; this unfortunately doesn’t work that well when you have multiple files, as you lose sight of the file that actually hits; elfgrep solves this just fine as it prefixes the file’s path to the nm-like output, which makes it terrific to identify which object file exports a given symbol, for instance.
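To illustrate why prefixing the path matters, here is a minimal Python sketch of the idea. It is not how elfgrep is implemented (that is Ruby and reads the ELF files directly); it only filters text in the format nm(1) prints, and the function name and the sample nm output are made up for the example.

```python
import re


def grep_symbols(nm_outputs, pattern):
    """Return 'path: TYPE name' lines for every symbol whose name matches.

    nm_outputs maps a file path to the text that nm printed for that file.
    """
    regex = re.compile(pattern)
    hits = []
    for path, text in nm_outputs.items():
        for line in text.splitlines():
            fields = line.split()
            # nm prints "address type name" for defined symbols and
            # "type name" for undefined ones, so take the last two fields.
            if len(fields) < 2:
                continue
            sym_type, name = fields[-2], fields[-1]
            if regex.search(name):
                hits.append(f"{path}: {sym_type} {name}")
    return hits


# Hypothetical nm output for two hypothetical libraries.
sample = {
    "libfoo.so": "0000000000001139 T foo_init\n                 U malloc\n",
    "libbar.so": "0000000000001150 T bar_init\n0000000000001170 T foo_helper\n",
}

for hit in grep_symbols(sample, r"^foo_"):
    print(hit)
```

With a plain nm | grep over several files, the two matching lines would look identical in shape and nothing would say which library each came from; keeping the path next to each hit is what makes the output usable.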

All in all, I’m very very happy at how elfgrep turned out to be, so I’ll likely try to make a release of ruby-elf soonish; but to do so I have to make it a Ruby Gem, just for the sake of ease of distribution; I’ll look at it in the next week or so. In the mean time you can find the sources on the project’s page and on my overlay you find an ebuild that installs it from Git until I make a release (I’ll package it in main tree as soon as it is!).

If you have any particular comment, patch, request, or anything like that, feel free to send me an email, you find the references above.

Some new notes about AppleTV

Another braindump so I can actually put in public what I’ve been doing in my spare time lately, given that most likely a lot of that won’t continue in the next months, as I’m trying to find more stable, solid jobs than what I’ve been doing as of lately.

If you have followed me for a long time you might remember that a few years ago I bought an AppleTV (don’t laugh) for two main reasons: because I actually wanted something in my bedroom to watch Anime and listen to music, and because I was curious about its implementation from a multimedia geek point of view. Now, a lot of what I have seen with the AppleTV is negative, and I’m pretty sure Apple noticed it just as well as I have. Indeed they learnt from a lot of their previous mistakes with the release of the new AppleTV. Some of the “mistakes they learnt from” would probably not be shared by Free Software activists and hackers, as they were designed to keep people out of their platform, but that’s beside the point now.

The obvious problems (bulkiness, heat, power) were mostly fixed in hardware by moving from a mobile i686-class CPU to an ARM-based embedded system; the main way around their locks (the fact that the USB port is a standard host one, not a gadget one, and it only gets disabled by the lack of the Darwin kernel driver for USB) is also dropped, but only to be replaced with the same jailbreak situation they have on iPhone and other devices. So basically while they tried to make things a lot more difficult, the result is simply that it got hacked in a different way. While it definitely looks sleeker to keep near your TV, I’m not sure I would have bought it if it was released the first time around this way.

At any rate, the one I have here is in its last months, and as soon as I can find something that fits into its space and on which I can run XBMC (fetching videos out of a Samba share on Yamato), I’ll probably simply get rid of it, or sell it to some poor fellow who can’t be bothered with getting something trickier but more useful. But while I want the device to actually accept the data as I have it already for what concerns Anime and TV series (assuming I can’t get them under decent terms legally), some months ago I decided that at least the music can bend over to the Apple formats — for the simple reason that they are quite reasonable, as long as I can play them just fine in Europe.

Besides a number of original music CDs (Metal music isn’t really flattered by the compression most online music stores apply!), I also have (fewer) music DVDs with videos and concerts; plus I sometimes “procure” myself Japanese music videos that haven’t been published in the western world (I’m pretty much a lover of the genre, but they don’t make it too easy to get much of it here; I have almost all of Hikaru Utada’s discography in original forms though). For the former, Handbrake (on OS X) did a pretty good job, but for the new music videos, which are usually in High Definition, it did a pretty bad job.

Let’s cue back FFmpeg, which, since last time I ranted actually gained a support for the mov/mp4 format that is finally able to keep up with Apple (I have reported some of the problems about it myself, so while I didn’t have a direct bearing in getting it to work, I can say that at least I feel more confident of what it does now). To be honest, I have tried doing the conversion with FFmpeg a few times already; main problem was to find a proper preset for x264 that didn’t enable features that AppleTV failed to work with (funnily enough, since Handbrake also uses x264, I know that sometimes even though iTunes does allow the files to be copied over the AppleTV, they don’t play straight). Well, this time I was able to find the correct preset on the AwkwardTV wiki so after saving it to ~/.ffmpeg/libx264-appletv.ffpreset the conversion process seemed almost immediate.

A few tests afterward, I can tell it’s neither immediate in procedure, nor in time required to complete. First of all, iTunes enforces a frame size limit on the file; while this is never a problem for content in standard definition, like the stuff I ripped from my DVDs, this can be a problem with High-Definition content. So I wrote a simple script (that I have pasted online earlier tonight but I’ll publish once polished a bit more) that through ffprobe, grep and awk could maintain the correct aspect ratio of the original file but resize it to a frame size that AppleTV is compatible with (720p maximum). This worked fine for a few videoclips, but then it started to fail again.
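The arithmetic behind such a script is short enough to sketch. What follows is not the ffprobe/grep/awk script itself, just a Python illustration of the resize step; the function name is made up, the 1280×720 box is the assumed limit at this point in the story, and rounding down to even dimensions is an extra assumption (H.264 encoders commonly want even sizes).

```python
def fit_frame(width, height, max_width=1280, max_height=720):
    """Shrink (width, height) to fit the box while keeping the aspect ratio.

    Integer arithmetic only, so there are no floating point surprises.
    Frames that already fit are left alone (no upscaling).
    """
    if width <= max_width and height <= max_height:
        new_width, new_height = width, height
    elif width * max_height >= height * max_width:
        # At least as wide as the box: the width is the limiting side.
        new_width = max_width
        new_height = height * max_width // width
    else:
        # Narrower than the box: the height is the limiting side.
        new_height = max_height
        new_width = width * max_height // height
    # Round both sides down to an even number of pixels.
    return new_width - new_width % 2, new_height - new_height % 2


print(fit_frame(1920, 1080))  # (1280, 720)
print(fit_frame(1440, 1080))  # (960, 720)
print(fit_frame(720, 576))    # (720, 576)
```

The limit is a parameter rather than a constant because, as it turns out, the right box is not necessarily the one you first expect.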

Turns out, 1280×720 which is the 720p “HD Ready” resolution, is too much for AppleTV. Indeed, if you use those parameters to encode a video, iTunes will refuse to sync it over to the device. A quick check around pointed me at a possible reasoning/solution. Turns out that while all the video files have a Source Aspect-Ratio of 16:9, their Pixel Aspect-Ratio is sometimes 1:1, sometimes 4:3 (see Wikipedia’s Anamorphic Widescreen article for the details and links to the description of SAR and PAR). While Bluray and most other Full HD systems are designed to work fine with a 1:1 PAR (non-anamorphic), AppleTV doesn’t, probably because it’s just HD Ready.

So a quick way to get the AppleTV to accept the content is simply to scale it back to anamorphic widescreen, and you’re done. Unfortunately that doesn’t seem to cut it just yet; I have at least one video that doesn’t work even though the size is the same as before. Plus another has 5.1 (6-channels) audio, and FFmpeg seems to be unable to scale it back to stereo (from and to AAC).
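For the anamorphic case the relation is simply that the displayed aspect ratio equals the stored width over the stored height, multiplied by the pixel aspect ratio. A small Python sketch of that arithmetic, with a made-up function name, and making no claim about which exact frame sizes the AppleTV accepts:

```python
from fractions import Fraction


def storage_width(height, display_aspect, pixel_aspect):
    """Stored width (in pixels) for a frame of the given stored height.

    display_aspect is the shape of the picture on screen (16:9 here),
    pixel_aspect the shape of a single stored pixel.
    """
    return int(height * display_aspect / pixel_aspect)


# Square pixels: a 16:9 picture that is 720 lines tall needs 1280 columns.
print(storage_width(720, Fraction(16, 9), Fraction(1, 1)))  # 1280
# 4:3 pixels: the same picture fits in 960 columns, stretched on playback.
print(storage_width(720, Fraction(16, 9), Fraction(4, 3)))  # 960
```

The resulting width still has to be passed to the encoder together with the matching aspect ratio flag, otherwise the player will show the picture squeezed.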

I do support FSFE… it’s positive!

A Free Coffee

I have, before, written about my concerns regarding the way the Free Software Foundation is working nowadays, and the fact that I feel RMS is taking too seriously his role as a “semi-religious” figure (and the whole “Church of Emacs” business). On the other hand, I’m happy to be a supporter of Free Software Foundation Europe. I do find the two taking pretty different stances on a lot of things.

Before leaving for FOSDEM, I read (and re-dented) Lydia’s link to a post by Joe Brockmeier (of OpenSUSE fame — of course there will be a vocal minority that will find his involvement with OpenSUSE, and thus Novell, as a bad sign to begin with, I feel happy that I’m not that closed minded) that summarised quite well my feeling with the way Free Software Foundation is behaving nowadays. Let me quote Joe:

Update (2017-04-21): Joe’s article is gone from the net and I can’t even find it on the Wayback Machine. Luckily I quoted it here!

It isn’t that the folks at the Free Software Foundation are wrong that DRM is bad for users, it’s that they are taking an entirely negative and counter-productive approach to the problem. Their approach to “marketing” may resonate with some in the FLOSS community, but their efforts are not at all likely to win hearts and minds of users who don’t get out of bed in the morning singing the Free Software Song.

While Defective By Design highlights legitimate problems with the iPad (and other products) where are the alternatives? Stop telling people what they shouldn’t buy, and make it easier for them to get hands on some kit that lets them do what they want to do with free software. In other words, stop groaning about Apple and deliver a DRM [I guess he meant DRM-free here — Flameeyes] device of your own, already.

And I agree with him wholeheartedly (of course as long as my note above is right): we should propose alternatives, and they need to be valid alternatives. When I say that I use an iPod because it has 80GB of storage space on it (well, my current, old version has 80GB, newer versions have 160GB of course), people suggest, as an alternative, that I not carry around so much music. Well, I do want to carry around that much music! If you can get me a player with an equivalent disk space and featureset I’d be grateful to get rid of Apple’s lock-ins… while that’s not available, I don’t really care about reducing my music library, as long as I can use it with Rhythmbox and other Free Software tools.

On the other hand, I cannot praise enough one in particular of the FSFE projects: PDFreaders.org website. Instead of telling the users how bad Adobe is, the site provides them with valid alternatives, specific to their operating system! This includes even the two biggest proprietary operating systems, Windows and Mac OS X. Through this website I actually was able to get more people used to Free Software, as they are glad to use something that is, in many ways, better than Adobe’s own Reader.

As I keep repeating, to bring Free Software to the masses, we need to be able to reach and improve over the quality of proprietary software. We are able to do that, we did so before, and we keep doing so in many areas (it’s definitely not a random chance that FFmpeg is one of the most widely used Free Software projects, sometimes even unbeknownst by its users, on the most varied platforms). When we settle for anything less, we’re going to lose. When we say that something is better and everybody should use that just because it’s Free, then we’re deluding ourselves.

I’m not sure what will happen with OpenOffice now that Oracle ate Sun as a snack, but if this will bring enough change in the project, it might actually make it really go mainstream. Right now, myself, I feel it has so many holes that it’s not even funny… on the other hand, as I wrote, it has some very important strong points, including the graphing capabilities (not charting!), and of course, the fact that it is Free Software.

FOSDEM 2010 Recap

So, for the first time in my life, if we exclude the local Linux Day events, I attended a conference! FOSDEM 2010 has been my first time properly meeting other developers out there. It actually was a bit more travel than just Bruxelles, for me; I actually took a long way to get there. Since I was still afraid of planes, I didn’t want to go up there alone. Add to that the fact that I’m not used to the Bruxelles area, nor do I speak any decent French any more (I studied it in middle-school, so I could at least ask for, and listen to, directions, but in over ten years of not using it, it really just went away), so I got there with Luca, who lives in Turin (on the other side of Italy).

The end result looks something like this: I left Mestre (the Venice inland city, which is where I actually live) by train, I changed in Milan, then arrived in Turin; I went to dinner with some friends I only met online before (colleagues and fellow Ultima OnLine players), and slept at Alessandro’s – from lscube – flat. In the morning me and Luca took the plane for Rome, then changed to the one for Bruxelles. Our luggage decided to take a later plane (d’oh!). The same travel (minus the luggage nuisance, fortunately) applied to the way back. This resulted in something like five trains (one from the Bruxelles Airport to the Gare du Nord — we took a cab to go back), and four planes. I think my fear of planes was totally cured this time.

FOSDEM itself was lots of fun! I finally met lots of other Gentoo developers (including Luca for the first time), the other FFmpeg guys, some of the VLC guys, and quite a few users who knew me, even though I didn’t know them before, which I have to say has a nice feeling to it. And I even met with a Mono team delegation, and with the one guy that I had a rough start with – Jo Shields, “directhex” – I should report every misunderstanding is cleared. I was also able to (very briefly) meet Lennart, but that was when Luca and I really had to hurry to catch our plane back.

I really would have liked to stay the whole Sunday and leave on Monday, but Luca was actually due to be back in Turin for other reasons, so we had to leave early on Sunday to get back to Italy before all planes stopped flying.

Now, during FOSDEM I picked up a few extra tasks on top of all the stuff that I had already planned, and this means that the next few days will leave me almost no time to breathe, to take a break, or to go out with friends. That’s fine, I had four days that relaxed me quite a bit, so this is not too bad to do. Just so I can name some of the tasks that I’m looking forward to, besides the key signing (that was a “cool” party… even though it was maybe too cold): writing something more about release notifications, as it seems like I’m not the only person having a problem with that; trying to write some more about upstreaming patches; packaging SIP Communicator – a demo of which was available next to the FFmpeg stand in the AW building… looked very promising; and getting a hash table implementation in libavutil for FFmpeg, so that we can use it on feng and libnemesi and thus get a good parser, finally!

Anyway this is enough for today, hope the other people at FOSDEM found it at least as fun; for me it is time to hit (finally, my) bed.

Application Binary Interface

Following my earlier post about libtool archives, Jorge asked me how they relate to the numbers that come at the end of shared objects; mostly, they don’t, with the exception that the libtool archive keeps a list of all the symlinks with different suffix components for a given library.

I was, actually, already planning on writing something about shared object versioning since I did note about the problems with C++ libraries related to the “Ardour Problem”. Unfortunately it requires first some knowledge of what composes an ABI, and that requires me to write something before going deep in shared object versioning. And this hits on my main missing necessity right now: time. Since I have now more or less two jobs at hand, the time I can spare is dedicated more to Gentoo or my personal problems than writing in-depth articles for the blog. You can, anyway, always bribe me to write about a particular topic.

So let’s start with the basics of what the Application Binary Interface (ABI) is, in the smaller context that I’m going to care about, and how it relates to the shared object versioning topic I wanted to discuss. For simplicity, I’m not going to discuss issues like the various architecture-dependent calling conventions, and, for now, I’m also focusing on software written in C rather than C++; the ABI problems with the latter are an order of magnitude worse than the ones in C.

Basically, the ABI of a library is the interface between that and the software using it; it relates to the API of the interface, but you can maintain API-compatibility and break ABI-compatibility, since in the ABI, you have to count in many different factors:

  • the callable functions, with their names; adding a function is a backward-compatible ABI change (although it also means that you cannot execute something built against the newer library on a system with an older one), removing or renaming a function breaks ABI;
  • the parameters of the function, with their types; while restricting the parameters of a function (for instance taking a constant string rather than a non-constant string) is ABI-compatible, removing those restrictions is an ABI breakage; changing compound or primitive type of a parameter is also an ABI change, since you change their meaning; this is also why using parameters with types like off_t is bad (it depends on the feature-selection macros used);
  • the returned value of functions; this does not only mean the type of it, but also the actual meaning of the value;
  • the size and content of the transparent structures (i.e.: the structures that are defined, and not just declared, in the public header files);
  • any major API change also produces an ABI change (unless symbol versioning is used to keep backward-compatibility); it’s particularly important to note that changing how a dynamically-allocated returned value is allocated does change API and ABI if there is not a symmetrical function to free it; this is why, even for very simple data types, you should have a symmetrical alloc/free interface.
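The point about transparent structures is the easiest one to see with numbers. The following is a small Python sketch using ctypes to mimic the C layout of a structure in two versions of a public header; the structure and its fields are hypothetical, not taken from any real library.

```python
import ctypes


class PointV1(ctypes.Structure):
    # Layout as published in version 1 of a hypothetical public header.
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


class PointV2(ctypes.Structure):
    # Version 2 adds a field in the middle. Source code that only touches
    # x and y still compiles (API-compatible), but the layout has changed.
    _fields_ = [
        ("x", ctypes.c_int32),
        ("flags", ctypes.c_int32),
        ("y", ctypes.c_int32),
    ]


for struct in (PointV1, PointV2):
    print(struct.__name__, "size:", ctypes.sizeof(struct),
          "offset of y:", struct.y.offset)
```

A program built against version 1 reads y at offset 4 and allocates 8 bytes; handed a version 2 library, it would find flags where it expects y, and the library would write past the end of the program's 8-byte buffer. Nothing in the source changed, which is exactly why API-compatibility says nothing about ABI-compatibility here.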

Now there are a few important notes about what I just wrote, and to explain them I want to use FFmpeg as an example; it is often said that FFmpeg has no warranties of ABI compatibility with the same shared object version (I’ll return to that at another time); this is false because FFmpeg developers do pay attention to keep the public ABI compatible between releases as long as the released shared object has the same version. What they don’t guarantee is the ABI-compatibility for internal symbols, and software like VLC, xine and GStreamer used to use the internal symbols without thinking about it twice.

This is why it’s important to use symbol visibility to hide the internal symbols: once you have hidden them you can do whatever you want with them, since no software can rely on them, and have subtle crashes or misbehaviour because of a change in them. But that’s a topic for another day.

Miracle on the nth try: OpenSolaris on KVM

So after my previous post about virtualisation software I decided to spend some extra time on trying out KVM, manually. Having to manually set the macaddress every time is a bit obnoxious but thanks to alias I can do that at least somewhat fine.

KVM is also tremendously faster compared with QEmu 0.10 using kqemu; I’m curious to see how the thing will change with the new 2.6.29 kernel where QEmu will be able to use the KVM device itself. At any rate, the speed of FreeBSD in the KVM virtual system is almost native and worked quite nicely. It also doesn’t hog the CPU when it’s idling, which is quite fine too.

As I’ve written, though, OpenSolaris also refused to start; after thinking a bit around, I thought about the amount of memory and… that was it. With the default 128MB of RAM provided by KVM and QEmu, OpenSolaris cannot even start the text-mode installation. Giving it 1 GB of memory actually made it work. Fun.

As Pavel points out in the previous post, though, the default QEmu network card will blatantly fail to work with OpenSolaris; Jürgen is right when he says that OpenSolaris is quite picky with its hardware. At any rate the default network card for KVM (RTL8169) seems to work just fine. And networking is not lagged like it is on VirtualBox, at all.

I’ve now been working on getting Gentoo Prefix on it already, and then I’ll probably resume my work on getting FFmpeg to build, since I need that to work on lscube. For now, though, it’s more a matter to have it installed.

Later this week I’ll probably also make use of its availability to work on Ruby-Elf more and in particular on the two scripts I want to write to help identify ABI changes and symbol collisions inside a given executable, that I promised in the other previous post.

More tinderboxing, more analysis, more disk space

Even though I had a cold I’ve kept busy in the past few days, which was especially good because today was most certainly Monday. For the sake of mental sanity, I’ve decided a few months ago that the weekend is off work for me, and Monday is dedicated to summing up what I’m going to do during the rest of the week, sort of a planning day. Which usually turns out to mean a lot of reading and very little action and writing.

Since I cannot sleep right now (I’ll have to write a bit about that too), I decided to start with the writing to make sure the plans I figured out will be enacted this week. Which is especially considerate to do considering I also had to spend some time labelling, as usual this time of the year. Yes I’m still doing that, at least until I can get a decent stable job. It works and helps paying the bills at least a bit.

So anyway, you might have read Serkan’s post regarding the java-dep-check package and the issues that it found once run on the tinderbox packages. This is probably one of the most interesting uses of the tinderbox: large-scale testing of packages that would otherwise keep such a low profile that they would never come out. To make more of a point, the tinderbox is now running with the JAVA_PKG_STRICT variable set so that the Java packages will have extra checks and would be much more safely tested on the tree.

I also wanted to add further checks for bashisms in the configure script. This sprouted from the fact that, on FreeBSD 7.0, the autoconf-generated configure script does not discard the /bin/sh shell any longer. Previously, the FreeBSD implementation was discarded because of a bug, and thus the script re-executed itself using bash instead. This was bad (because bash, as we should really well know, is slow) but also good (because then all the scripts were executed with the same shell on both Linux and FreeBSD). Since the bug is now fixed, the original shell is used, which is faster (and thus good); the problem is that some projects (unieject included!) use bashisms that will fail. Javier spent some time trying to debug the issue.

To check for bashisms, I’ve used the script that Debian makes available. Unfortunately the script is far from perfect. First of all it does not really have an easy way to just scan a subtree for actual sh scripts (using egrep is not totally fine since autoconf m4 fragments often have the #!/bin/sh string in them). Which forced me to write a stupid, long and quite faulty script to scan the configure files.
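The egrep problem comes from matching the shebang anywhere in the file instead of only on its first line. A minimal Python sketch of the stricter check follows; it is not the script mentioned above, and the function name and the sample files are invented for the example.

```python
import os
import tempfile


def is_sh_script(path):
    """Return True if the very first line of the file is a /bin/sh shebang."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return False
    # Accept both "#!/bin/sh" and "#! /bin/sh", with or without arguments.
    words = first_line[2:].split()
    return bool(words) and words[0] == b"/bin/sh"


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "configure")
    fragment = os.path.join(tmp, "fragment.m4")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    with open(fragment, "w") as handle:
        # Here the shebang sits inside a macro body, not on the first line.
        handle.write("AC_DEFUN([MY_CHECK], [\n#!/bin/sh\n])\n")
    results = (is_sh_script(script), is_sh_script(fragment))

print(results)  # (True, False)
```

Walking a subtree with os.walk and keeping only the files for which this returns True would give the list of candidates to feed to the bashism checker.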

But even worse, the script is full of false positives: instead of actually parsing the shell code, it only scans for substrings. For instance it identified the strange help output in gnumeric as a bash-specific brace expansion, when it was inside a HEREDOC string. Instead of this method, I’d rather have a special parameter in bash that tells the interpreter to output warnings about bash-specific features; maybe I should write it myself.
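The heredoc false positive is easy to reproduce. Below is a Python sketch, not the Debian script: a naive line-by-line pattern match next to a variant that skips heredoc bodies. Both regular expressions and the sample script are assumptions made up for the illustration, and the second variant is still a heuristic rather than a real parser.

```python
import re

# Rough pattern for brace expansion such as file.{c,h}.
BRACE_EXPANSION = re.compile(r"\{[^{}\s]*,[^{}\s]*\}")
# Start of a heredoc: <<EOF, <<-EOF, <<'EOF' or <<"EOF".
HEREDOC_START = re.compile(r"<<-?\s*(['\"]?)(\w+)\1")


def naive_scan(script):
    """Line numbers where the pattern appears anywhere at all."""
    return [number for number, line in enumerate(script.splitlines(), 1)
            if BRACE_EXPANSION.search(line)]


def heredoc_aware_scan(script):
    """Same check, but lines inside a heredoc body are ignored."""
    hits = []
    terminator = None
    for number, line in enumerate(script.splitlines(), 1):
        if terminator is not None:
            # Inside a heredoc: only look for the line that ends it.
            if line.strip() == terminator:
                terminator = None
            continue
        if BRACE_EXPANSION.search(line):
            hits.append(number)
        match = HEREDOC_START.search(line)
        if match:
            terminator = match.group(2)
    return hits


SCRIPT = """\
cat <<EOF
  --with-foo={yes,no}   enable foo
EOF
cp file.{c,h} /tmp
"""

print(naive_scan(SCRIPT))          # [2, 4]
print(heredoc_aware_scan(SCRIPT))  # [4]
```

Line 2 is just help text and runs fine under any shell; only line 4 is an actual bashism. Every construct handled this way is one more special case, though, which is why having the interpreter itself issue the warnings would be the more robust approach.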

But I think that there are some things that should be addressed in a much different way than the tinderbox itself. Like I have written before, there are many tests that should actually be executed on source code, like static analysis of the source code, and analysis of configure scripts so as to fix issues like canonical targets when they are not needed, or misaligned ./configure --help output, and so on and so forth. These kinds of scans should not be applied only to released code, but more importantly to the code still in the repositories, so that the issues can be killed before the code is released.

I had this idea when I went to look for different conditions on Lennart’s repositories (which are as usual available on my own repositories with changes and fixes and improvements on the buildsystem – a huge thanks to Lennart for allowing me to be his autotools-meister). By build-checking his repositories before he makes a release I can ensure the released code works for Gentoo just fine, instead of having to patch it afterwards and queue the patch for the following release. It’s the step beyond upstreaming the patches.

Unfortunately this kind of work is difficult, and not only because it’s hard to write static analysis software that gets good results; the US DHS-funded Coverity Scan, although lauded by people like Andrew Morton, had tremendously bad results in my opinion with the xine-lib analysis: lots of issues were never reported, and the ones reported were often enough either false positives or inside the FFmpeg code (which xine-lib used to import); and the code was almost never updated. If it hadn’t picked up the change to the Mercurial repository, that would have been understandable, I don’t expect them to follow the repository moves of all the projects they analyse, but the problem was there since way before the move. And it also reported the same exact problems each and every day, repeated over and over; for a while I tried to keep track of them and marked the ones we had already dealt with, or which were false positives, or were parts of FFmpeg (which may even have been fixed already).

So one thing to address is to have an easy way to keep track of various repositories and their branches, which is not so easy since all SCM programs have different ways to access the data. Ohloh Open Hub has probably lots of experience with that, so I guess that might be a start; it has to be considered, though, that Open Hub only supports the three “major” SCM products, GIT, Subversion and the good old CVS, which means that extending it to any repository at all is going to take a lot more work, and it had quite a bit of problems with accessing Gentoo repositories, which means that it’s certainly not fault-proof. And even if I was able to hook up a similar interface on my system, it would probably require much more disk space than I’m able to have right now.

For sure now the first step is to actually write the analysis script that first checks the build logs (since anyway that would already allow to have some results, once hooked up with the tinderbox), and then find a way to identify some of the problems we most care about in Gentoo from static analysis of the source code. Not an easy task or something that can be done in spare time so if you got something to contribute, please do, it would be really nice to get the pieces of the puzzle up.

Can you make it?

After my post about AppleTV conversion with FFmpeg I’ve been working on trying to streamline my conversion procedure even further. The low quality of the conversion is rarely an issue for me, since what I’m uploading in there is usually something I watch before I can get to the DVDs. I sometimes have to add borders, or the subtitles would disappear over the edge of the TV, but not always, since the content might not be subtitled.

Unfortunately it seems like GNU make has issues where paths with spaces are involved, and there are some other things that are clumsy and require me to write the same rule twice to get it to work properly. In the end it doesn’t seem like GNU make is the right tool for the job, so I went looking for something different.

The rake option was discarded right away since rake does not execute tasks in parallel, so I kept looking for something else. I just needed a make-workalike, rather than a build system, or even a scripting language with proper support for parallelising, something like bash’s for construct but with a number of concurrent jobs applicable to it.
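For reference, the kind of “for loop with a job limit” described here can be approximated with xargs. This is only a sketch under assumptions: `run_parallel` is a made-up name, the echo stands in for the real FFmpeg invocation, and the `-0` and `-P` options are GNU/BSD extensions rather than POSIX.

```shell
# Run one job per argument, with at most $1 jobs at the same time.
# The echo is a placeholder for the real conversion command.
run_parallel() {
    max_jobs=$1
    shift
    # NUL-separated names survive spaces in paths; -0 and -P are
    # extensions found in GNU and BSD xargs, not in POSIX.
    printf '%s\0' "$@" |
        xargs -0 -n 1 -P "$max_jobs" sh -c 'echo "converted: $1"' sh
}
```

Something like `run_parallel 4 *.avi` would then keep four jobs going at once, and file names with spaces are passed through intact. It has no notion of dependencies or of skipping up-to-date targets, which is what a make-workalike would add.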

Looking for that on Google I came across an article about Make alternatives on freshmeat. It looked promising at the start, but then it made the mistake of confusing make itself with higher-level build systems, and went on to speak about alternative build systems for most of its length. It still had a few names, among which I picked out cook, which at first looked interesting.

The syntax seems to be similar enough to make to be easy to grasp, and still allows much more flexibility than the standard make. So what’s the catch? Well, the catch begins with it using yet another strange SCM (as if Pidgin’s monotone wasn’t annoying enough), but it doesn’t stop there.

Out of habit I check the ebuilds when I want to try new software; it helps to know whether the original code is so messy that we have to patch the hell out of it. In addition, for build systems, a complex ebuild means that the build system architects themselves don’t have very clear ideas about what is needed. Seems like this was a good idea:

src_compile() {
        econf || die "./configure failed"
        # doesn't seem to like parallel
        emake -j1 || die
}

src_install() {
        # we'll hijack the RPM_BUILD_ROOT variable which is intended for a
        # similar purpose anyway
        make RPM_BUILD_ROOT="${D}" install || die
}

So a make replacement that claims on its homepage to be able to build in parallel… has a build system that does not build in parallel? Indeed the makefile is an absolute mess; while it’s true that you don’t need to know a tool to write a replacement for it (one that is not a drop-in replacement), it still would help. I could understand the mistake with multiple output rules, since the cook syntax diverges from make’s in that regard; I found the mistakes about temporary file creation much harder to understand; but I was really put off by the fact that it does not even know how to use the compiler’s -o switch to get it to output to a filename that is different from basename.o.

Let me say one thing: before you decide to write a tool that is better than $something, try at least to know $something well enough so that you don’t reinvent the wheel, squared.

When upstream lacks a testsuite

… you can make your own, or try at least.

I maintain in portage a little ebuild for uif2iso; as you probably already know, the foo2iso tools are used to convert various types of disk images produced by proprietary Windows software into ISO9660 images that can be used under Linux. Quite obviously, unit testing such a tool is pointless, but regression testing at tool level might actually work. Unfortunately, for obvious reasons, upstream does not ship testing data.

Not exactly happy with this, I started considering what options I had, and thus my decision: if upstream does not ship any testsuite, I’ll make one myself. The good thing with ebuilds is that you can write what you want for the test in src_test. I finally decided to build a UIF image using MagicISO on my Windows XP vbox, have the ebuild download it together with the MD5 digests of the files I’d put in it only when the test USE flag is enabled, and during the test phase convert it to ISO, extract the files, and check that the MD5 digests are correct.
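The last step of that plan, checking the extracted files against the shipped digests, could be sketched like this. It assumes GNU coreutils md5sum and a digest list in md5sum’s own format; `verify_md5` is a hypothetical helper name, not something from the actual ebuild.

```shell
# Check the files under directory $1 against the md5sum-format list $2.
# Assumes GNU coreutils md5sum; returns non-zero on any mismatch.
verify_md5() {
    # The list is opened before the cd, so a relative path still works.
    ( cd "$1" && md5sum -c --quiet - ) < "$2"
}
```

In a test phase this would be called as something like `verify_md5 extracted/ files.md5 || die "checksum mismatch"`, where both names are placeholders.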

Easier said than done.

To start with I had some trouble deciding what to put on the image; of course I could use some random data, but I thought that at that point I could at least make it fun for people to download the test data, if they wanted to look at it. My choice fell on finding some Creative Commons-licensed music and using a track from that. After some looking around, I settled on the first track of Break Rise Blowing by Countdown on Jamendo.

Now, the first track is not too big, so it’s not a significant overhead to download the test data, but there is another point here: MagicISO offers three algorithms: default, best compression and best speed. Most likely they are three compression levels in lzma or something along those lines, but just to be safe I decided to put all three of them to the test. The resulting file with the three UIF images and the MD5 checksums was less than 9MB, so an acceptable size.

At that point I started writing the small testsuite, and the problems started: uif2iso always returns 1 at exit, which means you can’t use || die or it would always die. Okay, good enough, just check that the file was created. Then you have to take the files out; nothing could be easier when you have libarchive, which can extract ISO files as if they were tarballs. Just add that as a dependency with the test USE flag enabled: a bit of overhead, but at least I can easily extract the data to test.
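The “just check that the file was created” workaround could look roughly like the following. It is a sketch, not the code from the ebuild: `convert_checked` is a made-up helper, and it works for any command whose exit status cannot be trusted.

```shell
# Run a converter whose exit status cannot be trusted and judge it by
# its output instead: succeed only if file $1 exists and is non-empty.
# Usage: convert_checked OUTPUT COMMAND [ARGS...]
convert_checked() {
    output=$1
    shift
    rm -f "$output"     # make sure a stale file cannot pass the check
    "$@" || true        # deliberately ignore the exit status
    [ -s "$output" ]
}
```

In src_test it might be used as `convert_checked test.iso uif2iso test.uif test.iso || die "no ISO produced"`; the file names, and the input-then-output argument order for uif2iso, are assumptions here.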

It seems, though, that the ISO file produced by uif2iso is going to be a test for libarchive instead, since the latest release fails to extract it. I mailed Tim and I hope he can fix it up for the next release (Tim is fantastic with this: when 2.5.902a was released, I ended up finding a crasher on a Portage-generated binpkg, I just had to mail it to him, and in the next release it was fixed!). The ISO file itself seems fine, since loop-mounting it works just fine. The problem is that I know no other tool that can extract ISO images quickly and without having to command it file by file (iso-read from libcdio can do that, it’s just too tedious); if somebody has suggestions I’m open to them.

This is the fun that comes out of writing your own test cases, I guess. On the other hand, I think it’s a good idea to keep the problematic archives around, as long as they have no licensing problems (Gentoo binpkgs might, since they are built from sources and you’d have to distribute the sources with the binaries, which is why I wanted some Creative Commons-licensed content for the images), since that lets you test stuff that broke before and make sure it never breaks again. That is probably the best part of unit, integration and system testing: you take a bug that was introduced in the past, fix it, and write a test so that, if it is ever reintroduced, it gets caught by the tests rather than by the users again.

Has anybody said FATE?