Book Review: Instant Mercurial Distributed SCM Essentials How-to

Okay, the title is a mouthful for sure, but this new book from Packt Publishing is an interesting read for those who use Mercurial only from time to time and tend to forget most of the commands and workflows, especially when they differ quite a bit from the Git ones.

While I might disagree with using some very unsafe examples (changing the owner of /etc/apache to your user to experiment on it? Really?), the book is a very quick read, and I feel that for the price Packt sells it at (don’t get distracted by the cover above, that links to Amazon) it’s worth reading and keeping on one’s shelf or preferred ebook reader.

Well, not sure if I can add more to this, I know it sounds like filler, but the book is short enough that trying to get into more details about the various recipes it proposes would probably repeat it whole. As I said, in general, if you have to work with Mercurial for whatever reason, go for it!

Distributed SCM showdown: GIT versus Mercurial

Although I admit it’s tempting, I’m not going to enter the mess of complaints (warranted or not) about GIT that have found a place on Planet GNOME. I don’t intend to go over my issues with bzr either, since I think I explained them already. I’m going to comment on a technical issue I have with Mercurial, and show why I find GIT more useful, at least in that case.

If you remember, xine moved to Mercurial almost two years ago. Mercurial was chosen at the time because it seemed much more stable (git has indeed had a few big changes since then), it was already being used for gxine, and it had better multi-platform support (running git on Solaris at the time was a problem, for instance). While I don’t think it’s time (yet) to reconsider, especially since I haven’t been active in xine development for so long that my opinion wouldn’t matter, I’d like to share some insight into the problems I have with Mercurial, or at least with the Mercurial version that Alioth is using.

Let’s not start with the fact that hg does not seem to play too well with permissions, and the fact that we have a script to fix them on Alioth to make sure that we can all push to the newly created repositories. So if you think that setting up a remote GIT repository is hard, please try doing so with Mercurial, without screwing permissions up.

As far as the command line interface is concerned, I agree that hg follows the principle of least surprise more closely, and indeed has an interface much more similar to CVS/SVN than git has. On the other hand, it takes quite a bit of wandering around to do stuff like git rebase, and it requires enabling extensions that are not enabled by default, for whatever reason.

The main problem I have with Hg, though, is the lack of named branches. I know that the newer versions should support them, but I have been unable to find documentation about them, and anyway Alioth is not updated so it does not matter yet. Without named branches, you basically have one repository per branch; while that makes it easier to deal with multiple build directories, it becomes quite space-hungry since the history data is not shared between these repositories, while it is in git (if you clone one linux-2.6 repository, then decide you need a branch from another developer, you just add that remote and fetch it, and it’ll download the minimum amount of changesets needed to fill in the history, not a whole copy of the repository).

It also makes it much more cumbersome to create a scratch branch before doing more work (even more so because you lack a single-command rebase and you need to update, transplant and strip each time), which is why sometimes Darren kicked me for pushing changes that were still work in progress.

In git, since the changesets are shared between branches, a branch is quite cheap and you can branch N times almost without feeling it; with Hg, it’s not that simple. Indeed, now that I’m working on a git mirror for the xine repositories I can show you some interesting data:

flame@midas /var/lib/git/xine/xine-lib.git $ git branch -lv
  1.2/audio-out-conversion   aafcaa5 Merge from 1.2 main branch.
  1.2/buildtime-cpudetection d2cc5a1 Complete deinterlacers port.
  1.2/macosx                 e373206 Merge from xine-lib-1.2
  1.2/newdvdnav              e58483c Update version info for libdvdnav.
* master                     19ff012 "No newline at end of file" fixes.
  xine-lib-1.2               e9a9058 Merge from 1.1.
flame@midas /var/lib/git/xine/xine-lib.git $ du -sh .
34M	.

flame@midas ~/repos/xine $ ls -ld xine-lib*   
drwxr-xr-x 12 flame flame 4096 Feb 21 12:01 xine-lib
drwxr-xr-x 13 flame flame 4096 Feb 21 12:19 xine-lib-1.2
drwxr-xr-x 13 flame flame 4096 Feb 21 13:00 xine-lib-1.2-audio-out-conversion
drwxr-xr-x 13 flame flame 4096 Feb 21 13:11 xine-lib-1.2-buildtime-cpudetection
drwxr-xr-x 13 flame flame 4096 Feb 21 13:12 xine-lib-1.2-macosx
drwxr-xr-x 12 flame flame 4096 Feb 21 13:28 xine-lib-1.2-mpz
drwxr-xr-x 13 flame flame 4096 Feb 21 13:30 xine-lib-1.2-newdvdnav
drwxr-xr-x 13 flame flame 4096 Feb 21 13:50 xine-lib-1.2-plugins-changes
drwxr-xr-x 12 flame flame 4096 Feb 21 12:53 xine-lib-gapless
drwxr-xr-x 12 flame flame 4096 Feb 21 13:56 xine-lib-mpeg2new
flame@midas ~/repos/xine $ du -csh xine-lib* | grep total
805M	total
flame@midas ~/repos/xine $ du -csh xine-lib xine-lib-1.2 xine-lib-1.2-audio-out-conversion xine-lib-1.2-buildtime-cpudetection xine-lib-1.2-macosx xine-lib-1.2-newdvdnav  | grep total
509M	total

As you might guess, the contents of ~/repos/xine are the Mercurial repositories. You can see the size difference between the two SCMs. Sincerely, even though I have tons of space, on the server I’d rather keep git than Mercurial.

If some Mercurial wizard knows how to work around this issue I have with Mercurial, I might consider it again; otherwise, for the future it’ll always be git for me.

Integrating FFmpeg in xine-lib

Besides the ABI breakage needed to reduce the structures’ size (which, as I blogged yesterday, helps quite a bit), another of the branches I’m currently working on is the FFmpeg integration.

Now, xine-lib has used FFmpeg forever, and it carries an internal copy of it (although on Gentoo you use the external copy to reduce the size of the code and avoid having the same security problems twice), but that copy requires rewriting the build system from FFmpeg’s own makefiles to automake files, which is far from trivial and quite error-prone.

It also requires cloning the tests for the features that are tested in FFmpeg’s own configure script, which is a waste of time when the external FFmpeg is used most of the time. Those tests need to be maintained over the long run, as FFmpeg developers might modify them, and we can’t just copy them 1:1 as they don’t use autoconf either.

To solve this issue, I had been working on a sort of Chinese Wall, so that FFmpeg’s build system could be called from xine’s automake build system without the need to adapt the Makefiles every time an upgrade is needed.

It wasn’t that difficult to begin with, but there was an obstacle with the make dist command that is used to create the distribution tarball, as there was no way to seamlessly merge its processing with FFmpeg’s build system. Today I implemented a dist-hook that takes the list of files needed for FFmpeg support and adds them to the tarball. It requires an extra step during the update of the internal copy, but that’s far from being a problem, as it’s certainly more straightforward than updating the previous FFmpeg copy was.

With this problem fixed, I was finally able to start working on getting distcheck working and thus trying to get a working branch out of it. The result is quite good, as I was able to get a distcheck running already, and I’m now working on actually implementing Miguel’s idea to disable the “uncommon” audio/video decoders present in FFmpeg. His work on the current 1.1 series is basically half-implemented, and probably never to be fixed, as it would require changing all the automake files so that they take into consideration the huge amount of conditionals currently present in FFmpeg. I think once it’s completed I’ll just ask Miguel to give up on having it working in the 1.1 series, as it’s unlikely to ever work decently… well, it might even work, but refreshing FFmpeg would require a huge amount of work, and we are understaffed already.

And by working on this branch, and on this idea, I already discovered two bugs in FFmpeg that I’ve locally patched. The patches are now waiting (as usual) on ffmpeg-devel to be applied.

Hopefully, this will also be integrated in the 1.2 series, which should then start being interesting…

The first improvement in xine with Mercurial

So, after xine finally moved to Mercurial for xine-lib management, I’ve decided to start working on those things that required me to branch out, at least on some of them that is; the first one I was able to tackle was ffmpeg_integration, which now works fine apart from the dist target.

And then I moved on to working on the structures, applying pahole to all the structures in libxine, even those comprising the public ABI of the library, as I could just break the ABI when needed, rather than limiting myself to the local structures of the plugins. Some of these changes applied to structures that are not part of the public ABI, so I ended up merging them to the main repository already, and they will be present in 1.1.5, even if they are mostly byte-sized changes that nobody besides me should care about.

But then tonight I went looking for the 32 buttons that are/were present as an inline array in one of the video overlay structures; I was going to replace the inline array with a dynamic array or with an array of pointers, so that the memory would be taken up only when actually needed.

It was a sour surprise to find out that the array was never used at all in the code, and it wasn’t used in the frontends either. By removing it, the size of the structure that contained it dropped from 86KB to 40 bytes, and then the video_overlay_s structure dropped from over 4MB to about 3KB. Finally, with it removed, there was also a 10MB cut in memory usage during xine-lib runtime playback, 1/3 of the memory usage when playing an mp3 file:

I’m sorry, this used to have images of massif graphs for xine before and after the change, but unfortunately they got lost.

For starters, this does seem quite good, don’t you think? :)
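
To make the kind of change I had in mind a bit more concrete, here is a minimal sketch in C of an inline array versus an array allocated on first use. All the names (button_t, overlay_inline_t, overlay_lazy_t, overlay_get_buttons, overlay_dispose) and the sizes are made up for illustration; they are not the actual xine-lib structures, where in the end the array was simply removed because nothing used it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BUTTONS 32

/* Hypothetical button description; the real xine-lib type differs. */
typedef struct {
  int x, y, width, height;
  unsigned char palette[256];
} button_t;

/* Inline array: every overlay pays for 32 buttons, used or not. */
typedef struct {
  int id;
  button_t buttons[MAX_BUTTONS];
} overlay_inline_t;

/* Pointer: the array only exists once somebody asks for it. */
typedef struct {
  int id;
  button_t *buttons; /* NULL until first use */
} overlay_lazy_t;

/* Return the button array, allocating it zero-filled on first access.
 * Returns NULL if the allocation fails. */
button_t *overlay_get_buttons(overlay_lazy_t *ovl) {
  assert(ovl != NULL);
  if (ovl->buttons == NULL)
    ovl->buttons = calloc(MAX_BUTTONS, sizeof(button_t));
  return ovl->buttons;
}

/* Release the array again, if it was ever allocated. */
void overlay_dispose(overlay_lazy_t *ovl) {
  assert(ovl != NULL);
  free(ovl->buttons);
  ovl->buttons = NULL;
}
```

With a layout like this, an overlay that never shows a button costs one pointer instead of 32 full entries; the price is an extra NULL check and a possible allocation failure on first access.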

Conversion to Mercurial done

So, as of yesterday, thanks to Darren Salt and Reinhard Tartler, xine finally moved out of the CVS dark age, and came into Mercurial’s territory.

As you can see on Debian’s Mercurial repositories page we have quite a few xine repositories: together with xine-lib, Darren also moved his gxine repositories so they are all together.

I’ve already started doing some little work, which yesterday exposed a little bug that was ignoring my commits; fortunately Micah corrected the problem right away. Thanks Micah! :)

One of the tasks I was able to take care of today was the update of the FFmpeg integration branch, which is now up to date with the main xine-lib-1.1 branch. Unfortunately Binutils seems to have broken FFmpeg, so I’m unable to finish building and thus to check whether it actually works as intended.

Anyway, I’ll post more updates about it soon.

Pahole and xine-lib

So, I’ve taken two days totally off from almost any kind of communication, I tried to relax, and now I feel a bit better. I still do think I haven’t been able to do much good beside my work on Gentoo, but I’m not ready yet to give up, even if it’s hard to continue, I will continue. At least for a while.

Unfortunately my current UPS setup is not going to fly, X-Drum in #gentoo-it informed me last night of the incompatibility between PSUs with active PFC and UPSes with stepped sinusoidal wave.. and the new PSU I bought two months ago has active PFC. The result is that I need a new UPS, of the Smart UPS series from APC, which will cost me €420, sigh.

Talking about the topic in the subject, two days ago I analysed most of the xine-lib plugins with pahole (with a patch to fix an integer overflow in the offset counter: the author wasn’t expecting structures bigger than 64KiB, but in xine-lib these are not rare), and I do have at least one piece of good news: FFmpeg decoding of non-MPEG video streams was taking 1MB of memory for a libmpeg2 buffer that was never going to be used. I’ve now fixed this so that the structure is only allocated and initialised when needed, so decoding will take 1MB less memory than before in the next xine-lib release.
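
The overflow is easy to reproduce in a few lines of C. This is only a sketch under the assumption that the offset counter was 16 bits wide (I only said above that structures bigger than 64KiB were not expected); big_t, offset_narrow and offset_wide are made-up names, not pahole code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure larger than 64KiB, the kind xine-lib has. */
typedef struct {
  char buffer[70000];
  int  tail;
} big_t;

/* Assumed buggy variant: the offset is kept in 16 bits, so anything
 * past 65535 wraps around (the conversion is modulo 65536). */
uint16_t offset_narrow(size_t offset) {
  return (uint16_t)offset;
}

/* Fixed variant: keep the full width of size_t. */
size_t offset_wide(size_t offset) {
  return offset;
}
```

Passing offsetof(big_t, tail) to offset_narrow gives back a value well below the real offset, which is how a report on a large structure ends up showing members at the wrong place.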

Unfortunately, I’ve found similar design mistakes in other structures, most of which are public and so part of the libxine ABI; thus I can’t fix them in the 1.1 series, not unless there are good reasons and good results to achieve by breaking it. But next week we’re going to move to Mercurial, thanks to Darren Salt and Reinhard Tartler who are helping with the migration (for those wondering about hosting, it will likely be on Alioth, if they accept us), so I can branch out and fix the stuff, and then merge back into either 1.1 or 1.2 as we see fit.

One of the structures I surely will be refactoring is the video overlay structure, which has a size of over 4MiB as it is; this explains why the function that initialises the video overlay consumes 5MiB of memory right after xine initialisation, even when playing a sound file. By instantiating the structures only when really needed, and making sure that there are no holes around, it should be possible to drastically reduce the memory used up by xine.
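
For those who have never looked at pahole output, the “holes” are the padding bytes the compiler inserts to keep members aligned. A small sketch in C, with two invented structures (holey_t and compact_t) and a trivial helper (padding_bytes) that are not taken from xine-lib:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout with holes: each char is followed by padding so
 * that the next, more strictly aligned member starts on its boundary. */
typedef struct {
  char   flag_a;
  double timestamp;
  char   flag_b;
  int    count;
} holey_t;

/* Same members sorted by decreasing alignment: the padding shrinks. */
typedef struct {
  double timestamp;
  int    count;
  char   flag_a;
  char   flag_b;
} compact_t;

/* Padding in a structure of `total` bytes whose members add up to
 * `payload` bytes. */
size_t padding_bytes(size_t total, size_t payload) {
  assert(total >= payload);
  return total - payload;
}
```

The exact numbers depend on the ABI; on x86-64 Linux the first layout takes 24 bytes and the second 16, for the very same 14 bytes of data. Reordering members of a public structure is of course an ABI break, which is why this has to wait for a branch.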

Another thing in my TODO list is, as I said already, rewriting the plugins cache code. I will also try to provide a simple way to regenerate this cache globally, so that for instance it can be installed directly by the ebuild, without asking users to regenerate the cache by themselves, and so that it can be shared in memory (through mmap) between users too.

To help with this, I’ll also look into changing the way the plugins are handled, using inline arrays where possible for the names and descriptions of the plugins, rather than pointers, so that the structures can be shared in memory, where this does not waste too many bytes.
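
The reason inline arrays help here is that a record holding its strings inline is just a block of bytes with no addresses in it, so it stays valid when copied into a file or mapped into another process. A rough sketch in C; the type names, the field sizes and plugin_info_set are assumptions for illustration, not the actual plugin structures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointer-based record: the strings live elsewhere in the process, so
 * the bytes of this struct mean nothing in a file or another process. */
typedef struct {
  const char *name;
  const char *description;
} plugin_ptr_info_t;

/* Inline record: self-contained, so an array of these could be written
 * to a cache file and mapped read-only by any process. */
typedef struct {
  char name[32];
  char description[96];
} plugin_inline_info_t;

/* Fill an inline record, truncating strings that do not fit.
 * The memset plus the size-1 bound keep both fields NUL-terminated. */
void plugin_info_set(plugin_inline_info_t *info,
                     const char *name, const char *description) {
  assert(info != NULL && name != NULL && description != NULL);
  memset(info, 0, sizeof(*info));
  strncpy(info->name, name, sizeof(info->name) - 1);
  strncpy(info->description, description, sizeof(info->description) - 1);
}
```

The cost is the one mentioned above: a plugin with a short name still takes the full fixed-size field, so the sizes have to be picked so that not too many bytes get wasted.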

Anyway, I still need to relax a bit more because I can’t really rest lately, and I do need some rest if I am to carry on.

The distributed nightmare has ended!

Okay, tonight just a quick post. First of all, totally unrelated to the title (okay, still distributed stuff, but not a nightmare), GIT was discarded as an option for xine-lib’s future version control system; we instead decided to go with Mercurial because of its availability on a wider range of platforms (GIT is not officially supported on Solaris at least, and Cogito does not work 100% on FreeBSD). I also cleared up my previous doubt about it: both the speed and the features of the two programs seem to be at the same level, so portability has been the deciding requirement. Darren converted the CVS already, creating a bunch of repositories (due to branching), and the conversion also got support for translating user IDs to author names (which means you’ll see the commit coming from Diego ‘Flameeyes’ Pettenò rather than from an anonymous dgp85).

As for my distributed nightmare, instead, I want to thank Mike (vapier) a TON: his latest version of nfs-utils (1.0.12-r2) finally fixes my problem with FreeBSD clients, so I upgraded Farragut (breaking ServoFlame in the meantime, I’m afraid) and now I’m putting Prakesh back into shape so I can start playing with Tomcat again.

So I don’t have to go crazy with Coda, nor do I have to buy a new box right now; I’ll eventually do it, as I start to feel this box is slow, but for now it’s fine as it is without more money going away. This is good because I can then buy the new Yu-Gi-Oh game as soon as it’s released ;)

Anyway, tomorrow it’s working day again, so I should be sleeping, the DST switch threw me off, and I was awake all last night with the jigsaw my sister gave me… 121×80, 3000 pieces.

Is it ALSA or HelLSA?

So, new kernel on the tree, gentoo-sources-2.6.19 is available for download, yes. And alsa-driver does not build, as 1.0.13 version is incompatible with the new kernel.

And while I already asked jcdutton about this, and he said yes, we’d get a new rc for ALSA when 2.6.19 is released, there’s no version currently available to use for our tree.

For now users are forced to use -9999 ebuilds, which are live and thus unsupported, or use the code in the kernel, which seems not to be considered stable enough for a release candidate…

I’m currently working against time to have a 1.0.14_pre20061130 to give you in the meantime; at least it will be a snapshot and not a live version, but I’m afraid it will be hell to maintain once again.

Oh and of course hg (mercurial) segfaulted when I tried to update the sources to create the snapshot, on Enterprise. I did the clones with Farragut, but I didn’t like having it segfaulting.. really if you’re going to write an SCM in a scripting language for portability, just write it entirely that way, don’t mix it with C code and break stuff around in all the ways available. This reminds me why I use GIT.

And yes, tonight I was planning plenty of things to do, and all my plans are f*cked up by this ALSA thing… just because I’m again the only one maintaining it, just like PAM. I start to hate this loneliness..

Why documentation is important

So, although I said I wanted to take a break, there was one bug I wanted to smash as soon as I could: bug #131456. This bug is annoying for users, and I can understand why; upstream fixed it, so one might think it wouldn’t be difficult to backport. Yeah, sure.

The upstream bug report is #0002067, and it only references the fix as “Fixed on Hg repo.”, which does not really say much to me. I know that Hg is Mercurial, but that’s just a clue. A quick glance at the ALSA website doesn’t show me much about that repository.

There’s a Development Wiki but no reference, the only reference I found, in the place I wasn’t thinking of, is in the Download page. Sigh. Now it’s time to check how to get the right patch out of hg, hopefully it won’t take that much time, but the checkout itself takes a bit.

I think I’ll have the ALSA Maintainer Guide describe the Mercurial repository later today, and there goes my whole idea of a break.

Okay, okay, I can simply put that on my TODO list and do it all tomorrow, but well… I will probably forget at this point :(