Bad branching, or a Quagga call for help

You might remember that for a while I worked on getting quagga in shape in Gentoo. The reason why I was doing that is that I needed quagga for the ADSL PCI modem I was using at home to work. Since right now I’m on the other side of the world, and my router decided to die, I’m probably going to stop maintaining Quagga altogether.

There are other reasons to be, which is probably why for a while we had a Quagga ebuild with invalid copyright headers (it was a contribution of somebody working somewhere, but over time it has been rewritten to the point it didn’t really made sense not to use our standard copyright header). From one side it’s the bad state of the documentation, which makes it very difficult to understand how to set up even the most obvious of the situations, but the main issue is the problem with the way the Quagga project is branching around.

So let’s take a step back and see one thing about Quagga: when I picked it up, there were two or three external patches configured by USE flags; these are usually very old and they are not included in the main Quagga sources. It’s not minimal patches either but they introduce major new functionality, and they are very intrusive (which is why they are not simply always included). This is probably due to the fact that Quagga is designed to be the routing daemon for Linux, with a number of possible protocol frontends connecting to the same backend (zebra). Over time instead of self-contained, easily out-of-date patches to implement new protocols, we started having whole new repositories (or at least branches) with said functionalities, thanks to the move to GIT, which makes it too easy to fork even if that’s not always a bad thing.

So now you get all these repositories with extra implementations, not all of which are compatible with one another, and most of which are not supported by upstream. Is that enough trouble? Not really. As I said before, Paul Jakma who’s the main developer of the project is of the idea that he doesn’t need a “stable” release, so he only makes release when he cares, and maintained that it’s the vendors’ task to maintain backports. On that spirit, some people started the Release Engineering for Quagga, but ….

When you think about a “Release Engineering” branch, you think of something akin to Greg’s stable kernel releases, so you get the latest version, and then you patch over it to make sure that it works fine, backporting the new features and fixes that hit master. Instead what happens here is that Quagga-RE forked off version 0.99.17 (we’re now to 0.99.21 on main, although Gentoo is still on .20 since I really can’t be bothered), and they are applying patches over that.

Okay so that’s still something, you get the backports from master on a known good revision is a good idea, isn’t it? Yes it would be a good idea if it wasn’t that … it’s actually new features applied over the old version! If you check you see that they have implemented a number of features in the RE branch which are not in master… to the result that you have a master that is neither a super-set nor a sub-set of the RE branch.

Add to this that some of the contributors of new code seems to have not clear what a license is and they cause discussion on the mailing list on the interpretation of the code’s license, and you can probably see why I don’t care about keeping this running, given I’m not using it in production anywhere.

Beforehand I was still caring about this knowing that Alin was using it, and he was co-maintaining it … but now that Alin has been retired, I’d be the sole maintainer of a piece of software that rarely works correctly, and is schizophrenic in its development, so I really don’t have extra time to spend on this.

So to finish this post with a very clear message: if you use Gentoo and rely on Quagga working for production usage, please step up now or it might just break without notice, as nobody’s caring for it! And if a Quagga maintainer reads this, please, please start making sense on your releases, I beg you.

BerliOS, and picking up “dead” projects

So, Tomáš also posted on the gentoo-dev mailing list about the BerliOS shut down that I have noted after forking unpaper which is/was hosted on that platform as well.

And yes, I do note the irony that I’m the one talking about forks, after what I wrote on the subject — but there are times when a fork is indeed necessary, at least to continue development on a project. And I should probably consider unpaper more a take over than a fork, given that the original developer seem to be unreachable (he hasn’t answered my mail yet).

Of course nobody expected unpaper to be the only project that was hosted on BerliOS, nor the only one dead. Indeed, back in the days when SF.net interface was obnoxious but still usable, BerliOS was considered a quite decent alternative, at least for the fact that there was Subversion support quite a bit of time before SourceForge supported anything other than CVS. Even I started not one, but two projects using BerliOS. One is the same unieject that I have now mostly abandoned and is available on Gitorious; the other was an Ultima OnLine server emulator, which was, really, my first try at coordinating a Free Software project.

Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015 and turned down the service. My projects previously hosted there are now hosted on GitHub.

Said project, was started by me and a handful of friends, some of whom were players on the same unofficial “shard” as me, while another was a fellow developer in another similar software (NoX-Wizard), was basically a from scratch implementation, in what at the time I considered modern C++ (it might even have been, considering that we just came out of the GCC 2.96 trouble). It was also my first encounter with Python used as a scripting environment within another software. The code was originally developed by me in CVS; then it was moved to SVN on a local repository, then again on BerliOS.. with the result that my commits actually showed up under a long series of names, d’oh!

Well, a couple of weeks ago I decided to import the code to GitHub — and with a bit of help from git svn I was able to also merge back my commits under a single name (and those of another developer as well, not under mine of course). It’s impressive how straightforward is to import a whole repository’s history nowadays. I remember going crazy to do the same thing at the time, when moving from CVS to SVN, and to import the local SVN to BerliOS.

This should actually be considered a stating point: indeed the fact that it’s relatively trivial to import a repository’s history nowadays should make it much easier to preserve the most important repositories of BerliOS itself — I just wonder if there’s hope to save all the content in BerliOS. This becomes quite interesting when you note that it comes not a long time after the Kernel.org KO, which has seen a number of projects migrate here and there, including Linux-PAM which seem to be maintained now on Fedora’s hardware (and in GIT, finally! — this means that the next time I’ll have to patch Linux-PAM, which I hope will be far into the future, I’ll be able to provide proper backports in Gentoo infrastructure, like I do for other packages including quagga).

Changing times?

The future of Unpaper

You might have read it already, or maybe you haven’t, but it looks like Berlios is going to close down at the end of the year. I wouldn’t go as far as calling it the end of an era, but we’re pretty close. Even I have a few projects that are (still) hosted on Berlios and I need to migrate.

Interestingly, Berlios is also the original hosting for unpaper which makes me forking it a couple of months ago a very good move. Especially since if I waited too long, the repository wouldn’t have been available.. even if it’s true that the repository didn’t really contain anything useful, as it was just an import of the 0.3 sources.

At any rate, since Jens didn’t reply to my inquiries, I’ve decided to start working on a more proper takeover of the project. I have created an unpaper project page on my website, to which I switched the live ebuild’s HOMEPAGE and GitHub’s website, as well as created an ohloh project to track the development.

Oh, and I released version 0.4. Yeah I guess this was the first thing to write about, but I wanted to make it less obvious.

The new release is basically simply the first cleanup I worked on, so new build system, no changes in parameters, man page and so on. Right after releasing 0.4 I merged the changes from the new contributor, Felix Janda, who not only took the time to break up the code in multiple source files, but also improved the blur filter.

Now, the next release is likely going to be 2.0; why skipping the 1.0 release? Well, it’s not really skipping. Before Berlios shuts down I wanted to copy down the previous list of downloads so I can mirror those as well, and what I found is that… the original version number series started with 1.0, so it’s mucked up badly; in Gentoo we have no problem since 0.3 was the only one we had, but for the sake of restoring consistence, the next version is going to be 2.0.

What is going to happen with that release? Well, for sure I want to rewrite the command-line options parsing. Right now it’s very custom, as long options are sometimes prefixed with one, sometimes with two dashes, and in general it doesn’t fit the usual unix command lines; not counting the fact that the parsing is done as you go with a series of strcmp() calls, which is not what one expects, usually. I intend to rewite this with getopt_long() but the problem with that is that it will break the command-line compatibility with unpaper 0.3, which is not something I’m happy about. But we’ve got to do that sooner or later if we want a more well-blent tool.

I hope to also be able to hook into the code a different way to load and save images, using some already present image decoding and encoding library, so that it can digest images in different formats than simple PNM. In particular, I’d like to be able to execute as a single pass the conversion from multiple files to a multi-page TIFF document, which requires quite a bit of work indeed. But one can dream, can’t I?

In the mean time, I hope to find some time this week to find a way to generate man pages on my server so that I can publish more complete documentation for both Ruby-Elf and unpaper itself. This is likely going to be difficult, since I’m starting some new tasks next week but.. you never know.

Forking unpaper, call for action

You might or might not know the tool by the name of unpaper that has been in Gentoo’s main tree for a while. If you don’t know it and you scan a lot, please go look at it now, it is sweet.

But sweet or not, the tool itself had quite a few shortcomings; one of these was recently brought to my attention as unsafe use of sprintf() that was fixed by upstream after 0.3 release, but which never arrived to a release.

When looking at fixing that one issue, I ended up deciding for a slightly more drastic approach: I forked the code, imported it to GitHub and started hacking at it. This both because the package lacked a build system, and because the tarball provided didn’t correspond with the sources on CVS (nor with those on SVN for what it’s worth).

For those who wonder why I got involved in this while this is obviously outside my usual interest area, I’m using unpaper almost daily on my paperless quest that is actually paying its fruits (my accountant is pleasantly surprised by how little time it takes to me to find the paperwork he needs). And if I can shave even fractions of seconds from a single unpaper process it can improve my workflow considerably.

What I have now in my repository is an almost identical version that has passed through some improvements: the build system is autotools (properly written), that works quite fine even for a single-source package, as it can find a couple of features that would otherwise be ignored. The code does not have the allocation waste that it did before, as I’ve removed a number of pointers to characters with preprocessor macros, and I started looking at a few strange things in the code.

For instance, it now no longer opens the file, seek to the end, then rewind to the start to find the file’s size, which was especially unhelpful since the variable where the file’s size was saved was never read from but the stdio calls have side effects, so the compiler couldn’t drop them by itself.

And when it is present, it will use sincosf() rather than calling sin() and cos() separately.

I also stopped the code from copying a string from a prebuilt table, and parse it at runtime to get the represented float value.. multiple times. This was mostly tied with the page size parsing, which I have basically rewritten, also avoiding looping twice over the two sizes with two separate loops. Duh!

I also originally overlooked the fact that the repository had some pre-defined self-tests that were never packaged and thus couldn’t be used for custom builds before; this is also fixed now, and make check runs the tests just fine. Unfortunately what this does not do is comparing the output with some known-good output, I need an image compare tool to do so; for now it only ensures that unpaper behaves as expected with the commandline it is provided, better than nothing.

At any rate this is obviously only the beginning: there are bugs open on the berlios project page that I should probably look into fixing, and I have already started writing a list of TODO tasks that should be taken care of at some point or another. If you’re interested in helping out, please clone the repository and see what you can do. Testing is also very much appreciated.

I haven’t decided when to do a release, for now I’m hoping that Jens will join the fork and publish the releases on berlios based on the autotools build system. There’s a live ebuild in main tree for testing (app-text/unpaper-9999), so I’d be grateful if you could try it on a few different systems. Please enable FEATURES=test for it so that if something breaks we’ll know son enough. If you’re a maintainer packaging unpaper on other distributions, feel free to get in touch with me and tell me if you’ve other patches to provide (I should mail the Debian maintainer at this point I guess).