We need Free Software Co-operatives, but we probably won’t get any

The recent GitHub craze got a number of Free Software fundamentalists to hurry away from GitHub towards other hosting solutions.

Whether it was GitLab (a fairly natural choice given the nature of the two services), Bitbucket, or SourceForge (which is trying to rebuild its reputation as a Free Software friendly hosting company), there is no shortage of SaaS providers to choose from.

At the same time, a number of projects have been boasting (maybe a bit too smugly, in my opinion) that they self-host their own GitLab or similar software, and suggesting that other projects do the same to be “really free”.

A lot of the discourse appears to miss the nuance of the compromises involved in using SaaS hosting providers, self-hosting for communities, and self-hosting for single projects, so I thought I would gather my thoughts around this in a single post.

First of all, you probably remember my thoughts on self-hosting in general. Any solution that involves self-hosting will require a significant amount of ongoing work. You need to make sure your services keep working, and keep them safe and secure. Particularly for FLOSS source code hosting, it’s of primary importance that the integrity and safety of the source code is maintained.

As I already said in the previous post, this style of hosting works well for projects that have a community, in which one or more dedicated people can look after the services. And in particular for bigger communities, such as KDE, GNOME, FreeDesktop, and so on, this is a very effective way to keep stewardship of code and community.

But for one-person projects, such as unpaper or glucometerutils, self-hosting would be a poor choice. Even for xine, with a single person maintaining just the site and Bugzilla, it got fairly bad. I’m trying to convince the remaining active maintainers to migrate these to VideoLAN, which is now probably the biggest Free Software multimedia project and community.

This is not a new problem. Indeed, before people rushed to GitHub (or Gitorious), they rushed to other services that provided similar integrated environments. When I became a FLOSS developer, the biggest of them was SourceForge — which, as I noted earlier, was recently bought by a company trying to rebuild its reputation after a significant loss of trust. These environments don’t only include SCM services, but also issue (bug) trackers, contact email, and so on and so forth.

Using one of these services is always a compromise: not only do they require an account on each service to be able to interact with them, but they also have a level of lock-in, simply because of the nature of URLs. Indeed, as I wrote last year, going through my old blog posts to identify those referencing dead links reminded me of just how many project hosting services shut down, sometimes dragging along (Berlios) and sometimes abruptly (RubyForge).

This is a problem that does not only involve services provided by for-profit companies. Sunsite, RubyForge and Berlios didn’t really have companies behind them, and the last one is probably one of the closest things to a Free Software co-operative that I’ve seen outside of the FSF and friends.

There is of course Savannah, the FSF’s own Forge-lookalike system. Unfortunately, for one reason or another, it has always lagged behind the feature set (particularly around security) of other project management SaaS offerings. My personal guess is that this is due to the political nature of hosting any project on the FSF’s infrastructure, even outside of the GNU project.

So what we need would be a politically-neutral, project-agnostic hosting platform that is a co-operative effort. Unfortunately, I don’t see that happening any time soon. The main problem is that project hosting is expensive, whether you use dedicated servers or cloud providers. And it takes full-time people working as system administrators to keep it running smoothly and securely. You need professionals, too — or you may end up like lkml.org, down when its one maintainer goes on vacation and something happens.

While there are projects that receive enough donations to sustain these costs (see KDE, GNOME, VideoLAN), I’m skeptical that an unfocused co-operative would be able to take care of this. Particularly if it does not restrict the creation of new projects and repositories, as that requires particular attention to abuse, and good guidelines on which content is welcome and which isn’t.

If you think that’s an easy task, consider that even SourceForge, with a review process that used to take a significant amount of time, managed to let joke projects use their service and trade on its credentials.

A few years ago, I would have said that the SFLC, SFC and SPI would be the right actors to set up something like this. Nowadays? Given their infighting, I don’t expect them to be of any use.

The importance of teams, and teamwork

Today, on Twitter, I received a reply with a phrase that, taken on its own and without connecting back to the original topic of the thread, I found emblematic of the dread I feel about working as a developer, particularly in many open source communities nowadays.

Most things don’t work the way I think they work. That’s why I’m a programmer, so I can make them work the way I think they should work.

I’m not going to link back to the tweet, or name the author of the phrase. This is not about them in particular, and more about the feeling expressed in the phrase, which I would have agreed with many years ago, but which now feels so off-key.

What I feel now is that programmers don’t make things work the way they think they should. And this is not intended as a nod to the various jokes about how bad programming actually is, given APIs and constraints. This is about something that becomes clear when you spend your time trying to change the world, or to make a living on your own (by running your own company): everybody needs help, in the form of a team.

A lone programmer may be able to write a whole operating system (cough Emacs), but that does not make it a success in and of itself. If you plan on changing the world, and possibly changing it for the better, you need a team that includes not only programmers, but experts in quite a lot of different things.

Whether it is a Free Software project or a commercial product, if you want to have users, you need to know what they want — and a programmer is not always the most suitable person to go through user stories. Hands up, all of us who have, at one point or another, facepalmed at an acquaintance taking a screenshot of a web page to paste it into Word, and tried to teach them how to print the page to PDF instead. While changing workflows so that they make sense may sound like the easiest solution to most tech people, that’s not what people who are just trying to do their job care about. Particularly not if you’re trying to sell them (literally or figuratively) a new product.

And similarly to what users want to do, you need to know what the users need to do. While effectively all Free Software comes with no warranty attached, even for it (and most definitely for commercial products), it’s important to consider the legal framework within which the software will be used. Except for the more anarchist developers out there, I don’t think anyone would feel particularly interested in breaching laws for the sake of breaching them, for instance by providing a ledger product that allows “black book accounting” as an encrypted parallel file. Or, to reprise my recent example, by providing a software solution that does not comply with the GDPR.

This is not just about pure software products. You may remember, from last year, the teardown of Juicero. In that case the problems appeared to stem from the lack of control over the bill of materials (BOM). While electronics is far from my speciality, I have heard more expert friends and colleagues cringe at the specs of projects that tried to become mainstream with a BOM easily twice as expensive as the minimum.

An aside here, before someone starts shouting about that: minimising the BOM for an electronics project may not always be the main target. If it’s a DIY project, making it easier to assemble could be an objective, so choosing bulkier, more expensive parts might be warranted. Similarly, if it’s being done for prototyping, using more expensive but widely available components is generally a win too. I have worked on devices that used multi-GB SSDs for a firmware of less than 64MB — but asking for on-board flash for the firmware would have cost more than the extremely overprovisioned SSDs.

And in my opinion, if you want to have your own company, and are in it for the long run (i.e. not with the startup mentality of taking VC capital and getting acquired before even shipping), you definitely need someone to follow up on the business plan and the accounting.

So no, I don’t think that any one programmer, or a group of programmers alone, can change the world. There’s a lot more to building software than writing code. And a lot more to changing society than building software.

Consider this the reason why I will plonk-file any recruitment email that is looking for “rockstars” or “ninjas”. Not that I’m looking for a new gig as I type this, but I would at least give it some thought if someone were looking for a software mechanic (h/t @sysadmin1138).

Project Memory

Through a series of events whose start I’m not entirely sure of, I ended up fixing some links on my old blog posts — links that had broken, most of which I had to find again on the Wayback Machine. While doing that, I found some content from my very old blog: one that was hosted on Blogspot, written in Italian only, and frankly written by an annoying brat who needed to learn something about life. Which I did, of course. The story of that blog is for a different time and post; for now I’ll focus on a different topic.

When I started looking at this, I ended up going through a lot of my blog posts and updated a number of broken links, either by pointing them at the Wayback Machine or by removing them. I focused on those links that can easily be grepped for, which turns out to be a very welcome side effect of having migrated to Hugo: all the content sits as plain files on disk.
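As a minimal sketch of what that looks like (the content path and the list of dead hosts are hypothetical; adjust them for your own site), a few lines of Python are enough to flag the candidates:

```python
#!/usr/bin/env python3
"""Scan Hugo content files for links pointing at hosting services that are gone."""

import pathlib
import re

# Hypothetical list of domains belonging to shut-down services.
DEAD_HOSTS = ("berlios.de", "rubyforge.org", "gitorious.org", "code.google.com")

LINK_RE = re.compile(r"https?://[^\s)\"'>]+")

for path in pathlib.Path("content").rglob("*.md"):
    for lineno, line in enumerate(path.read_text(encoding="utf-8").splitlines(), 1):
        for url in LINK_RE.findall(line):
            if any(host in url for host in DEAD_HOSTS):
                print(f"{path}:{lineno}: {url}")
```

Each hit can then be reviewed by hand, swapped for a Wayback Machine snapshot, or dropped.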

This meant, among other things, removing references to identi.ca (which came to mind because of all the hype I hear about Mastodon nowadays), removing links to my old “Hire me!” page, and so on. And that’s where things started showing a pattern.

I ended up updating or removing links to Berlios, RubyForge, Gitorious, Google Code, Gemcutter, and so on.

For many of these, it turned out I don’t even have a local copy (at hand, at least) of some of my smaller projects (mostly the early Ruby stuff I did). But I almost certainly have some of that data in my backups, some of which I actually have in Dublin and want to start digging into at some point soon. Again, this is a story for a different time.

The importance of the links to those project management websites is that, for many projects, those pages were all you had about them. And for some of those projects, all the important information was captured by those services.

Back when I started contributing to free software projects, SourceForge was the badge of honor of being an actual project: it would give you space to host a website, as well as source code repositories. And this was the era before Git, before Mercurial and the other DVCSes, which meant you either had SourceForge, or you likely had no source control at all. But SourceForge admins also reviewed (or at least claimed to review) every project that was created, so creating a project on the platform was not straightforward; you would only do that if you really had the time to invest in the project.

A few projects were big enough to have their own servers, and a few projects were hosted on other “random” project management sites, which appeared to sprout up because the Forge software used by SourceForge was (at least for a while) released as free software itself. Some of those websites were specific in nature, others more general. Over time, BerliOS appeared to become the anti-SourceForge, with a streamlined application process and, most importantly, Subversion support years before SourceForge would gain it.

Things got a bit more interesting later, when Bazaar, Mercurial, Git and so on started appearing on the horizon, because at that point proper source control could be achieved without needing special servers (at least without publishing it, although there were ways around that). This made some project management websites redundant, and others more appealing.

But let’s take a look at the list of project management websites I have used that are now completely or partly gone, with or without their history:

  • The aforementioned BerliOS, which teetered back and forth a couple of times. I had a couple of projects over there, which I ended up importing to GitHub, and I also forked unpaper. The service and the hosting were taken down in 2014, but (all?) the projects hosted on the platform were mirrored on SourceForge. As far as I can tell they were mirrored read-only, so for instance I can’t de-duplicate the unieject projects, since I originally wrote it on SourceForge and then migrated it to BerliOS.

  • The Danish SunSITE, which hosted a number of open-source projects for reasons that I’m not entirely clear on. NoX-Wizard, an open-source Ultima OnLine server emulator, was hosted there, for reasons that are even murkier to me. The site got renamed to dotsrc.org, but they dropped all the hosting services in 2009. I can’t seem to find an archive of their data; NoX-Wizard was migrated to SourceForge during my time, so that’s okay by me.

  • RubyForge used the same Forge software as SourceForge, and was focused on Ruby module development. It was abruptly terminated in 2014, and as it turns out I made the mistake of not importing my few modules explicitly. I should have them in my backups if I start looking for them; I just haven’t done so yet.

  • Gitorious set itself up as an open, free software alternative to GitHub. Unfortunately it clearly was not profitable enough, and it got acquired, twice. The second time by the competing service GitLab, which had no interest in running the software. A brittle mirror of the project repositories (no user pages) is still online, thanks to Archive Team. I originally used Gitorious for my repositories rather than GitHub, but I came around and moved everything over before they shut the service down. Well, almost everything: as it turns out, some of the LScube repos were not saved, because they were only mirrors… except that the domain for that project expired, so we lost access to the main website and Git repository, too.

  • Google Code was Google’s project hosting service, which started by offering Subversion repositories, downloads, issue trackers and so on. Very few of the projects I tracked used Google Code to begin with, and it was finally shut down in 2015, with all projects made read-only except for the ability to set up a redirect to a new homepage. The biggest project I followed on Google Code was libarchive, and they migrated it fully to GitHub, including the issues.

  • Gemcutter used to be a repository for Ruby gem packages. I’ve actually forgotten why it was started, but for a while it was the alternative repository where a lot of cool kids stored their libraries. Gemcutter got merged back into rubygems.org, and the old links now appear to redirect to the right pages. Yay!

With such a list of project hosting websites going the way of the dodo, an obvious conclusion to draw is that hosting things on your own servers is the way to go. I would still argue otherwise. Despite the number of hosting websites going away, it feels to me like the vast majority of the information we lost over the past 13 years comes from blogs and personal websites badly used for documentation. With the exception of RubyForge, all the examples above were properly archived one way or another, so at least the majority of the historical memory is not gone.

Not using project hosting websites is obviously an option. Unfortunately it comes with the usual problems, and with even higher risks of losing data. Even GitLab’s snafu had a higher chance of being fixed than whatever your one-person project has when the owner gets tired, runs out of money, graduates from university, or even dies.

So what can we do to make things more resilient to disappearing? Let me suggest a few points of action, which I think are relevant and possible right now to make things better for everybody.

First of all, let’s all support the Internet Archive by donating. I set up a €5/month donation, which gets matched by my employer. Among other things, the Archive provides the Wayback Machine, which is how I can still fetch some of the content both from my past blogs and from the blogs of people who deleted or moved them, or even passed on. The Internet is our history; we can’t let it disappear for lack of effort.

Then, for what concerns the projects themselves, it may be a bit less clear-cut. The first thing I’ll be much more wary about in the future is relying on the support sites when writing comments or commit messages. Issue trackers get lost, or renumbered, and so references to them break far too easily. Be verbose in your commit messages and, if needed, quote the issue, instead of just writing “Fix issue #1123”.
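As a hypothetical example of what I mean (the issue title and details are made up), compare the bare reference above with a message that survives the tracker’s disappearance:

```
Fix crash when the configuration file is empty

Loading an empty configuration file made the parser dereference a
NULL document root and crash at startup. Treat an empty file the
same as a missing one and fall back to the built-in defaults.

This was reported as issue #1123, "segfault at launch after fresh
install".
```

Even if the tracker vanishes, the commit still tells the whole story on its own.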

Even mailing lists are not safe. While Gmane is supposedly still online, most of the Gmane links from my own blog are broken, and I need to find replacements for them.

This brings me to the following problem: documentation. Wikis made documenting things significantly cheaper, as you don’t need to learn much, neither in the form of syntax nor in the form of process. Unfortunately, backing up wikis is not easy because a database is involved, and it’s very hard, when taking over a project whose maintainers are unresponsive, to find a good way to import the wiki. GitHub makes things easier thanks to GitHub Pages, and that’s at least a starting point. Unfortunately it makes the process a little messier than a wiki, but we can survive that, I’m sure.
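That said, at least for MediaWiki-based wikis (an assumption on my part; other engines differ), you don’t actually need database access to salvage the content: the public API will hand you the wikitext of every page. A minimal sketch, with a hypothetical wiki endpoint:

```python
#!/usr/bin/env python3
"""Sketch: salvage a MediaWiki wiki through its public API, no DB access needed."""

import json
import pathlib
import urllib.parse
import urllib.request

API = "https://wiki.example.org/w/api.php"  # hypothetical wiki endpoint
OUT = pathlib.Path("wiki-backup")
OUT.mkdir(exist_ok=True)

def api_get(**params):
    """Perform a GET request against the MediaWiki API, returning parsed JSON."""
    params["format"] = "json"
    url = API + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Enumerate every page, then save its raw wikitext to a local file.
params = {"action": "query", "list": "allpages", "aplimit": "500"}
while True:
    data = api_get(**params)
    for page in data["query"]["allpages"]:
        title = page["title"]
        parsed = api_get(action="parse", page=title, prop="wikitext")
        text = parsed["parse"]["wikitext"]["*"]
        (OUT / (title.replace("/", "_") + ".wiki")).write_text(text, encoding="utf-8")
    if "continue" not in data:
        break
    params["apcontinue"] = data["continue"]["apcontinue"]
```

It won’t preserve the history or the attachments, but it beats losing the documentation outright.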

Myself, I decided to use a hybrid approach. Given that some of my projects, such as unieject, managed to migrate from SourceForge to BerliOS, to Gitorious, to GitHub, I have now set up a number of redirects on my website, so that their official homepages will read https://www.flameeyes.eu/p/glucometerutils, and that will redirect to wherever I’m hosting them at the time.
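The idea is trivial to replicate on any static site; here is a minimal sketch (the project names and target URLs are placeholders, and this is not necessarily how my own site implements it) that generates meta-refresh stubs for such permalinks:

```python
#!/usr/bin/env python3
"""Sketch: generate static redirect stubs for /p/<project> permalinks."""

import pathlib

# Placeholder map from permanent short name to current hosting URL.
# Only this map needs updating when a project moves again.
PROJECTS = {
    "glucometerutils": "https://github.com/example/glucometerutils",
    "unieject": "https://github.com/example/unieject",
}

TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="refresh" content="0; url={url}">
    <link rel="canonical" href="{url}">
    <title>Redirecting</title>
  </head>
  <body><a href="{url}">This project has moved.</a></body>
</html>
"""

for name, url in PROJECTS.items():
    stub = pathlib.Path("public/p") / name / "index.html"
    stub.parent.mkdir(parents=True, exist_ok=True)
    stub.write_text(TEMPLATE.format(url=url), encoding="utf-8")
```

The permalink stays stable forever; only the map changes when the hosting does.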

Bad branching, or a Quagga call for help

You might remember that for a while I worked on getting Quagga in shape in Gentoo. The reason I was doing that is that I needed Quagga for the ADSL PCI modem I was using at home. Since right now I’m on the other side of the world, and my router decided to die, I’m probably going to stop maintaining Quagga altogether.

There are other reasons too, which is probably why for a while we had a Quagga ebuild with invalid copyright headers (it was a contribution of somebody working somewhere, but over time it had been rewritten to the point that it didn’t really make sense not to use our standard copyright header). One part is the bad state of the documentation, which makes it very difficult to understand how to set up even the most obvious of configurations, but the main issue is the way the Quagga project handles branching.

So let’s take a step back and look at one thing about Quagga: when I picked it up, there were two or three external patches configured by USE flags; these are usually very old, and they are not included in the main Quagga sources. They are not minimal patches either: they introduce major new functionality, and they are very intrusive (which is why they are not simply always applied). This is probably due to the fact that Quagga is designed to be the routing daemon for Linux, with a number of possible protocol frontends connecting to the same backend (zebra). Over time, instead of self-contained, easily-outdated patches implementing new protocols, we started getting whole new repositories (or at least branches) with said functionality, thanks to the move to Git, which makes forking all too easy, even if that’s not always a bad thing.

So now you get all these repositories with extra implementations, not all of which are compatible with one another, and most of which are not supported by upstream. Is that enough trouble? Not really. As I said before, Paul Jakma, the main developer of the project, is of the idea that he doesn’t need a “stable” release, so he only makes releases when he cares to, and maintains that it’s the vendors’ task to maintain backports. In that spirit, some people started the Release Engineering effort for Quagga, but…

When you think about a “Release Engineering” branch, you think of something akin to Greg’s stable kernel releases: you take the latest version, and then you patch over it to make sure it keeps working fine, backporting the new features and fixes that hit master. Instead, what happens here is that Quagga-RE forked off version 0.99.17 (we’re now at 0.99.21 on master, although Gentoo is still on .20 since I really can’t be bothered), and they are applying patches over that.

Okay, so that’s still something: getting backports from master onto a known-good revision is a good idea, isn’t it? It would be, if it weren’t for the fact that what lands there is actually new features applied over the old version! If you check, you’ll see that they have implemented a number of features in the RE branch which are not in master, with the result that master is neither a superset nor a subset of the RE branch.
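This kind of divergence is easy to demonstrate with plain git; here is a minimal sketch (the branch names are hypothetical stand-ins for the two Quagga lines) that counts the commits unique to each side:

```python
#!/usr/bin/env python3
"""Sketch: show that two branches diverged, i.e. neither contains the other."""

import subprocess

def only_in(branch: str, other: str) -> list[str]:
    """One-line summaries of commits reachable from `branch` but not from `other`."""
    result = subprocess.run(
        ["git", "log", "--oneline", branch, f"^{other}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

# Hypothetical ref names; adjust for the actual remotes.
re_only = only_in("quagga-re", "master")
master_only = only_in("master", "quagga-re")

print(f"{len(re_only)} commits only in the RE branch")
print(f"{len(master_only)} commits only in master")
# If both counts are non-zero, the two lines are true forks of each
# other: master is neither a superset nor a subset of the RE branch.
```

In a sane release-engineering setup the first count can still be large (backported commits carry new hashes), but every change in the RE branch would exist in master in some form; here, entire features exist only on the RE side.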

Add to this that some of the contributors of new code seem to be unclear on what a license is, causing discussions on the mailing list about the interpretation of the code’s license, and you can probably see why I don’t care about keeping this running, given that I’m not using it in production anywhere.

Beforehand, I still cared about this, knowing that Alin was using it and co-maintaining it… but now that Alin has been retired, I’d be the sole maintainer of a piece of software that rarely works correctly and is schizophrenic in its development, so I really don’t have the extra time to spend on it.

So, to finish this post with a very clear message: if you use Gentoo and rely on Quagga for production usage, please step up now, or it might just break without notice, as nobody is caring for it! And if a Quagga maintainer reads this: please, please start making your releases make sense, I beg you.