Planets, Clouds, Python

Half a year ago, I wrote some thoughts about writing a cloud-native feed aggregator. Since then, I actually started sketching out how I would design it myself, and I even went through the (limited) trouble of having it approved for release. But I have not released any code; to be honest, I have not written any code at all. The repository has been sitting idle.

Now, with the demise of Python 2 coming soon, and with me not interested in keeping a server around almost exclusively to run Planet Multimedia, I started looking into this again. The first thing I realized is that I want both to reuse as much existing code as I can, and to integrate with “modern” professional technologies such as OpenTelemetry, which I have come to appreciate at work, even if it sounds like overkill.

But that’s where things get complicated: while going full “left-pad”, with a module for literally everything, is not something you’ll find me happy about, a quick look at feedparser, probably the most common module for reading feeds in Python, shows just how much code is spent covering for old Python versions (before 2.7, even), or implementing minimum viable interfaces to avoid any mandatory dependencies.

Thankfully, as Samuel from NewsBlur pointed out, it’s relatively trivial to just fetch the feed with requests, and then pass it down to feedparser. And since there are integration points for OpenTelemetry and requests, having an instrumented feed fetcher shouldn’t be too hard. That’s probably going to be my first focus when writing Tanuga, next weekend.
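
Just to give an idea of what I have in mind, here is a minimal sketch of such a fetcher. It assumes the opentelemetry-instrumentation-requests package is installed; the fetch_feed() helper name and the timeout value are made up for the example.

# Sketch only: requests does the HTTP work (and gets traced through
# OpenTelemetry), feedparser only parses the payload it is handed.
import feedparser
import requests
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# Instrument requests once at startup so every fetch produces a span.
RequestsInstrumentor().instrument()


def fetch_feed(url):
    """Fetch a feed over HTTP and parse it, without letting feedparser
    do any networking of its own."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    # feedparser.parse() accepts the raw bytes of the document directly.
    return feedparser.parse(response.content)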

Speaking of NewsBlur, the chat with Samuel also made me realize how much of it is still tied to Python 2. Since I’ve gathered quite a bit of experience porting code to Python 3 at work, I’m trying to find some personal time to contribute small fixes towards running it on Python 3. The biggest hurdle I’m facing right now is setting it up on a VM so that I can get it running on Python 2 to begin with.
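
To give an idea of the kind of small fixes I mean (this is a generic illustration, not actual NewsBlur code), most of the work is making each module behave the same on both interpreters before flipping the switch:

# Generic example of a forward-compatible fix: same behaviour on
# Python 2 and Python 3, so the port can land one module at a time.
from __future__ import absolute_import, print_function, unicode_literals


def normalize_title(value):
    # On Python 2 this is usually a byte string; on Python 3 it may
    # already be text. Normalizing at the boundary keeps the rest of
    # the code oblivious to the difference.
    if isinstance(value, bytes):
        return value.decode("utf-8", "replace")
    return value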

Why am I back looking at this pseudo-actively? Well, the main reason is that rawdog is still using Python 2, and that is going to be a major pain, security-wise, next year. But it’s also the last non-static website that I run on my own infrastructure, and I would really love to get rid of it entirely. Once I do that, I can at least stop running my own (dedicated or virtual) servers. And that’s going to save me time (and money, though time is the more important of the two here).

My hope is that once I find a good way to migrate Planet Multimedia to a Cloud solution, I can move the remaining static websites elsewhere, likely to Netlify as I did for my photography page. After that, I can shut down the last remaining server and be done with sysadmin work outside of my flat. Because honestly, it’s not worth my time to run all of this.

I can already hear a few folks complaining with the usual remark of “it’s someone else’s computer!” — but the answer is that yes, it’s someone else’s computer, but a computer belonging to someone who is paid to do a good job with it. This is possibly the only way for me to carve out some time to work on more Open Source software.

Boot-to-Kodi, 2019 edition

This weekend I’m on call for work, so my girlfriend and I decided to take a few chores off our to-do lists. One of the things on mine was to run the now episodic maintenance over the software and firmware of the devices we own at home. I call it episodic because I no longer spend every evening looking after servers, whether at home or remote, but rather look at them when I need to.

In this case, I honestly can’t remember the last time I ran updates on the HTPC I use for Kodi and for the UniFi controller software. And that meant that after the full update I reached the now not uncommon situation of Kodi refusing to start at boot, or even when SSH’ing into the machine and starting the service by hand.

The error message, for ease of Googling, is:

[  2092.606] (EE) 
Fatal server error:
[  2092.606] (EE) xf86OpenConsole: Cannot open virtual console 7 (Permission denied)

What happened in this case is that the method I had been using to boot to Kodi was a systemd unit lifted from Arch Linux, which started a new session, X11, and Kodi all at once. This has now stopped working: Xorg can no longer access the TTY, because systemd does not think that session should have access to the console.

There are supposedly ways to convince systemd that it should let the user run X11 without so much fluff, but after an hour of trying a number of different combinations I was not getting anywhere. I finally found one way to do it, and that’s what I’m documenting here: use lightdm.

I have found a number of different blog posts out there that try to describe how to do this, but none of them appear to apply directly to Gentoo.

These are the packages that would be merged, in order: 
 
Calculating dependencies... done! 
[ebuild   R    ] x11-misc/lightdm-1.26.0-r1::gentoo  USE="introspection -audit -gnome -gtk -qt5 -vala" 0 KiB

You don’t need Gtk, Qt or GNOME support for lightdm to work. But if you install it this way (which I’m surprised Gentoo even allows), it will fail to start out of the box! To configure what you need, you have to manually write this to /etc/lightdm/lightdm.conf:

[Seat:*] 
autologin-user=xbmc 
user-session=kodi 
session-wrapper=/etc/lightdm/Xsession

In this case, my user is called xbmc (this HTPC was set up well before the rename), and this effectively turns lightdm into a bridge from systemd to Kodi. The kodi session is installed by the media-tv/kodi package, so there’s no other configuration needed. It just… worked.

I know that some people would find the ability to do this kind of customization via “simple” text files empowering. For me it’s just a huge waste of time, and I’m not sure why there isn’t an obvious way for systemd and Kodi to get along. I would hope somebody builds one in the future, but for now I guess I’ll live with this.

I’m told Rust is great, where are the graphics libraries?

While I’m still a bit sour that Mozilla decided to use the same name for their language as an old project of mine (which is not a new thing for Mozilla anyway, if anyone remembers the days of Phoenix and Firebird), I have been watching from the sidelines as the Rust language establishes itself as a way forward to replace many applications of embedded C with a significantly safer alternative.

I have indeed been happy to see so much UEFI work happening in Rust, because it seems to me like we have come far enough that we can sacrifice some of the extreme performance of C for some safety.

But one thing that I still have not seen is a good selection of graphics libraries, and that is something that I’m fairly disappointed by. Indeed, I have been told that there are Rust bindings for the classic C graphics libraries — which is pointless, as then the part that needs safety (the parsing) is still performed in C!

The reason why I’m angry about this is that I still have one project, unpaper, which I inherited as a big chunk of C and which could definitely be rewritten in a safer language. But I would rather not do that in a higher-level language like Python, due to the already slow floating-point calculations and the huge memory usage.

Right now, unpaper is using libav, or ffmpeg, or something with their interface, depending on how much they fought this year. This is painful, but given that each graphics library implements its interface in a different way, I couldn’t find a better, safe way to implement image processing. I was hoping that with all the focus on Rust out there, particularly from Mozilla, implementing graphics parsing libraries would be high on the list of priorities.

I think it’s librsvg that was ported to Rust — which was probably a great idea to prioritize, given it is exactly the type of format where C performs very poorly: string parsing. But I’m surprised nobody tried to make an API-compatible libpng or libtiff. It sounds to me like Rust is the perfect language for this type of work.

At any rate, if anyone finally decides to implement a generic graphics file input/output library, with support for at least TIFF, PNM and PNG, I’d love to know. And after that I would be happy to port unpaper to it — or if someone wants to take the unpaper code as the basis for reimplementing it as a proof of concept, that’d be awesome.

The problem for a lot of these libraries is that you have to maintain support for a long list of quirks and extensions that piled up on the formats over time. And while you can easily write tests to maintain bit-wise compatibility with the “original” C-based libraries as far as bitmap rendering is concerned (even for formats that are not plain bitmaps, such as JPEG and WebP), there are more things that are not obvious to implement, such as colour profiles, and metadata in general.

Actually, I think that there is a lot of space here to build up a standard set of libraries for graphics formats and metadata, since there’s at least some overlap between them, and having a bigger group of people working on separate but API-similar libraries for the various graphics formats would be a significant advantage for Rust over other languages.

Dear Amazon, please kill the ComiXology app

Dear Amazon, Dear Comixology,

Today, May the 4th, is Free Comic Book Day. I thought this was the right time to issue a plea to you: it’s high time you got rid of the ComiXology app, and asked your customers to just use a unified Kindle app to read their comic books.

I love comic books, and I have found ComiXology an awesome service, with a great selection and good prices. I have been a customer for many years, starting back when I had just bought an iPad for a job. For a while, I was signed up for their ComiXology Unlimited service, which for a monthly fee gave you access to an astounding amount of comics — particularly a lot of non-mainstream comics, a great way to discover some interesting independent authors.

When Amazon bought ComiXology, I was at the same time pleased and afraid — pleased because that could have (and did) boost ComiXology’s reach, afraid because there was always a significant overlap with the Kindle app, ecosystem and market. And it turned out that my fears were well founded, as I discovered last year.

I don’t want to repeat the specifics here; the short version is that the ComiXology app has been broken for over a year now for any Android user who, like me, relies on microSD storage rather than internal storage. After multiple denials from ComiXology support, the blog post helped bring this to the attention of at least one engineer on the team, who actually sent me a reply nearly 11 months ago:

I followed up with our team and a few weeks ago we met about your report. We realized you are 100% correct, and we’re re-evaluating our decision RE adoptable storage. I don’t have news on when that answer is coming, but the topic is open internally and I want to thank you for your detailed emails and notes. Hopefully we can figure this out and get you back.

Matt, ComiXology Support, May 30, 2018

Unfortunately, months passed, and no changes were pushed to the app. The tablet got an Android OS update, ComiXology got updates every few months, but to this day the app has no way to store its content on microSD cards. The last contact I had from support was last summer:

Our team has tracked down what’s going on and you are correct in your analysis. They are working on a solution, though we do not have an estimate for when you will be seeing it. We will keep on checking in on this and making sure things move along.

Erin, ComiXology Support, August 13th, 2018

This is not just a simple annoyance. There is a workaround, which involves using the microSD as so-called “portable storage”, and telling the app to store the comics on the SD card itself. But it has another side effect: you can’t then use the SD card to download Netflix content. The Netflix app cannot be moved to the card, either as adopted or portable storage – just like ComiXology – but it supports selecting an “adopted storage” microSD card for storage, and actually defaults to it. So you end up choosing between Netflix and ComiXology.

And here’s the kicker: the Kindle app, developed by a different branch of the same company, does this the right way.

And this brings me back to the topic of this post: the Kindle app is not stellar for reading comic books in my experience; ComiXology did a much better job at navigating panels. But that’s where it stops — Kindle has better library handling, better background download support, and clearly better support for modern Android releases. But I can’t read the content I already paid for on ComiXology with it.

I think the best value for the customers, for the people actually reading the comic books, would be for Amazon to just stop investing engineering effort into the ComiXology app at this point, which clearly appears understaffed and is not making any forward progress anyway, and instead allow reading ComiXology content in the Kindle apps. And maybe on Kindle hardware — I would love to read my manga collection on a Kindle, even if I had to upgrade from my Paperwhite (but please, if you require me to do that, use USB-C for the next generation!)

Will you, Amazon?

“Planets” in the World of Cloud

As I have written recently, I’m trying to reduce the number of servers I directly manage, as it’s getting annoying and, honestly, out of touch with what my peers are doing right now. I already hired another company to run the blog for me, although I do keep access to all its information at hand and can migrate it elsewhere if needed. I also gave Firebase Hosting a try for my tiny photography page, to see if it would be feasible to replace my homepage with that.

But one of the things that I still definitely need a server for is keeping Planet Multimedia running, despite its tiny userbase and dwindling content (if you work in FLOSS multimedia, and you want to be added to the Planet, drop me an email!)

Right now, the Planet is maintained through rawdog, which is a Python script that works locally with no database. This is great to run on a vserver, but in a world where most of the investment and improvements go into Cloud services, it’s not really viable as an option. And to be honest, the fact that this is still using Python 2 worries me more than a little, particularly when the author insists that Python 3 is a different language (it isn’t).

So, I’m now in the market to replace the Planet Multimedia backend with something that is “Cloud native” — that is, designed to be run on some cloud, and possibly lightweight. I don’t really want to start dealing with Kubernetes, running my own PostgreSQL instances, or setting up Apache. I really would like something that looks more like the redirector I blogged about before, or like the stuff I deal with for a living at work. Because it is 2019.

So, sketching this “on paper” very roughly, I expect such software to be along the lines of a single binary with a configuration file, which outputs static files that are served by the web server. Kind of like rawdog, but long-running. Changing the configuration would require restarting the binary, but that’s acceptable. No database access is really needed, as caching can be maintained at the process level — although that would mean that permanent redirects couldn’t be rewritten in the configuration. So maybe some configuration database would help, but it seems most clouds support some simple unstructured data storage that would solve that particular problem.

From my experience at work, I would expect the long-running binary to itself be a webapp, so that you can either inspect (read-only) what’s going on, or make changes to the configuration database with it. And it should probably have independent, parallel execution of fetchers for the various feeds, which then store the received content into a shared (in-memory only) structure, which is in turn used by the generation routine to produce the output files. It may sound like over-engineering the problem, but that’s a bit of a given for me, nowadays.
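
To make that sketch a bit more concrete, here is roughly the shape I have in mind, in Python for illustration only; every name in here (FeedCache, render_planet, the feed URL, the intervals) is a placeholder, not an actual design.

# Rough shape of the idea: independent fetchers fill a process-level,
# in-memory cache; a separate routine renders static output from it.
import threading
import time

import feedparser
import requests

# Placeholder; in the real thing this would come from the configuration.
FEEDS = ["https://example.org/multimedia/feed.xml"]


class FeedCache:
    """In-memory cache shared between the fetchers and the generator."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def update(self, url, parsed):
        with self._lock:
            self._entries[url] = parsed

    def snapshot(self):
        with self._lock:
            return dict(self._entries)


def fetcher(url, cache, interval=1800):
    # One of these runs per feed, in parallel with the others.
    while True:
        response = requests.get(url, timeout=30)
        if response.ok:
            cache.update(url, feedparser.parse(response.content))
        time.sleep(interval)


def render_planet(cache, output_path="index.html"):
    # Stand-in for the real templating: write out the static page that
    # the web server (or object storage bucket) would serve.
    entries = cache.snapshot()
    with open(output_path, "w") as output:
        output.write("<!-- aggregated %d feeds -->\n" % len(entries))


cache = FeedCache()
for feed_url in FEEDS:
    threading.Thread(target=fetcher, args=(feed_url, cache), daemon=True).start()

# In the real service this would live inside the webapp's scheduler; here
# we just regenerate the static output every few minutes.
while True:
    render_planet(cache)
    time.sleep(300)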

To be fair, the part that makes me most uneasy of all is authentication, but Identity-Aware Proxy might be a good solution for this. I have not looked into it yet, but I have used something similar at work.

I’m explicitly ignoring the serving-side problem: serving static files is a problem that has mostly been solved, and I think all cloud providers have some service that allows you to do that.

I’m not sure if I will be able to work more on this, rather than just providing a sketched-out idea. If anyone knows of something like this already, or feels like giving building it a try, I’d be happy to help (employer permitting, of course). Otherwise, if I find some time to build something like this, I’ll try to get it released as open source, to build upon.

Why do we still use Ghostscript?

Late last year, I had a bit of a Twitter discussion about the fact that I can’t think of a good reason why Ghostscript is still a standard part of the Free Software desktop environment. The comments started from a security issue related to file access from within a PostScript program (i.e. a .ps file), and at the time I actually started drafting some of the content that has now become this post. I then shelved most of it because I was busy and it was not topical.

Then Tavis had to bring this back to the attention of the public, and so I’m back writing this.

To be able to answer the question I pose in the title, we first have to define what Ghostscript is — and the short answer is, a PostScript renderer. Of course it’s a lot more than just that, but for the most part, that’s what it is. It deals with PostScript programs (or documents, if you prefer), and renders them into different formats. PostScript is rarely, if at all, used on modern desktops — not just because it’s overly complicated, but because it’s just not that useful in a world that has mostly settled on PDF, which is essentially “compiled PostScript”.

Okay, not quite. There are plenty of qualifications that go with that whole paragraph, but I think it matches the practicalities of the case fairly well.

PostScript has found a number of interesting niche uses, though, a lot of which centre around printing, because PostScript is the language that older (early?) printers used. I have not seen many modern printers speak PostScript, at least not since my Kyocera FS-1020, and even those that do tend to support alternative “languages” and raster formats as well. On the other hand, because PostScript was a lingua franca for printers, CUPS and other printer-related tooling still use PostScript as an intermediate language.

In a similar fashion, quite a few pieces of software that deal with faxes (yes, faxes) tend to make use of Ghostscript itself. I would know, because I wrote one, under contract, a long time ago. The reason is frankly pragmatic: on the client side, you want Windows to “print to fax”, and providing a virtual PostScript printer is very easy — at that point you want to convert the document into something that can be easily shoved down the fax software’s throat, which ends up being TIFF (because TIFF is, as I understand it, the closest encoding to what physical faxes use). And Ghostscript is very good at doing that.

Indeed, I have used (and seen used) Ghostscript in many cases to basically combine a bunch of images into a single document, usually in TIFF or PDF format. It’s very good at doing that, if you know how to use it, or if you copy-paste from other people’s implementations.

Often, this is done through the command line, too, and the reason for that is to be found in the licenses used by the various Ghostscript implementations and versions over time. Indeed, while many people think of Ghostscript as an open source Swiss Army Knife of document processing, it is actually dual-licensed. The Wikipedia page for the project shows eight variants, with at least four different licenses over time. The current options are AGPLv3 or the commercial paid-for license — and I can tell you that a lot of people (including the folks I worked under contract for) don’t really want to pay for that license, preferring instead the “arm’s length” aggregation of calling the binary rather than linking it in. Indeed, I wrote a .NET library to do just that. It’s optimized for (you guessed it) TIFF files, because it was a component of an Internet Fax implementation.
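
For what it’s worth, that arm’s-length pattern is little more than this, sketched in Python rather than .NET; the function and file names are made up for the example, while the flags are the standard Ghostscript options for Group 4 fax-style TIFF output.

# Call the gs binary as a subprocess instead of linking it in, converting a
# PostScript print job into a fax-friendly multi-page TIFF (G4, 204x196 dpi).
import subprocess


def ps_to_fax_tiff(ps_path, tiff_path):
    subprocess.check_call([
        "gs", "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=tiffg4",
        "-r204x196",
        "-sOutputFile=" + tiff_path,
        ps_path,
    ])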

So where does this leave us?

Ten years ago or so, when effectively every Free Software desktop PDF viewer was forking the XPDF source code to adapt it to whatever rendering engine it needed, it took a significant list of vulnerabilities that had to be fixed time and time again for the Poppler project to take off, and create One PDF Renderer To Rule Them All. I think we need the same for Ghostscript. With a few differences.

The first difference is that I think we need to take a good look at what Ghostscript, and PostScript, are useful for in today’s desktops. Combining multiple images into a single document should _not_ require processing all the way through PostScript. There’s no reason to! Particularly not when the images are just JPEG files, which PDF can embed directly. Having a tool that is good at combining multiple images into a PDF, with decent options for page size and alignment, would probably replace many of the uses of Ghostscript in my own tools and scripts over the past few years.
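
For the JPEG case at least, something like this already exists in Python: the img2pdf module embeds the JPEG streams into a PDF container without re-encoding them, and without any PostScript step. A sketch, with made-up file names:

# Combine JPEG scans into a single PDF without any PostScript processing.
import img2pdf

pages = ["scan-001.jpg", "scan-002.jpg", "scan-003.jpg"]

with open("combined.pdf", "wb") as output:
    output.write(img2pdf.convert(pages))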

And while rendering PostScript for display and rendering it for print are similar enough tasks, I have some doubts that the same code would work right for both. PostScript and Ghostscript are often used in _networked_ printing as well, in which case there is a lot of processing of untrusted input — both for display and for printing. Sandboxing – and possibly writing this in a language better suited than C to dealing with untrusted input – would go a long way towards preventing problems there.

But there are a few other interesting points that I want to make about this. I can’t think of any good reason for _desktops_ to support PostScript out of the box in 2019. While I can still think of a lot of tools, particularly from the old-timers, that use PostScript as an intermediate format, most people _in the world_ would use PDF nowadays to share documents, not PostScript. It’s kind of like sharing DVI files — which I have done before, but I now wonder why. While both formats might have advantages over PDF, in 2019 they have definitely lost the format war. macOS might still support both (I don’t know), but Windows and Android definitely don’t, which makes them pretty useless for sharing knowledge with the world.

What I mean by that is that it’s probably high time PostScript became an _optional_ component of the Free Software Desktop, one that users need to enable explicitly _if they ever need it_, just to limit the risks that come with accepting, displaying and thumbnailing full, Turing-complete programs masquerading as documents. Even Microsoft stopped running macros in Office documents by default, once they realized what kind of footgun they had become.

Of course talk is cheap, and I should probably try to help directly myself. Unfortunately I don’t have much experience with graphics formats, besides maintaining unpaper, and that has not gone particularly well either: I tried using libav’s image loading, and it turns out it’s actually a mess. So I guess I should either invest my time in learning enough about building libraries for image processing, or poke around to see if someone has written a good multi-format image processing library in, say, Rust.

Alternatively, if someone starts working on this and wants help with either reviewing the code, or with integrating the final output into the places where Ghostscript is used, I’m happy to volunteer my time. I’m fairly sure I can convince my manager to let me do some of that work as a 20% project.

Facebook, desktop apps, and photography

This is an interesting topic, particularly because I had not heard anything about it up to now, despite having many semi-pro and amateur photographer friends (I’m a wannabe). It appears that starting August 1st, Facebook will stop allowing desktop applications to upload photos to albums.

Since I have been uploading all of my Facebook albums through Lightroom, that’s quite a big deal for me. On Jeffrey Friedl’s website, there’s this note:

Warning: this plugin will likely cease to work as of August 1, 2018, because Facebook is revoking photo-upload privileges for all non-browser desktop apps like this.

As of June 2018, Adobe and I are in discussions with Facebook to see whether something might be worked out, but success is uncertain.

We are now less than a month from the deadline, and there appears to be no update on this. Is it Facebook trying to convince people to just share all their photos as they were shot? Is it Adobe not paying attention, busy instead trying to get people onto their extremely expensive Adobe CC Cloud products? (I have over 1TB of pictures; I can’t use their online service, it would cost me so much more in storage!) I don’t really know, but it clearly seems to be the case that my workflow is being deprecated.

Leaving aside the impact of this on me alone, I would expect that most pro and semi-pro photographers would want to be able to upload their pictures without having to manually drag them through Facebook’s flaky interface. And it feels strange that Facebook would want to stop “owning” those photos altogether.

But there’s a bigger impact in my opinion, one that should worry privacy-conscious users (as long as they don’t subscribe to the fantasy ideal of people giving up on sharing pictures): this move erodes the strict access controls on picture publishing that have defined social media up to now, for any user who has been relying on offline photo editing.

In my case, the vast majority of the pictures I take are actually landscapes, flowers, animals, or in general not private events. There’s the odd conference or con I bring my camera to (or should I say used to bring it to), or a birthday party or other celebration. Right now, I have been uploading all the non-people pictures as public (and copied to Flickr), and everything that involves people as friends-only (and only rarely uploaded to Flickr with “only me” access). Once the changes go into effect, I lose the ability to make simple access control decisions.

Indeed, if I were to upload the content to Flickr and use friends-only limited access, very few people would be able to see any of the pictures: Flickr lost any pretension of being a social media platform once Yahoo stopped being relevant. And I doubt that the acquisition by SmugMug will change that part, as it would require duplicating a social graph that Facebook already has. So I’m fairly sure a very common solution is going to be to make the photos public, and maybe the account not discoverable. After all, who might be mining the Web for unlisted accounts of vulnerable people? (That’s sarcasm, if it wasn’t clear.)

In my case it’s just going to be a matter of not bringing my camera to private events anymore. Not the end of the world, since I’m not particularly good at portrait photography anyway, and it’s not my particular area of interest. But I do think that there are going to be quite a few problems in the future.

And if you think this is not going to be a big deal at all, because most parties have pictures uploaded by people directly from their mobile phones… I disagree. Weddings, christenings, cons, sports matches: all these events usually have their share of professional photographers, and the photographers need a way to share the output not only with the people who hired them, but also with those people’s friends, like the invitees at a wedding.

And I expect that for many professionals, it’s going to be a matter of finding a new service to upload the data to. Mark my words: I expect that in the future we’ll see leaks of wedding pictures used to dox notable people. And those will be due to insecure, or badly secured, photo sharing websites meant to replace Facebook after this change in terms.

Amazon, Project Gutenberg, and Italian Literature

This post starts in the strangest of places. The other night, my mother was complaining about how few free Italian books are available on the Kindle Store.

Turns out, a friend of the family, who also has a Kindle, has been enjoying reading older English books for free on hers. As my mother does not speak or read English, she’d been complaining that the same is not possible in Italian.

The books she’s referring to are older books whose copyright has expired, and which are available on Project Gutenberg. Indeed, the selection of Italian books on that site is fairly limited, and it is something that I have been saddened by before.

What does Project Gutenberg have to do with Kindle? Well, Amazon appears to collect books from Project Gutenberg, convert them to Kindle’s native format, and “sell” them on the Kindle Store. I say “sell” because for the most part these are listed at $0.00, and are thus available for free.

While there is no reference to Project Gutenberg on their store pages, there’s usually a note on the book:

This book was converted from its physical edition to the digital format by a community of volunteers. You may find it for free on the web. Purchase of the Kindle edition includes wireless delivery.

Another important point is that (again, for the most part), the original language editions are also available! This is how I started reading Jules Verne’s Le Tour du monde en quatre-vingts jours while trying to brush up my French to workable levels.

Having these works available on the Kindle Store, free of both direct cost and delivery charge, is in my opinion a great step towards distributing knowledge and culture. As my nephews (blood-related and otherwise) start reaching reading age, I’m sure the presents I will give them are going to be Kindle readers, because between the access to this wide range of free books and the built-in touch dictionary, they feel like something I would have thoroughly enjoyed using when I was a kid myself.

Unfortunately, this is not all roses. The Kindle Store still georestricts some books, so from my Kindle Store (which is set to the US), I cannot download Ludovico Ariosto’s Orlando Furioso in Italian (though I can download the translation for free, or buy a non-Project Gutenberg version of the original Italian text for $0.99). And of course there is the problem of coverage for the various languages.

Italian, as I said, appears to fare pretty badly when it comes to coverage. If I look at Luigi Pirandello’s books, there are only seven entries, one of which is in English, and another of which is a duplicate. Compare this with the actual list of his works and you can see that it’s very lacking. And since Pirandello died in 1936, his works are already in the public domain.

Since I have not actually been active with Project Gutenberg, I only have second-hand knowledge of why this type of problem happens. One of the things I remember being told about this is that most of the books you buy in Italian stores are either annotated editions, or updated for modern Italian, which causes their copyright to be extended to the death of the editor, annotator or translator.

This lack of access to Italian literature is a big bother, and quite a bit of a showstopper to giving a Kindle to my Italian “nephews”. I really wish I could find a way to fix the problem, whether it is by technical or political means.

On the political side, one could expect that, with the previous Italian government’s focus on culture, and the current government’s focus on free-as-in-beer options, it would be easy to convince them to release all of the Italian literature that is in the public domain for free. Unfortunately, I wouldn’t even know where to start asking them to do that.

On the technical side, maybe it is high time that I put significant effort into my now seven-year-old project of extracting the data from the data files of Zanichelli’s Italian literature software (likely developed at least in part with public funds).

The software was developed for Windows 3.1 and can’t be run on any modern computer. I should probably send the ISOs of it to the Internet Archive, and they may be able to keep it running there on DOSBox with a real copy of Windows 3.1, since Wine appears not to support the 16-bit OLE interfaces that the software depends on.

If you wonder what would be a neat thing for Microsoft to release as open source, I would suggest that the whole Windows 3.1 source code would be a good starting point. If nothing else, with the right license it would be possible to replace Wine’s half-complete 16-bit DLLs with official, or nearly official, copies.

I guess it’s time to learn more about Windows 3.1 in my “copious spare time” (h/t Charles Stross), and start digging into this. Maybe Ryan’s 2ine might help, as OS/2 and Windows 3.1 are closer than the latter is to modern Windows.

We need Free Software Co-operatives, but we probably won’t get any

The recent GitHub craze got a number of Free Software fundamentalists to hurry away from GitHub towards other hosting solutions.

Whether it was GitLab (a fairly natural choice given the nature of the two services), BitBucket, or SourceForge (which is trying to rebuild its reputation as a Free Software friendly hosting company), there are a number of new SaaS providers to choose from.

At the same time, a number of projects have been boasting (maybe a bit too smugly, in my opinion) that they self-host their own GitLab or similar software, and suggesting that other projects do the same in order to be “really free”.

A lot of the discourse appears to be missing nuance about the compromises involved in using SaaS hosting providers, self-hosting as a community, and self-hosting as a single project, so I thought I would gather my thoughts on this in one single post.

First of all, you probably remember my thoughts on self-hosting in general. Any solution that involves self-hosting will require a significant amount of ongoing work. You need to make sure your services keep working, and stay safe and secure. Particularly for FLOSS source code hosting, it’s of primary importance that the integrity and safety of the source code is maintained.

As I already said in the previous post, this style of hosting works well for projects that have a community, in which one or more dedicated people can look after the services. And in particular for bigger communities, such as KDE, GNOME, FreeDesktop, and so on, this is a very effective way to keep stewardship of code and community.

But for one-person projects, such as unpaper or glucometerutils, self-hosting would be quite bad. Even for xine, with a single person maintaining just the site and the Bugzilla, it got fairly bad. I’m trying to convince the remaining active maintainers to migrate this to VideoLAN, which is now probably the biggest Free Software multimedia project and community.

This is not a new problem. Indeed, before people rushed to GitHub (or Gitorious), they rushed to other services that provided similar integrated environments. When I became a FLOSS developer, the biggest of them was SourceForge — which, as I noted earlier, was recently bought by a company trying to rebuild its reputation after a significant loss of trust. These environments don’t only include SCM services, but also issue (bug) trackers, contact email, and so on and so forth.

Using one of these services is always a compromise: not only do they require an account on each service to be able to interact with them, but they also have a level of lock-in, simply because of the nature of URLs. Indeed, as I wrote last year, just going through my old blog posts to identify those referencing dead links reminded me of just how many project hosting services shut down, sometimes dragging on (Berlios) and sometimes abruptly (RubyForge).

This is a problem that does not involve only services provided by for-profit companies. Sunsite, RubyForge and Berlios didn’t really have companies behind them, and that last one is probably one of the closest things to a Free Software co-operative that I’ve seen outside of the FSF and friends.

There is of course Savannah, the FSF’s own Forge-lookalike system. Unfortunately, for one reason or another, it has always lagged behind the feature set (particularly around security) of other project management SaaS offerings. My personal guess is that this is due to the political nature of hosting any project on the FSF’s infrastructure, even outside of the GNU project.

So what we would need is a politically neutral, project-agnostic hosting platform run as a co-operative effort. Unfortunately, I don’t see that happening any time soon. The main problem is that project hosting is expensive, whether you use dedicated servers or cloud providers. And it takes full-time system administrators to keep it running smoothly and securely. You need professionals, too — or you may end up like lkml.org, down when its one maintainer goes on vacation and something happens.

While there are projects that receive enough donations to be able to sustain these costs (see KDE, GNOME, VideoLAN), I’m skeptical that an unfocused co-operative would be able to take care of this. Particularly if it does not restrict the creation of new projects and repositories, as that requires particular attention to abuse, and good guidelines about which content is welcome and which isn’t.

If you think that that’s an easy task, consider that even SourceForge, with its review process, which used to take a significant amount of time, managed to let joke projects use its service and trade on its credentials.

A few years ago, I would have said that the SFLC, the SFC and SPI would be the right actors to set up something like this. Nowadays? Given their infighting, I don’t expect them to be of any use.

Two words about my personal policy on GitHub

I was not planning on posting on the blog until next week, trying to stick to a weekly schedule, but today’s announcement of Microsoft acquiring GitHub is forcing my hand a bit.

So, Microsoft is acquiring GitHub, and a number of Open Source developers are losing their minds, in all possible ways. A significant proportion of the comments I have seen on my social media sound like doomsday prophecies, as if this spelled the end of GitHub, because Microsoft is going to ruin it all for them.

Myself, I think that if it spells the end of anything, it is the end of the one-stop shop for working on any project out there, not because of anything Microsoft did or is going to do, but because a number of developers are now leaving the platform in protest (in protest of what? One company buying another?)

Most likely, it’ll be the fundamentalists who move their projects away from GitHub. And depending on what they decide to do with their projects, it might not even show up on anybody’s radar. A lot of people are pushing for GitLab, which is both an open-core self-hosted platform and a PaaS offering.

That is not bad. Self-hosted GitLab instances already exist for VideoLAN and GNOME. Big, strong communities are, in my opinion, in the perfect position to dedicate people to supporting the core infrastructure that makes open source software development easier, in particular because it’s easier for a community of dozens, if not hundreds, of people to find dedicated people to work on it. For one-person projects, that’s overhead, distracting, and destructive as well, as fragmenting into micro-instances will make it painful to fork projects — and at the same time, allowing any user who just registered to fork code on any instance is prone to abuse and a recipe for disaster…

But this is all going to be a topic for another time. Let me try to go back to my personal opinions on the matter (to be perfectly clear, these are not the opinions of my employer, yadda yadda).

As of today, what we know is that Microsoft acquired GitHub, and they are putting Nat Friedman of Xamarin fame (the company that stood behind the Mono project after Novell) in charge of it. This choice makes me particularly optimistic about the future, because Nat’s a good guy and I have the utmost respect for him.

This means I have no intention of moving any of my public repositories away from GitHub, unless doing so would bring a substantial advantage. For instance, if there were a strong community built around medical device software, I would consider moving glucometerutils. But this is not the case right now.

And because I still root most of my projects at my own domain, if I did move them, the canonical URLs would still be valid. This is a scheme I devised after getting tired of fixing up all the places where unieject ended up.

Microsoft has not done anything wrong with GitHub yet. I will give them the benefit of the doubt, and not rush out of the door. It would, and will, be a different story if they were to change their policies.

Rob’s point is valid, and it would be a disgrace if various governments were to push Microsoft into a corner, requiring it to purge content that the smaller, independent GitHub would have left alone. But unless that happens, we’re debating hypotheticals on the same level as “If I were elected supreme leader of Italy”.

So, as of today, 2018-06-04, I have no intention of moving any of my repositories to other services. I’ll also reply with a link to this post, and no accompanying comment, to anyone who suggests I should do so without any benefit for my projects.