Don’t Ignore Windows 10 as a Development Platform for FLOSS

Important Preface: This blog post was originally written on 2020-05-12, and scheduled for later publication, inspired by this short Twitter thread. As such it well predates Microsoft’s announcement of expanding WSL2 support to graphical apps. I considered trashing the post, or seriously re-editing it in light of the announcement, but I honestly lack the energy to do that now. It leaves a bad taste in my mouth to know that it will likely get drowned out in the noise of the new WSL2 features announcement.

Given the topic of this post I guess I need to add a preface to point out my “FLOSS creds” — because I have already seen too many attacks on people who even use Windows at all. I have been an opensource developer for over fifteen years now, and part of the reason why I left my last bubble was that it made it difficult for me to contribute to various opensource projects. I say this because I’m clearly a supporter of Free Software and Open Source, wherever possible. I also think that different people have different needs, and that ignoring that is a failure of the FLOSS movement as a whole.

The “Year of Linux on the Desktop” is now a meme that has run its course to the point of being annoying. Despite what FLOSS advocates keep saying, “Linux on the Desktop” is not really moving, and while I do have some strong opinions on this, that’s for another day. Most users, and in particular newcomers to FLOSS (both as users and as developers), are probably using a more “user friendly” platform — if you leave a comment with the old joke about UNIX just being selective with its friends, you’ll end up in a plonkfile, be warned.

About ten years ago, it seemed like the trend was for FLOSS developers to use MacBooks as their daily laptops. I did that for a while myself — a UNIX-based platform with all the tools of the trade, which allowed quite a bit of work to be done without access to a Linux machine. SSH, Emacs, GCC, Ruby, and so on. And at the same time, you had the stability of Mac OS X, the battery life, and hardware that worked great out of the box. More recently, though, Apple’s move towards “walled gardens” has been taking away from that appeal.

But back to the main topic. Over the past many years, I’ve been using a “mixed setup” — using a Linux laptop (or more recently desktop) for development, and a Windows (7, then 10) desktop for playing games, editing photos, designing PCBs, and for logic analysis. The latter is because Saleae Logic takes a significant amount of RAM when analysing high-frequency signals, and I have been giving my gamestations as much RAM as I can just for Lightroom, so it makes sense to run it on the machine with 128GB of RAM.

But more recently I have been exploring the ability of using Windows 10 as a development platform. In part because my wife has been learning Python, and since also learning a new operating system and paradigm at the same time would have been a bloody mess, she’s doing so on Windows 10 using Visual Studio Code and Python 3 as distributed through the Microsoft Store. While helping her, I had exposure to Windows as a Python development platform, so I gave it a try when working on my hack to rename PDF files, which turned out to be quite okay for a relatively simple workflow. And the work on the Python extension keeps making it more and more interesting — I’m not afraid to say that Visual Studio Code is better integrated with Python than Emacs, and I’m a long-time user of Emacs!

In the last week I have stepped up further how much development I’m doing on Windows 10 itself. I have been using Hyper-V virtual machines for Ghidra, to make use of the bigger screen (although admittedly I’m just using RDP to connect to the VM, so it doesn’t really matter that much where it’s running), and in my last dive into the Libre 2 code I felt the need for a fast and responsive editor to go through and execute parts of the disassembled code to figure out what it’s trying to do — so once again, Visual Studio Code to the rescue.

Indeed, Windows 10 now comes with an SSH client, and Visual Studio Code integrates very well with it, which meant I could just edit the files saved in the virtual machine and have the IDE build them with GCC and execute them to get myself an answer.

Then while I was trying to use packetdiag to prepare some diagrams (for a future post on the Libre 2 again), I found myself wondering how to share files between computers (to use the bigger screen for drawing)… until I realised I could just install the Python module on Windows, and do all the work there. Except for needing sed to remove an incorrect field generated in the SVG. At which point I just opened my Debian shell running in WSL, and edited the files without having to share them with anything. Uh, score?

So I have been wondering: what’s really stopping me from giving up my Linux workstation most of the time? Well, there’s hardware access — glucometerutils wouldn’t really work on WSL unless Microsoft plans to integrate a significant number of compatibility interfaces. The same goes for using hardware SSH tokens — despite PC/SC being a Windows technology to begin with. Screen-style and tabbed shells are definitely easier to run on Linux right now, but I’ve seen tweets about a modern terminal being developed by Microsoft and even released as FLOSS!

Ironically, I think editing this blog is the most miserable experience for me on Windows. And not just because of the different keyboard (as I share the gamestation with my wife, the keyboard is physically a UK keyboard — even though I type US International), but also because I miss my compose key. You may have noticed already that this post is full of em-dashes and en-dashes. Yes, I have been told about WinCompose, but the last time I tried it, it didn’t work and even screwed up my keyboard altogether. I’m now trying it again, at least on one of my computers, and if it doesn’t explode in my face this time, I may roll it out more widely later.

And of course it’s probably still not as easy to set up a build environment for things like unpaper (although at that point, you can definitely run it in WSL!), or to have a development environment for actual Windows applications. But this is all a matter of a different set of compromises.

Honestly speaking, it’s very possible that I could survive with a Windows 10 laptop for my on-the-go opensource work, rather than the Linux one I’ve been using. With the added benefit of being able to play Settlers 3 without having to jump through all the hoops from the last time I tried. Which is why I decided that the pandemic lockdown is the perfect time to try this out, as I barely use my Linux laptop anyway, since I have a working Linux workstation all the time. I have indeed reinstalled my Dell XPS 9360 with Windows 10 Pro, and installed both a whole set of development tools (Visual Studio Code, Mu Editor, Git, …) and a bunch of “simple” games (Settlers, Caesar 3, Pharaoh, Age of Empires II HD); Discord ended up in the middle of both, since it’s actually what I use to interact with the Adafruit folks.

This doesn’t mean I’ll give up on Linux as an operating system — but I’m a strong supporter of “software biodiversity”, so the same way I try to keep my software working on FreeBSD, I don’t see why it shouldn’t work on Windows. And in particular, I have always found providing FLOSS software on Windows to be a great way to introduce new users to the concept of FLOSS — focusing more on providing FLOSS development tools gives people an even bigger chance to build more FLOSS tools.

So is everything ready and working fine? Far from it. There are a lot of rough edges that I have found myself, which is why I’m experimenting with developing more on Windows 10, to see what can be improved. For instance, I know that the reuse tool has some rough edges with the encoding of input arguments, since PowerShell appears to still not default to UTF-8. And I failed to use pre-commit for one of my projects — although I have not yet taken much note of what failed, to start fixing it.

Another rough edge is in documentation. Too much of it assumes a UNIX environment only, and a lot of it, if it covers Windows at all, assumes “old school” batch files are in use (for instance for Python virtualenv support), rather than the more modern PowerShell. This is not new — a lot of modern documentation is only valid on bash, and if you were to use an older operating system such as Solaris you would find yourself lost with the tcsh differences. You can probably see similar concerns going back to the days when bash was not standard, and maybe we’ll have to go back to that kind of deal. Or maybe we’ll end up with some “standardization” of documentation that can be translated between different shells. Who knows.

But to wrap this up, I want to give a heads-up to all my fellow FLOSS developers that Windows 10 shouldn’t be underestimated as a development platform. And that if they intend to be widely open to contributions, they should probably give a thought to how their code works on Windows. I know I’ll have to keep this in mind for my own future projects.

On Rake Collections and Software Engineering

autumn, earth's back scratcher

Matthew posted on Twitter a metaphor about rakes and software engineering – well, software development, but at this point I would argue anyone arguing over these distinctions has nothing better to do, for good or bad – and I ran with it a bit by pointing out that in my previous bubble, I should have used “Rake Collector” as my job title.

Let me give a bit more context on this one. My understanding of Matthew’s metaphor is that senior developers (or senior software engineers, or senior systems engineers, and so on) are simultaneously complaining that their coworkers are making mistakes (“stepping onto rakes”, also sometimes phrased as “stepping into traps”) and making their environment harder to navigate (“spreading more rakes”, also “setting up traps”).

This is not a new concept. Ex-colleague Tanya Reilly expressed a very similar idea with her “Traps and Cookies” talk.

I’m not going to repeat all of the examples of traps that Tanya has in her talk, which I thoroughly recommend watching for anybody working with computers — not only developers, system administrators, or engineers. Anyone working with a computer.

Probably not even just people working with computers — Adam Savage expresses yet another similar concept in his Every Tool’s a Hammer under Sweep Up Every Day:

[…] we bought a real tree for Christmas every year […]. My job was always to put the lights on. […] I’d open the box of decorations labeled LIGHTS from the previous year and be met with an impossible tangle of twisted, knotted cords and bulbs and plugs. […] You don’t want to take the hour it’ll require to separate everything, but you know it has to be done. […]

Then one year, […] I happened to have an empty mailing tube nearby and it gave me an idea. I grabbed the end of the lights at the top of the tree, held them to the tube, then I walked around the tree over and over, turning the tube and wrapping the lights around it like a yuletide barber’s pole, until the entire six-string light snake was coiled perfectly and ready to be put back in its appointed decorations box. Then, I forgot all about it.

A year later, with the arrival of another Christmas, I pulled out all the decorations as usual, and when I opened the box of lights, I was met with the greatest surprise a tired working parent could ever wish for around the holidays: ORGANIZATION. There was my mailing tube light solution from the previous year, wrapped up neat and ready to unspool.

Adam Savage, Every Tool’s a Hammer, page 279, Sweep up every day

This is pretty much the definition of Tanya’s cookie for the future. And I have a feeling that if Adam were made aware of Tanya’s trap concept, he would probably point at a bunch of tools with similar stories. Actually, I have a feeling I might have heard him say something about throwing out a tool that had some property opposite to everything else in the shop, making it dangerous. I might be wrong, so don’t quote me on that; I tried looking for a quote from him and failed to find anything. But it is definitely something I would do among my own tools.

So what about the rake collection? Well, one of the things that I’m most proud of in my seven years at that bubble is the work I’ve done trying to reduce complexity. This took many different forms, but the main one has been removing multiple optional arguments from interfaces of libraries that would be used across the whole (language-filtered) codebase. Since I can’t give very close details of what that’s about, you’ll find the example a bit contrived, but please bear with me.

When you write libraries that are used by many, many users, and you decide that you need a new feature (or that an old feature needs to be removed), you’re probably going to add a parameter to toggle the feature, and either expect the “modern” users to set it or, if you can, do a sweep over the current users to have them explicitly request the current behaviour, and then change the default.
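To make the contrived example a bit more concrete, here is a sketch of the shape I have in mind, written in C; all the names (send_report, use_secure_endpoint) are invented for illustration, and nothing here comes from the actual codebase I worked on.

#include <stdbool.h>
#include <stdio.h>

/* A deliberately contrived sketch: a widely used library call grows a
 * boolean toggle for the new behaviour, and the toggle then never goes
 * away. All names are invented. */
static int send_report(const char *payload, bool use_secure_endpoint)
{
    printf("sending \"%s\" over the %s endpoint\n", payload,
           use_secure_endpoint ? "new, secure" : "legacy");
    return 0;
}

int main(void)
{
    /* "Modern" callers are documented to pass true... */
    send_report("daily stats", true);

    /* ...while the old call sites were swept to spell out the legacy
     * behaviour explicitly instead of being migrated, and they tend to
     * stay like this for years. */
    send_report("daily stats", false);
    return 0;
}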

The problem with all of this is that cleaning up after these parameters is often seen as not worth it. You changed the default, why would you care about the legacy users? Or you documented that all the new users should set the parameter to True; that should be enough, no?

That is a rake. And one that is left very much in the middle of the office floor by senior managers all the time. I have seen this particular pattern play out dozens, possibly hundreds of times, and not just at my previous job. The fact that the option is there to begin with is already increasing complexity on the library itself – and sometimes that complexity gets to be very expensive for the already over-stretched maintainers – but it’s also going to make life hard for the maintainers of the consumers of the library.

“Why does the documentation says this needs to be True? In this code my team uses it’s set to False and it works fine.” “Oh this is an optional parameter, I guess I can ignore it, since it already has a default.” *Copy-pastes from a legacy tool that is using the old code-path and nobody wanted to fix.*

As a newcomer to an environment (not just a codebase), it’s easy to step on those rakes (sometimes uttering exactly the words above) and not know it until it’s too late: for instance, when a parameter controls whether you use a more secure interface over an old one that you don’t expect new users to pick. When you become more acquainted with the environment, the rakes become easier and easier to spot — and my impression is that for many newcomers, that “rake detection” is the kind of magic that puts them in awe of the senior folks.

But rake collection means going a bit further. If you can detect the rake, you can pick it up, and avoid it smashing into the face of the next person who doesn’t have that detection ability. This will likely slow you down, but an environment full of rakes slows down all the newcomers, while a mostly rake-free environment is much more pleasant to work in. Unfortunately, that’s not something that aligns with business requirements, or with the incentives provided by management.

A slight aside here. Also on Twitter, I have seen threads going by about the fact that game development tends to be a time-to-market challenge, which leaves all the hacks around because that’s all you care about. I can assure you that the same is true for some non-game development too, which is why “technical debt” feels like it’s rarely tackled (also on that note, Caskey Dickson has a good technical debt talk). This is the main reason why I’m talking about environments rather than codebases. My experience is with long-lived software, and with libraries that existed for twice as long as I worked at my former employer, so my main environment was codebases, but that is far from the end of it.

So how do you balance the rake-collection with the velocity of needing to get work done? I don’t have a really good answer — my balancing results have been different team by team, and they often have been related to my personal sense of achievement outside of the balancing act itself. But I can at least give an idea of what I do about this.

I described this to my former colleagues as a rule of thumb of “three times” — to keep with the rake analogy, we can call it “three notches”. The first time I found something that annoyed me (inconsistent documentation, required parameters that made no sense, legacy options that should never be used, and so on), I would try to remember it, rather than going out of my way to fix it. The second time, I might flag it somehow (e.g. by adding a more explicit deprecation notice, logging a warning if the legacy codepath is executed, etc.). And the third time I would just add it to my TODO list and start addressing the problem at the source, whether it was within my remit or not.
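Sticking with the invented send_report() sketch from earlier, the “second notch” can be as small as making the legacy code path announce itself:

#include <stdbool.h>
#include <stdio.h>

/* Still the same made-up example: the legacy path now complains loudly,
 * which is often enough of a nudge without breaking anyone. */
static int send_report(const char *payload, bool use_secure_endpoint)
{
    if (!use_secure_endpoint)
        fprintf(stderr, "send_report: the legacy endpoint is deprecated, "
                        "please pass use_secure_endpoint=true\n");

    printf("sending \"%s\"\n", payload);
    return 0;
}

int main(void)
{
    return send_report("daily stats", false); /* triggers the warning */
}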

This does not mean that it’s a universal solution. It worked for me, most of the time. Sometimes I got scolded for having spent too much time on something that had little to no bearing on my team, sometimes I got celebrated for unblocking people who had been fighting with legacy features for months if not years. I do think that it was always worth my time, though.

Unfortunately, rake-collection is rarely incentivised. The time spent cleaning up after the rakes left in the middle of the floor eats into one’s own project time, if it’s not the explicit goal of their role. And the fact that newcomers don’t step on those rakes and hurt themselves (or slow down, afraid of bumping into yet another rake) is rarely quantifiable in a way that managers can be made to agree to it.

What could he tell them? That twenty thousand people got bloody furious? That you could hear the arteries clanging shut all across the city? And that then they went back and took it out on their secretaries or traffic wardens or whatever, and they took it out on other people? In all kinds of vindictive little ways which, and here was the good bit, they thought up themselves. For the rest of the day. The pass-along effects were incalculable. Thousands and thousands of souls all got a faint patina of tarnish, and you hardly had to lift a finger.

But you couldn’t tell that to demons like Hastur and Ligur. Fourteenth-century minds, the lot of them. Spending years picking away at one soul. Admittedly it was craftsmanship, but you had to think differently these days. Not big, but wide. With five billion people in the world you couldn’t pick the buggers off one by one any more; you had to spread your effort. They’d never have thought up Welsh-language television, for example. Or value-added tax. Or Manchester.

Good Omens page 18.

Honestly, I often felt like Crowley: I rarely ever worked on huge, top-to-bottom cathedral projects. But I would be sweeping up a bunch of rakes, so that newcomers wouldn’t hit them, and so that all of my colleagues would be able to build stuff more quickly.

Have you seen some gold?

Since my TODO list includes working on two binutils problems (the warning on softer --as-needed and the fix for the PulseAudio build), I also started wondering why I haven’t heard, or rather read, anything about the gold linker.

Saying that I’m disappointed does not really cover much of it, to be honest, since I don’t really wish to switch to a linker written in C++ any time soon. But I really hoped that it would generate enough momentum to find a solution. Because, yes, the ld linker that ships with binutils is tremendously slow at linking C++ code, and as Linkers & Loaders has let me understand by now, the problem is not just the length of the (mangled) symbol names, but also the way that templates are expanded and linked together.

But still, I think it’s really worth investigating some alternative, which in my opinion need not be written in C++, with all the problems related to that. Saying that the gold linker is fast just because of the language it is written in is absolutely naïve, since the problems lie quite a bit deeper than that.

The main problem is that the current ld implementation is based, like the rest of the binutils tools, upon libbfd, an abstraction that supports multiple binary formats, not just ELF. It basically allows using mostly the same interface on different operating systems with different executable formats: ELF under Linux, BSD and Solaris, Mach-O under Mac OS X, PE under Windows, and more. While this makes for a much more powerful ld command, it’s also a bit of a bottleneck.

Even though the thing is designed well enough not to crumble easily, it is probably a good area to investigate to find out why it’s so slow. Having an alternative, ELF-only linker available for users, Gentoo users especially, would likely be a good test. This would mirror what Apple does on OS X (GCC calls Apple’s linker), as well as what Sun does under Solaris with their copy of GCC.

While I’m all for generic code, sometimes you need specialised tools if you want to access advanced features of files, or if you want fast, optimised software.

The same can be said for the analysis tools provided by binutils. As I’ve written in my post about elfutils, the nm, readelf and objdump tools provided by binutils, being generic, lack some of the useful defaults and the different interface that elfutils has; which goes to show how specialised tools could help here. I know that FreeBSD was working on providing replacements for these tools, under the BSD license as usual for them. While that’s certainly an important step, I don’t remember reading anything about a new linker.

As it is, I haven’t gone out of my way to see if there are already alternative linkers that work under Linux, besides the one provided by Sun’s compiler in Sun Studio Express (which has lots of problems of its own). If there is already one, we should look at how it stands feature-wise.

What we desire from a specialised linker, besides speed, is proper support for the .gnu.hash section, --as-needed-like features, no text relocations emitted in the code (a problem gold used to have, at least), and possibly better support for garbage collection of unused sections, which could allow using it in production code without the huge performance impact that currently seems to come with -fdata-sections and -ffunction-sections.
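To show what I mean by garbage collection of unused sections, here is a made-up example (the file and function names are invented). Built roughly like this:

gcc -ffunction-sections -fdata-sections -c gc-demo.c
gcc -Wl,--gc-sections gc-demo.o -o gc-demo

the linker is then free to drop the section holding the function that nothing references:

/* gc-demo.c: with -ffunction-sections, never_called() ends up in its own
 * section, and --gc-sections can discard that section because nothing
 * reachable from the entry point refers to it. */
int never_called(int x)
{
    return x * 2;
}

int main(void)
{
    return 0;
}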

I’m not going to work on this myself, but if somebody is interested in my opinion about using any particular linker in Gentoo, I’d be glad to look at it; I’m not going to spare words, though, just so you know.

GCC features and shortcomings

Like any other Free Software developer, I think I have a love/hate relationship with the GNU Compiler Collection or, as it’s most commonly called, GCC. While GCC is a very good modern compiler, with tons of features, warning heuristics and very good optimisations, it’s very slow, and it’s not exactly foolproof when it comes to warnings.

In particular, I have already written a couple of times that I very much dislike the way GCC cannot identify values that are set but never used, even though it should be trivial to do during SSA (and indeed those variables are not emitted in the final code); but it’s not just that.
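To be concrete, this is the kind of pattern I mean; a trivial, made-up case where a local variable is assigned and then never read again:

int frobnicate(int bar)
{
    int scratch = bar * 2; /* assigned here, never read afterwards */
    return bar + 1;        /* the value of scratch is silently dropped */
}

int main(void)
{
    return frobnicate(20) == 21 ? 0 : 1;
}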

Starting from version 4.3, GCC added new support for warnings, the -Werror= option; with earlier versions you could turn every warning into an error with -Werror, or you had a couple of special cases such as -Werror-implicit-function-declaration. With -Werror= you can mark a specific class of warnings as errors, without turning the rest into errors. This is very good, since some warnings like -Wreturn-type truly are errors rather than just warnings, and it’s indeed a good idea to stop the build if they are raised.

So -Werror= is good, and I started using it in more than a couple of my projects, to make sure that no code is introduced that could break something. On the other hand, -Werror= requires that you know the name of the -W flag that enables a given warning. That is easy to find out by using -fdiagnostics-show-option:

% gcc -x c -Wall -Wextra -fdiagnostics-show-option -c -o /dev/null - <<EOF
int foo() {
}
EOF
<stdin>: In function ‘foo’:
<stdin>:2: warning: control reaches end of non-void function [-Wreturn-type]

Cool, so now we know that to turn that warning into an error we have to pass -Werror=return-type. This should be enough to turn any warning into an error, you’d think, but it’s unfortunately not the case. Take for instance the following:

% gcc -x c -fdiagnostics-show-option -o /dev/null -c - <<EOF
int foo() {
  return "string";
}
EOF
<stdin>: In function ‘foo’:
<stdin>:2: warning: return makes integer from pointer without a cast

You can notice two particular things here: the first, and the obvious one, is that gcc is not reporting a warning option next to the warning itself, which means we cannot use -Werror= on it; the other is that I didn’t pass any -W flag to enable the warning in the first place. This is one of the warnings that are considered most important to gcc, so important that they are always enabled even when developers and users don’t go around asking for them, so important that the only way to disable them is to pass the -w flag, which disables all warnings, as there is no -Wno- flag for them. But for these very same reasons, it is not possible to turn them into errors!

As you might guess, this is a paradoxical situation: the most important, most useful warnings (the ones that most likely mean trouble) cannot be turned into errors, because there is no way to disable them. And yet, there seems to be no development on this front, at least judging from the bug I reported.

Sometimes I find GCC funny…

Unit testing frameworks

For a series of reasons, I’m interested in writing unit tests for a few projects, one of which will probably be my 1.3 branch of xine-lib (yes, I know 1.2 hasn’t been released even in beta form yet).

While unit tests can be written quite easily without much of a framework, having at least a basic framework would help to make automated testing possible. Since the projects I have to deal with use standard C, I started looking at some basic unit test frameworks.

Unfortunately, it seems like both CUnit and check last saw a release in 2006, and their respective repositories seem quite calm. In the case of CUnit I have also noticed a quite broken build system.
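For reference, a basic test written against check would look more or less like the sketch below (linked with -lcheck). This is from memory rather than checked against a specific release, so treat the details, in particular the assertion macro, as approximate:

#include <check.h>
#include <stdlib.h>

/* The function under test; a stand-in for real project code. */
static int add(int a, int b)
{
    return a + b;
}

START_TEST(test_add)
{
    ck_assert_int_eq(add(2, 2), 4);
}
END_TEST

int main(void)
{
    Suite *s = suite_create("demo");
    TCase *tc = tcase_create("core");
    tcase_add_test(tc, test_add);
    suite_add_tcase(s, tc);

    SRunner *sr = srunner_create(s);
    srunner_run_all(sr, CK_NORMAL);
    int failed = srunner_ntests_failed(sr);
    srunner_free(sr);
    return failed == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}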

Glib seems to have some basic support for unit tests, but even they don’t use it, so I doubt it’d be a nice choice. There are quite a few unit testing frameworks for particular environments, like Qt or Gnome, but I haven’t found anything generic.

It seems funky that even though people always seem to cheer for test-driven development, there isn’t a good enough framework for C. Ruby, Java, Perl and Python already have their well-established frameworks, and most software written in them uses those, but there is neither a standard nor a widely accepted framework for C.

I could probably write my own framework, but that’s not really an option: I don’t have that much free time on my hands. I suppose the least effort would be to contribute to one of the frameworks already out there, so that I can fix whatever I need fixed and have it working as I need. Unfortunately, I’d have to start looking at all of them and find the least problematic before doing that, and it is not useful if the original authors have gone MIA or similar, especially since at least CUnit is still developed using CVS.

If somebody has suggestions on how to proceed, they are most welcome. I could probably fork one of them if I have to, although I dislike the idea. From what I gathered quite briefly, the XML output of results in CUnit might be useful to gather test statistics for an automated testing facility.

Porting to Glib

In the past few days I’ve been working to port part of lscube to Glib. The idea is that instead of inventing lists, ways to iterate over them, ways to find data and so on and so forth, using the Glib versions should be much easier and also faster, as they tend to be optimised.

Interestingly enough, glib does seem to provide almost every feature out there, although I still think the logging facilities lack something.

It is very interesting to finally be able to ditch custom-made data structures in favour of things that have already been tested before. Unfortunately, rewriting big parts of the code is, as usual, vulnerable to human mistakes. But it’s fun.

I am starting to wonder how much duplication of effort we have in a system; I’m pretty sure there is at least some. xine-lib, for instance, does not use Glib, and it re-implements some structures and abstractions that could very well be taken from Glib. I’m tempted, once I feel better and can pay more attention to xine-lib again, to start a new branch and see how it would work to move xine-lib to Glib. Considering that even Qt4 now brings it in, I guess most of the frontends would already be using Glib, so it makes sense.

Actually, some of the plugins of xine itself make use of Glib, so it would probably be a good idea to at least try to reduce the duplication by using Glib there too. Things like Glib’s GString would make reply generation for HTTP, RTP/RTSP and MMS quite a bit easier than the current methods, and probably much safer. I would even expect it to be faster, but I won’t swear on that just yet.
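To give an idea of what I mean, here is a rough sketch of how a reply could be assembled with GString; the status line and header values are made up for the example, and this is not actual lscube or xine code:

#include <glib.h>
#include <stdio.h>

/* Build an RTSP-style reply by appending to a growable string instead of
 * juggling fixed-size buffers and snprintf() calls. */
static gchar *build_reply(guint cseq, const gchar *session)
{
    GString *reply = g_string_new("RTSP/1.0 200 OK\r\n");

    g_string_append_printf(reply, "CSeq: %u\r\n", cseq);
    if (session != NULL)
        g_string_append_printf(reply, "Session: %s\r\n", session);
    g_string_append(reply, "\r\n");

    /* FALSE keeps the character data and hands ownership to the caller. */
    return g_string_free(reply, FALSE);
}

int main(void)
{
    gchar *reply = build_reply(3, "12345678");
    fputs(reply, stdout);
    g_free(reply);
    return 0;
}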

So this is one more TODO for xine-lib, at this point I guess 1.3 series; it would also be nice to start using Ragel for demuxers and protocol parsers.

I guess I should be playing by now rather than writing blogs about technical stuff or rewriting code to use Glib or ragel, sigh.

On patching and security issues

Jeff, I think your concerns are pretty much real. The problem here, though, is not whether Debian users should be told not to file bugs upstream; the problem is that Debian should not go out of their way to patch stuff around.

Of course this is not entirely Debian’s fault; there are a few projects for which dealing with upstream is a tremendous waste of time of cosmic proportions, as they ignore distributors, think that their needs are totally bogus, and so on. Not all projects are like that, of course. Projects like Amarok are quite friendly with downstream (to the point that all the patches in Gentoo, at least those added by me, were committed at the same time to their SVN), and most of the projects that you find not suiting any distribution most likely just don’t know what distributors need.

I did write about this in the past, and you can find my ideas in the “Distribution-friendly Projects” article published on LWN (part 1, part 2 and part 3). I suggest reading it to anybody who runs an upstream project and would like to know what distributors need.

But the problem here is that Debian is known for patching the blood out of a project to adapt it to their needs. Sometimes this is good, as they turn a totally distribution-unfriendly package into a decent one; sometimes it’s very bad.

You can find a few good uses of Debian’s patches in Portage; it’s not uncommon for them to be picked up there. On the other hand, I can think of at least two failures that, at least for me, show how easily Debian can fail:

  • a not-so-commonly known failure in autotoolising metamail, a dependency of hylafax that I tried to run on FreeBSD before: they did use autoconf and automake, but in such a way that the result only works under Linux, proving they don’t know autotools that well;
  • the EPIC FAIL of the OpenSSL security bug, where people wanted to fix a problem reported by Valgrind without knowing Valgrind (if you have ever looked at the Valgrind docs, there is a good reference on suppression files, which beat patching code you don’t understand).

Now this of course proves nothing; even in Gentoo there have been good patches and bad patches. I have yet to see an EPIC FAIL like the OpenSSL debacle, but you never know.

The problem lies in the fact that Debian also seems to keep a “holier than thou” attitude toward any kind of collaboration, as you can easily notice in Marco d’Itri’s statements regarding udev rules (see this LWN article). I know a few Debian developers who are really nice guys whom I love to work with (like Reinhard Tartler, who packages xine, and Russell Coker, whose blog I love to follow for both the technical posts and the “green” posts; but not limited to them), but for other Debian developers to behave like d’Itri is far from unheard of, and actually not uncommon either.

I’m afraid that the good in Debian is being contaminated by people like these, and by the attitude of trusting no one but themselves in every issue. And I’m sorry to see that because Debian was my distribution of choice when I started using Linux seriously.

What did Enterprise do?

Now that Enterprise has died (or at least is pretty much sick), I am aiming at a high-end system. I can understand it is difficult to accept that I don’t just get the cheapest box I can find at the local store and be done with it.

Why is this? Well, the first problem is that in Italy, prices are something very strange. It’s not unexpected for me to find components at half the price, or less, when looking them up in other European shops. In particular, in the local shops a good enough PSU rated 450W, like the one I had before, would cost me €140. Consider that I paid €100 for mine two years ago. I could get one from Germany, shipping included, for less than I would pay in Italy; but I’m not sure the PSU itself is the culprit, and I don’t count on it. Why? Because there is a burning plastic smell when Enterprise is on, and it does not come from the PSU.

So rather than getting a new PSU, waiting to see if it’s the motherboard, or the CPU, or the memory, and then getting one of those at a time, paying for shipping multiple times, I’m keen on replacing the box entirely. I was actually already planning the upgrade; the problem here is the timing: if it wasn’t happening while my health is in this state, I would have had enough money available to just replace Enterprise straight away.

But why am I spending €1300 on a system rather than spending, say, $600 to get the cheapest Intel quad-core available? First of all, I don’t think I can get much for such a price: US prices are quite a bit lower, even considering taxes, than the prices in Italy. I checked out Newegg before, and the prices were almost half those of European shops, which means a quarter of the prices of Italian suppliers. Unfortunately they don’t ship overseas. Of course I could just get it sent to me through some loophole, but again: getting it through customs would cost me between 40 and 50% of the nominal price, shipping costs included, and it’d be impossible to get warranty service out of it. And not having a warranty is not very good for most consumer-grade hardware.

On the other hand, a cheap Intel quad-core with a decent amount of memory could work well as a workstation; the problem is that Enterprise has never been your usual workstation.

Enterprise not only worked as my workstation, and used to be my media center, but most of all it is a development box. I’m not just rebuilding the projects I work on; I’m rebuilding the whole of Portage, many times over. When I first updated to GCC 4.3, the first thing I did was rebuild world; when glibc 2.8 was released, I rebuilt world; when a new autoconf or automake version is released, I rebuild world. Why? Because I can usually fix, or at least give a good indication of how to fix, the problems that turn up.

The faster these rebuilds are, the faster I can fix the problems, and the faster the fixes enter Portage, usually. But it’s not just that.

For instance, Enterprise had massively more aggressive --as-needed support: I forced it through the GCC specs. The result is that it stressed linking, while working around libtool brokenness and similar issues.

But this could warrant just a multi-core system; why go high end? Well, together with the standard system in /, Enterprise had a series of chroots: one handles the updates for the vserver where my blog lives (but also xine’s Bugzilla, which is something useful for F/OSS, not just for me), others handle corner-case tests. Those are the ones building, for instance, a system with OpenPAM instead of Linux-PAM, to see which parts of Portage can work with it, or testing cowstats with PIE enabled, to find programs that rely on the fact that they don’t need data relocations outside shared libraries.

It’s kinda like a tinderbox but it isn’t a tinderbox. It was a system that was almost never idling.

And one thing I haven’t done, or improved, in a few months, to be honest, is working on the linking collision detection. The reason why I stopped is that even using PostgreSQL it takes a long time. And it wasn’t specifically testing for possibly-embedded libraries yet.

While I do like devoting my time to Free Software development, a faster box means I can make better use of my time, which, considering my health problems, is probably a good thing (doing the same stuff in less time means I have more time to spend on other things, like going in and out of hospitals, or relaxing if I don’t feel good enough). Maybe I’m selfish, but I’d rather spend money on a fast system with users’ help than spend little money on a cheap system and be forced to work less on Free Software so that I can handle hospital and relaxation time.

So, thanks to all the users helping me with this, I’m doing my best to secure the money to order the box ASAP, so I can let it resume its tasks while I’m in the hospital too. And as soon as my health stops its downslide, I’ll be working on Free Software again.

Distributions and interactions

If you have read the article I wrote about distribution-friendly projects at LWN (part 1 and part 2; part 3 is still to be published at the time I write this), you’ll know I tried to list some of the problems that distributions face when working with upstream projects.

One interesting thing that I did overlook, and that I think would be worth an article of its own, or even a book, given the time, is how to handle interaction between projects. All the power users expect that developers don’t try to reinvent the wheel every time, or that if they do, they do it for a bloody good reason. Unfortunately this is far from true; but beside that fact, when code is shared between projects it’s quite common for mistakes to end up propagating over quite a huge area.

A very good example of this is the current state of Gentoo’s ~arch tree. At the moment there are quite a few things that might lead to build failures because changes in different projects are interleaved:

  • GCC 4.3 changes will cause a lot of build failures in C++ software because the header dependencies were cleaned up; similar changes happen with every GCC minor release;
  • autoconf 2.62 started catching bad errors in configure scripts, especially related to variable names; similar changes happen with every autoconf release;
  • libtool 2.2 stopped running the C++ and Fortran compiler checks by default, so packages have to take care of declaring what they actually use;
  • curl 7.18 is now stricter in the way options are set through its easy interface, requiring boolean options to be explicitly provided (see the sketch after this list);
  • Qt 4.4 is being split into multiple packages, and dependencies thus have to be fixed.
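As a reminder of what the curl item above is about: boolean options in the easy interface are set through curl_easy_setopt() and are expected as long values. A minimal, made-up example of the well-behaved form:

#include <curl/curl.h>

int main(void)
{
    CURL *curl = curl_easy_init();
    if (curl == NULL)
        return 1;

    curl_easy_setopt(curl, CURLOPT_URL, "http://example.org/");
    /* Boolean options are passed as explicit long values. */
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

    CURLcode res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    return res == CURLE_OK ? 0 : 1;
}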

There are probably more problems than these, but these are probably the main ones. Unfortunately, the solution adopted by a few projects for similar problems is not to start a better symbiotic relationship between the various projects, but to require the user to use a given version of their dependencies… which might be different from the version that the user is told to use for another project… or, even worse, to import a copy of the library they use and build against that.

Interestingly enough, expat seems to be a really good example of a library imported all over, and less understandably so than zlib (which is quite small on its own, although it has its share of security issues built in). I’ve found a few copies before, and some of them are now fixed in the tree, but in the last two days I found at least four more. Two are in Python itself, returned from the dead (yeah, two of them: one in celementtree and one in pyexpat), one is – probably, not sure though – in Firefox, and thus in other Mozilla products, I suppose, and the last one is in xmlrpc-c, which has one of the worst build systems I have ever seen, which makes it quite hard to fix the issue entirely.

Maybe one day we’ll have just one copy of expat on any given system, shared by everybody… maybe.

Working under Windows, my personal hell

If I am to go to hell, I know already what it will look like: no Linux, no Mac OS X or any other Unix. Just Windows and (maybe) OS/2. And I’m still a programmer. And a system administrator at the same time.

It so happens that my current job requires me to work under Windows, to develop software, well, for Windows. For a series of reasons that I don’t want to start explaining here, I decided to go with Borland, sorry, CodeGear C++ Builder as the IDE, rather than Microsoft’s offering or Qt. The main problem is that the software has to be redistributed as proprietary, and cannot rely on stuff like the .NET framework (otherwise I could have easily completed it already using Visual C# Express).

I have to admit I find myself way more comfortable with Borland, sorry, CodeGear than with Microsoft’s sorta-C++ environment, mostly because I learnt real programming with BCB 3 (I had a “personal” license that was given away for free with an old magazine years ago). I don’t much like the direction that CodeGear has taken, but at least I can work with it without going crazy, which is decent enough.

What is the problem? Well, I hadn’t had a Windows installation for about five years, and my last Windows license was for Windows 95; I had to buy a Windows XP license (it still costs €400 even though it was released more than five years ago by now), and a license of CodeGear C++ Builder (the electronic copy costs €100 less, but it still comes to almost one grand). Then I had to get used to working with VCL again.

Not a big deal, mind you, but it reminds me why I so much like Qt, GCC and Emacs. Sure I could use these three on Windows, but not for what I need to do :/

On the other hand, I was able to use a piece of free software to save some of my time: rather than using the XML Writer interface exported by MS XML services, I built libxml2 (which, strangely enough, supports the Borland compiler natively) and used that; it features a very similar interface, but a way nicer one. The XPath interface is a bit messy (I was unable to find a way to execute recursive XPaths, that is, after finding a node through XPath I couldn’t find how to run a second XPath query on that node, so I had to complete the task with sequential access; if anybody knows how to do that, I’d be glad to know). I sincerely think libxml2 could use some better API documentation; if I have more time I’ll gladly see to writing it.
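For what it’s worth, one approach that should work, though I haven’t verified it in this project, is to re-anchor the XPath context on the node found by the first query and then evaluate a relative expression against it. A rough sketch in plain C (file name and expressions are made up):

#include <libxml/parser.h>
#include <libxml/xpath.h>

int main(void)
{
    xmlDocPtr doc = xmlReadFile("config.xml", NULL, 0);
    if (doc == NULL)
        return 1;

    xmlXPathContextPtr ctxt = xmlXPathNewContext(doc);
    xmlXPathObjectPtr outer = xmlXPathEvalExpression(
        (const xmlChar *)"//device", ctxt);

    if (outer != NULL && outer->nodesetval != NULL &&
        outer->nodesetval->nodeNr > 0) {
        /* Point the context's current node at the first match, then
         * evaluate a relative expression against it. */
        ctxt->node = outer->nodesetval->nodeTab[0];
        xmlXPathObjectPtr inner = xmlXPathEvalExpression(
            (const xmlChar *)"./setting", ctxt);
        /* ... use inner->nodesetval here ... */
        xmlXPathFreeObject(inner);
    }

    xmlXPathFreeObject(outer);
    xmlXPathFreeContext(ctxt);
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}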

But it doesn’t even end here. I decided that running the virtual machine off a virtual disk on the laptop was too slow, so I decided to use Boot Camp to install on the real disk and use that through Parallels. Reinstalling everything is a pain, especially when Windows seems to require ten runs of Windows Update to get the updates right. And users complain about having to use --resume --skipfirst with Gentoo from time to time ;)

Right now I am still storing my work data on a virtual hard drive, as I couldn’t spare enough space on the real disk for Windows, and of course Windows does not support the GPT partition scheme I use on the external FireWire drive. It’s frustrating that I can share that disk just fine between Linux and OS X, but I’d need another hard drive to share data with Windows. I suppose I should write that off for the future.

Using Parallels’ shared folder feature, by the way, seems to be quite impossible with development environments: .NET-based stuff won’t run applications with full privileges because they are seen as coming from the network; CodeGear RAD Studio tries to validate the hostname (.PSF) and, as it is invalid, fails to open any file that resides on it (unless you map it to a network drive); and the Borland incremental linker (ilink32) fails because Parallels uses case-sensitive lookup for files, while ilink32 looks for all-caps filenames (MainUnit.cpp becomes MainUnit.obj, but the linker looks for MAINUNIT.OBJ).

I should probably put the Subversion repository for my work on Enterprise, but I don’t want to access it through SSH, as that would mean adding a private key able to access Enterprise to Windows…

I sincerely hope my next jobs will stay under Linux for a while, after these two are done :)