Newcomers, Teachers, and Professionals

You may remember I already had a go at tutorials, after listening in on one that my wife had been going through. Well, she’s now learning about C after hearing me moan about higher and lower level languages, and she did that by starting with Harvard’s CS50 class, which is free to “attend” on edX. I am famously not a big fan of academia, but I didn’t think it would make my blood boil as much as it did.

I know that it’s easy to rant and moan about something that I’m not doing myself. After all you could say “Well, they are teaching at Harvard, you are just ranting on a C-list blog that is followed by fewer than a hundred people!” and you would be right. But at the same time, I have over a decade of experience in the industry, and my rants are explicitly contrasting what they say in the course to what “we” do, whether it is in open source projects, or in a bubble.

I think the first time I found myself boiling and went onto my soapbox was when the teacher said that the right “design” (they keep calling it design, although I would argue it’s style) for a single-source file program is to have includes, followed by the declaration of all the functions, followed by main(), followed by the definition of all the functions. Which is not something I’ve ever seen happening in my experience, because it doesn’t really make much sense: duplicating declarations/definitions in C is an unfortunate chore due to headers, but why force even more of that in the same source file?

Indeed, one of my “pre-canned comments” in reviews at my previous employer was a long-form of “Define your convenience functions before calling them. I don’t want to have to jump around to see what your doodle_do() function does.” Now it is true that in 2020 we have the technology (VSCode’s “show definition” curtain is one of the most magical tools I can think of), but if you’re anything like me, you may even sometimes print out the source code to read it, and having it flow in natural order helps.

But that was just the beginning. Some time later, as I dropped by to see how things were going, I saw a strange string type throughout the code. It turns out that they have a special header, which they (later) describe as “training wheels”, that includes typedef char *string;. That is possibly understandable, given that it takes some time to get to arrays, pointers, and from there to character arrays, but… could it have been called something other than string, given the all-too-similarly named std::string of C++?

Then I made the mistake of listening in on more of that lesson, and that just had me blow a fuse. The lesson takes a detour to try to explain ASCII — the fact that characters are just numbers that are looked up in a table, and that the table is typically 8-bit, with no mention of Unicode. Yes I understand Unicode is complicated and UTF-8 and other variable-length encodings will definitely give a headache to a newcomer who has not seen programming languages before. But it’s also 2020 and it might be a good idea to at least put out the idea that there’s such a thing as variable-length encoded text and that no, 8-bit characters are not enough to represent people’s names! The fact that my own name has a special character might have something to do with this, of course.

It got worse. The teacher decided to show some upper-case/lower-case trickery on strings to show how that works, and explained how you add or subtract 32 to go from one case to the other. Which is limited not only by character set, but most importantly by locale. Oops, I guess the teacher never heard of the Turkish Four Is, or maybe there’s some lack of cultural diversity in the writing room for these courses. I went on a rant on Twitter over this, but let me reiterate this here as it’s important: there’s no reason why a newcomer to any programming language should know about adding/subtracting 32 to 7-bit ASCII characters to change their case, because it is not something you want to do outside of very tiny corner cases. It’s not safe in some languages. It’s not safe with characters outside the 7-bit safe Latin alphabet. It is rarely the correct thing to do. The standard library of pretty much any programming language has locale-aware functions to uppercase or lowercase a string, and that’s what you need to know!
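To make the contrast concrete, here is a minimal sketch in C. The function names naive_upper() and upper_in_place() are made up for illustration; toupper() is the standard <ctype.h> function. Note that toupper() works on single bytes in the current locale, so this is still not a Unicode-aware solution (for wide characters there is towupper(), and for real multilingual text you would most likely want a dedicated library); it only shows that the library call already does the right thing where the arithmetic trick silently corrupts data.

```c
#include <ctype.h>
#include <stddef.h>

/* The classroom trick: only correct when c is one of the ASCII
 * letters 'a'..'z'. Anything else is silently mangled. */
static char naive_upper(char c)
{
    return (char)(c - 32);
}

/* Upper-case a NUL-terminated string in place using the C library.
 * toupper() consults the current locale and returns characters that
 * have no upper-case form unchanged. The cast to unsigned char is
 * needed because passing a negative char value is undefined. */
static void upper_in_place(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++)
        s[i] = (char)toupper((unsigned char)s[i]);
}
```

Calling naive_upper('1') gives back a control character rather than '1', while upper_in_place() leaves digits, spaces and punctuation alone.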

Today (at the time of writing) she got to allocations, and I literally heard the teacher going for malloc(sizeof(int)*10). Even if the idea is to start with a bad example and improve from that, why on Earth they even bother teaching malloc() first, instead of calloc(), is beyond my understanding. But what do I know, it’s not like I spent a whole lot of time fixing these mistakes in real software twelve years ago. I will avoid complaining too much about the teacher suggesting that the behaviour of malloc() was decided by the clang authors.

Since there might be newcomers reading this and being a bit lost as to why I’m complaining about this, here is the short version. calloc() is a (mostly) safer alternative to allocate an array of elements, as it takes two parameters: the number of elements that you want to allocate and the size of a single element. Using this interface means the multiplication happens inside the library, which can check it for overflow and fail the allocation instead of silently handing back a buffer that is too small, which reduces security risks. In addition, it zeroes out the memory, rather than leaving it uninitialized. While this means there is a performance cost, if you’re a newcomer to the language and just about learning it, you should err on the side of caution and use calloc() rather than malloc().
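A minimal sketch of the difference, with helper names (alloc_ints_unchecked() and alloc_ints()) that are purely illustrative. It assumes a C library whose calloc() rejects an overflowing element count, which is what the C standard requires and what current glibc does.

```c
#include <stdint.h>
#include <stdlib.h>

/* What the lecture showed: the caller multiplies, so a huge count
 * wraps around before malloc() ever sees it, and malloc() happily
 * returns a buffer far smaller than the caller believes. */
static int *alloc_ints_unchecked(size_t count)
{
    return malloc(sizeof(int) * count);
}

/* calloc() receives the element count and the element size
 * separately, so it can notice that their product does not fit in a
 * size_t and return NULL. Memory it does return is zero-filled. */
static int *alloc_ints(size_t count)
{
    return calloc(count, sizeof(int));
}
```

With a count such as SIZE_MAX / sizeof(int) + 2, the unchecked version ends up asking for just a few bytes, while the calloc() version fails cleanly with NULL.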

Next up there’s my facepalm on the explanation of memory layout — be prepared, because this is the same teacher who in a previous lesson said that the integer variable’s address might vary but for his explanation can be asserted to be 0x123, completely ignoring the whole concept of alignment. To explain “by value” function calls, they decide to digress again, this time explaining heap and stack, and they describe a linear memory layout, where the code of the program is followed by the globals and then the heap, with the stack at the bottom growing up. Which might have been true in the ’80s, but hasn’t been true in a long while.

Memory layout is not simple. If you want to explain a realistic memory layout you would have to cover the differences between physical and virtual memory, memory pages and page tables, hugepages, page permissions, W^X, Copy-on-Write, ASLR, … So I get it that the teacher might want to skip over a number of these details and give a simplified view of how to understand the memory layout. But as a professional in the industry for so long I would appreciate it if they were upfront with the “By the way, this is an oversimplification, reality is very different.” Oh, and by the way, stack grows down on x86/x86-64.

This brings me to another interesting… mess in my opinion. The course comes with some very solid tools: a sandbox environment already primed for the course, an instance of AWS Cloud9 IDE with the libraries already installed, a fairly recent version of clang… but then decides to stick to this dubious old style of C, with strcpy() and strcmp() and no reference to more modern, safer options (never mind that glibc still refuses to implement C11 Annex K safe string functions).

But then they decide to briefly show the newcomers how to use Valgrind, of all things. Not only that: they even show them how to use a custom post-processor for Valgrind’s report output, because it’s otherwise hard to read. This for a course using clang, which can rely on tools such as ASAN and MSAN to report the same information in a more concise way.

I find this contrast particularly gruesome — the teacher appears to think that memory leaks are an important defect to avoid in software, so much so that they decide to give a power tool such as Valgrind to a class of newcomers… but they don’t find Unicode and correctness in names representation (because of course they talk about names) to be as important. I find these priorities totally inappropriate in 2020.

Don’t get me wrong: I understand that writing a good programming course is hard, and that professors and teachers have a hard job in front of them when it comes to explaining complex concepts to a number of people who are more eager to “make” something than to learn how it works. But I do wonder if sitting a dozen professionals through these lessons wouldn’t make for a better course overall.

«He who can, does; he who cannot teaches» is a phrase attributed to George Bernard Shaw. I don’t really agree with it as it is, because I met awesome professors and teachers. I already mentioned my Systems’ teacher, who I’m told retired just a couple of months ago. But in this case I can tell you that I wouldn’t want to have to review the code (or documentation) written by that particular teacher, as I’d have a hard time keeping to constructive comments after so many facepalms.

It’s a disservice to newcomers that this is what they are taught. And it’s the professionals like me that are causing this by (clearly) not pushing back enough on Academia to be more practical, or building better courseware for teachers to rely on. But again, I rant on a C-list blog, not teach at Harvard.

Boosting my morale? I wish!

Let’s take a deep breath. You probably remember I’m running a tinderbox which is testing some common base system packages before they are unmasked (and thus unleashed on users); in particular I use it for testing new releases of GCC (4.7) and GLIBC (2.16).

It didn’t take me long after starting GLIBC 2.16 testing to find out that the previously-latest version of Boost (1.49) was not going to work with it. The problem is that there is a new definition that both of them tried to provide, TIME_UTC (probably related to C++11/C11). Now unfortunately since it’s an API breakage to replace that definition, it means that it can’t be applied to the older versions, and it means that packages need to be fixed. Furthermore, the new 1.50 version has also broken the temporary compatibility introduced in 1.48 for their filesystem module’s API. This boils down to a world of pain for maintainers of packages using Boost (which includes yours truly; luckily none of them is maintained by me directly, just by proxy).

So I had to add one extra package to the list, and ran the reverse dependencies — the positive side is that it didn’t take long to fill the bug although there are still a few issues with older boost versions not being supported yet. This brought up a few issues though…

The first problem is that the way Boost builds itself, and its tests, is obnoxious: it’s totally serial, no parallelisation at all! The result is that it takes over eight hours to run the whole testsuite on Excelsior! The big issue is that, for the testing, it takes some 10-20 times longer to build a test than to run it (lovely language, C++), so a parallel build of the tests, even if the tests were executed in series, would have a huge impact, and would also likely mean that the tests would become meaningful. As things stand, the (so-called) maintainer of the package has admitted to not running them when he revbumps, but only on full new versions.

The second problem is how Boost versions are discovered. The main issue is that Boost, instead of using proper sonames to keep binary compatibility, embeds its major/minor version pair in the library name, although most distributions symlink the preferred version to the unversioned name (in Gentoo this is handled through the eselect boost tool). This is not extremely far from what most distributions do with Berkeley DB, but it causes problems when you have to find which one you should link to, especially when you consider that sometimes the unversioned name is not there at all.

So both CMake and Autotools (actually Autoconf Archive) provide macros to try a few different libraries. The former does it almost properly, starting from the highest version and then going in descending order, but uses a pre-defined list of versions to try! Which means that most packages with CMake will try 1.49 first, as they don’t know that 1.50 is out yet! If no known version is found, then it will fall back to the unversioned library, which makes it behave quite differently depending on whether you have only one or more than one version installed!

As for the macros from Autoconf Archive, they are quite nasty: for one thing, they really aren’t portable at all, as they use GNU sed syntax; they use uname (which makes them totally useless during cross-compilation); but most worrisome of all is that they use ls to find which boost libraries are available and then take the first one that is usable. This means that if you have 1.50, 1.49 and 1.44 installed, it’ll use the oldest! Similarly to CMake, it uses the unversioned library last. In this case, though, I was able to improve the macros by reversing the check order, which makes them work correctly for most distributions out there.

What is even funnier about the AX macros (that were created for libtorrent, and are used by Gource, which I proxy-maintain for Enrico), is that due to the way they are implemented, it is feasible that they end up using the old libraries and the new headers (it was the case for me here with 1.49 and 1.50, as it didn’t fail to compile, just to link). As long as the interfaces used have different names and the linker errors out, all is fine. But if you have interfaces that are source-compatible, linker-compatible, but with different vtables, you have a crash waiting to happen.

Oh well…

Debunking x32 myths

There have been many comments on my previous post about the new x32 ABI; some are interesting, others are more “out there”. The feeling I get is that there is quite a bit of cargo culting, with people thinking “there has to be a reason why it is developed, so it’ll be good for me!” without actually having the technical background to judge the usefulness of all this.

So in the same spirit with which I commented on ccache almost exactly four years ago (wow, I have been keeping a blog for a very long time, haven’t I?), I’ll try to debunk a few of the myths and misconceptions around this new ABI.

The new x32 ABI has proven to be faster. Not really; what we have right now are a few benchmarks, published by those who actually created the ABI. Of course you’d expect that those who spent time to set it up found it interesting and actually faster, but I honestly have doubts about the results, for reasons that will be clearer by reading the next few entries.

It’s also interesting to note that while the overall benchmarks seem to be positive, the numbers are quite close in general… and even Intel’s presentation gives you actual “big” numbers only when comparing with the original x86 ABI, which nobody is saying is better than x86-64!

The data is also coming from a synthetic test, not from actual overall system usage, and if you have any clue about benchmarks you know that the numbers can easily lie through their teeth!

The new ABI generates smaller code, which means more instructions will fit in cache, and you’ll have smaller files as well. This is absolutely false. The code generated is generally the same as x86-64: you’re not changing the instruction set at all, you’re just changing the so-called “data model”, which means you change the size of long (and related types) and of the pointers (and thus the address space).

On one hand it is theoretically correct that you’re going to have smaller data structures, which means you can make better use of the data cache (not of the instruction cache, be sure!), but is this the correct approach? In my informed opinion, it would be a better idea to look into actually writing code that considers the cachelines, if your code is cache-hungry! You can use dev-util/dwarves, which is a set of utilities by Arnaldo (acme): pahole will tell you how your data structures will be split in memory.
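As a rough illustration of what “considering the layout” can mean without touching the ABI at all, here is a sketch with two made-up structures holding the same fields in a different order. The struct and field names are hypothetical; the holes described in the comments are the kind of thing pahole would point out.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields in the order they came to mind: each one-byte field is
 * followed by padding so that the next 64-bit field stays aligned. */
struct padded {
    uint8_t  flag;
    uint64_t counter;
    uint8_t  kind;
    uint64_t total;
};

/* Same fields, largest first: the two one-byte fields now share a
 * single aligned slot at the end of the structure. */
struct reordered {
    uint64_t counter;
    uint64_t total;
    uint8_t  flag;
    uint8_t  kind;
};
```

On the usual x86-64 System V ABI the first structure takes 32 bytes and the second 24, a saving obtained just by reordering, with no new ABI involved. Exact numbers depend on the platform’s alignment rules.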

Also remember that for compatibility the syscalls are kept the same with x86-64, which means that all the kernel code executed, and all the data structures that are shared with the kernel are the same as x86-64 (which means that a number of data structures won’t even change their size with the new ABI).

Actually, referring again to the same slides you can see on slide 24 that the x32 code can be longer than x86’s original code. It would have been nice if they included the same code in x86-64, especially since I don’t speak VCISC, but I think it’s just the same code.

It might be of interest to compare the size of the file itself; this is the output of rbelf-size from my Ruby Elf suite:

        exec         data       rodata        relro          bss     overhead    allocated   filename
     1239436         7456       341974        13056        17784        94924      1714630   /lib/
     1259721         4560       316187         6896        12884        87782      1688030   x32/

The executable code is actually bigger in the x32 variant — the big change is of course in the data sections (data, rodata, relro and bss) as the pointers have been halved — I honestly wonder how’s it possible for the C library to have so many pointers in its own structures, but it’s a question beside the point. Even if these numbers are halved, the difference is not that big, in total you have something along the lines of 30KB less data allocated, which is unlikely to even change the memory map.

The data size reduction is useful. Okay this seems to be a common issue. Sure it is the case that the data structures are smaller with x32, that’s its design after all. The main question would probably be “is this significant?” — I don’t think it is. Even in the example above with the C library, the difference while still “big enough”, is just under 20% of the allocated space … of the C library! A library that is supposed to implement the very minimal interface.

Now if you add up all the possible libraries, you probably can shave off a few megabytes of data of course but … you’ll have to add in all the porting issues that I’m going to discuss soon. Yes it is true that C++ and most VM languages will have less pressure, especially when copying objects, thanks to the reduced pointers’ size, but this is still quite a stretch. Especially since for the most part you’ll have to keep data buffers aligned to at least 8 bytes (64-bit) to make use of the new instructions; you already have to align them to 16 bytes (128-bit) to make use of some SIMD sets.

And for those who think that x32 is reducing the size of files on disk — remember that as it is you can’t run a pure-x32 install; what you get is usually going to be a mix of three ABIs: x86, amd64 and x32!

But there is no reason for $application to deal with more than 4GiB memory. Yes of course that might be true, but really, do you care about the pointer size? If you really want to make sure that the application doesn’t use more than a given amount of memory, use system limits! They are definitely less intrusive than building a new ABI altogether.

Interestingly there are two very different, contrasting applications of a full 64-bit address space on systems with less than 4GiB of RAM: ASLR (Address Space Layout Randomization, which can really load the various objects an application requires at widely different addresses), and Prelink (which can then make sure that every unique object on the system is always loaded at the same address; yes, that’s really the opposite of what ASLR does!).

Applications use long but they don’t need the full 64-bit space. And of course the solution is to create a new ABI for it, according to some people.

I’m not going to say that there aren’t many applications that still use long without a clue as to why they do that; they probably have some very little range of values they want to use and yet they use “big values” such as long, as they probably learnt programming on systems that use it as a synonym for int, or even better they learnt programming on systems where long is 32-bit but int is 16-bit (hello MS-DOS!).

The solution to this is simply to use the standard integers provided by stdint.h such as uint32_t and int16_t — so that you always use the data size you’re expecting and needing! This also has the side-effect of working on many more systems than you expect, and works with FFI and other techniques.
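A small sketch of what that looks like in practice. The struct sample type and format_sample() helper are invented for the example; the fixed-width types come from <stdint.h> and the matching printf() format macros from <inttypes.h>, both part of C99.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A hypothetical record whose field widths are the same on x86,
 * x86-64 and x32, because none of them is spelled "long". */
struct sample {
    uint32_t id;          /* always 32 bits */
    int16_t  temperature; /* always 16 bits, signed */
    uint64_t timestamp;   /* always 64 bits */
};

/* Format the record into buf; returns what snprintf() returns.
 * The PRI* macros expand to the right conversion for each width,
 * so the format string is also independent of the data model. */
static int format_sample(char *buf, size_t len, const struct sample *s)
{
    return snprintf(buf, len, "%" PRIu32 " %" PRId16 " %" PRIu64,
                    s->id, s->temperature, s->timestamp);
}
```

The point is not the formatting itself but that neither the structure nor the format string has to change when the size of long or of pointers does.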

Hand-coded assembly is rare. This is one thing a few people told me after my previous post as I complained about the fact that with the new ABI as it is we’re losing most of the hand-coded assembly. This might strictly be true, but it might be less rare than you think. Even excluding all the multimedia software, crypto software usually makes good use of SIMD as well, and that’s done through hand-coded assembly, not through the compiler’s intrinsics.

There is also another issue with hand-coded assembly in software such as Ruby: while Ruby 1.9 fails to build on x32, it gets much more interesting on Ruby 1.8 because while it builds just fine, it segfaults at runtime. Reminds you of something?

Furthermore, it’s the C library itself that comes with most of the hand-coded assembly; the only reason why you don’t feel the porting pressure is simply that H.J. Lu, who takes care of most of it, is one of the authors of the new ABI, which means the code is already ported there.

x32 is going to be compatible with x86, if not now in the future. Okay, this I didn’t have a comment about before, but it’s one misconception I’ve noticed being thrown around. Luckily, the presentation comes to help: slide 22 makes it very clear that the ABIs are not compatible. Among other things you have to consider that the x32 ABI at least corrects some of the actual mistakes in x86, including the use of 32-bit data types for off_t and similar. Again, something I talked about two years ago.

This is the future of 64-bit processors. No; again refer to the slides, in particular slide 10. This has been explicitly designed for closed systems rather than as a replacement for x86-64! How does that feel now?

The porting effort is going to be trivial, you just have to change the few lines of assembler and change the size of pointer arithmetic. This is not the case. The porting requires a number of other issues to be tackled, and handcrafted assembly is just the tip of the iceberg. Breaking the assumption that x86-64 has 64-bit pointers is, by itself, quite a big deal, but not as big as one might assume at first (it’s the same way on Windows). What I think is going to be a big issue is the implementation of FFI-style C bindings. Remember I said it wasn’t an easy answer?

CPUs perform better on 32-bit operands than 64-bit. Interestingly, the only CPU that Intel admits does perform better on 32-bit, in the presentation I already linked a few times, is the Atom. The quote is actually “64bit imul latency is twice of 32bit imul on Atom”.

Now, what the heck is imul? That’s a signed multiply operation. Do you multiply pointers? It doesn’t make sense. Besides, pointers are not signed. Are you telling me that your main concern is for a platform (Atom) that has extra latency on an operation when people use 64-bit data types and they should instead use 32-bit? And your solution for that concern is to create a new ABI where it’s harder to use 64-bit data types instead of going to fix whatever program is causing the problem?

I guess I should end it here, because this last note about the Atom and imul is probably going to make the day of most people who have half a clue.

Why Foreign Function Interfaces are not an easy answer

With the term FFI you usually refer to techniques related to GCC’s libffi and their various bindings, such as Python’s ctypes. The name should instead encompass a number of different approaches that work in very different ways.

What I’m going to talk about is a subset of FFI techniques that work the way FFI works, which means they also cover .NET’s P/Invoke — which I briefly talked about in an old post.

The idea is that the code for the language you’re writing in, declares the arguments that the foreign language interfaces are expecting. While this works in theory, it has quite a few practical problems, which are not really easy to see, especially for developers whose main field of expertise is interpreted languages such as Ruby, or intermediate ones like C#. This, just because the problems are related to the ABI: Application Binary Interface.

While the ABI for C and C++ is quite different, I’ll start with the worst case scenario, and that is using FFI techniques for C interfaces. A C interface (a function) is exposed only through its name, and no other data; the name encodes neither the number nor the type of the parameters, which means that you can’t reflectively load the ABI based off the symbols in a shared object.

What you end up doing, in these cases, is declare in the Ruby code (or whatever else, I’ll stick with Ruby because that’s where I usually have experience) the list of parameters to be used with that function. And here it gets tricky: which types are you going to use for parameters? Unless you’re sticking with C99’s standard integers, and mostly pure functions, you’re going to have trouble, sooner or later.

  • the int, long and short types do not have fixed sizes, and depending on the architecture and the operating system they are going to be of different size; Win64 and eglibc’s x32 are making that even more interesting;
  • the size of pointers (void*) depends once again on the operating system and architecture;
  • some types such as off_t and size_t depend not just on the architecture and operating system but also on the configuration of said system: on glibc/x86, by default they are 32-bit, but if you do enable the so-called largefile support they are 64-bit (the same goes with st_ino as that post suggests);
  • on some architectures, the char type is unsigned, on others it is signed, which is one of the things that made PPC porting quite interesting, if you weren’t using C99’s types;
  • if structures are involved, especially with bitfields, you’re just going to suffer, since the layout of the structure, if not packed, depends on both the size of the fields and the endianness of the architecture — plus you have to factor in the usual chance for difference due to architecture and operating system.
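To get a feeling for how much of this is decided by the platform rather than by the library, a small C probe like the following can be built on each target and its output compared. This is only a sketch: struct abi_facts, probe_abi() and print_abi_facts() are names invented here, and the use of <sys/types.h> for off_t assumes a POSIX system.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h> /* off_t; POSIX, not ISO C */

/* The facts an FFI binding silently hard-codes when it declares a
 * parameter as "int", "long" or "pointer". */
struct abi_facts {
    size_t int_size;
    size_t long_size;
    size_t pointer_size;
    size_t size_t_size;
    size_t off_t_size;
    int    char_is_signed;
};

static struct abi_facts probe_abi(void)
{
    struct abi_facts f;
    f.int_size = sizeof(int);
    f.long_size = sizeof(long);
    f.pointer_size = sizeof(void *);
    f.size_t_size = sizeof(size_t);
    f.off_t_size = sizeof(off_t);       /* changes with largefile support */
    f.char_is_signed = (CHAR_MIN < 0);  /* differs between architectures */
    return f;
}

static void print_abi_facts(const struct abi_facts *f)
{
    printf("int=%zu long=%zu void*=%zu size_t=%zu off_t=%zu char=%s\n",
           f->int_size, f->long_size, f->pointer_size,
           f->size_t_size, f->off_t_size,
           f->char_is_signed ? "signed" : "unsigned");
}
```

Building the same file for a different data model, or with largefile support enabled, changes the printed line without a single change to the source, which is exactly what a hand-written FFI declaration cannot follow.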

Up until now, the situation doesn’t seem to be unsolvable; indeed it should be quite easy, if not for the structures, if you create type mappings for each and every standard type that could change, and make sure developers use them… of course things don’t stop there.

Rule number one of the libraries’ consumer: ABI changes.

If you’re a Gentoo user you’re very likely to be on not-too-friendly terms with revdep-rebuild or the new preserved libraries feature. And you probably have heard or read that the issue with requiring to rebuild other packages is that one of the dependencies changed ABI. To be precise, what changes in those situations is the soname, which declares that the library changed ABI, which is nice of them.

But most of the changes in ABI are not declared, either by mistake or for proper reasons. In the former case, what you have is a project that didn’t care enough about its own consumers, didn’t make sure that its ABI stays compatible from one release to the next, and didn’t follow soname bumping rules, which is actually all too common. In the latter scenario, instead, you have a rather more interesting situation, especially as far as FFI is concerned.

There are some cases where you can change ABI, and yet keep binary compatibility. This is usually achieved by two types of ABI changes: new interfaces and versioned interfaces.

The first one is self-explanatory: if you add a new exported function to a library, it’s not going to conflict with the other exposed interfaces (remember I’m talking about C here; this is not strictly true for C++ methods!). Yet that means that the new versions of the library have functions that are not present in the older ones. This, by the way, is the reason why downgrading libraries is never well-supported, especially in Gentoo: if you rebuilt the library’s consumers, it is possible that they used the newly-available functions, which wouldn’t be there after the downgrade; and yet the soname didn’t change, so revdep-rebuild wouldn’t flag them as bad.

The second option is trickier; I have written something about versioning before, but I never went out of my way to describe the whole handling of it. Suffice to say that by using symbol versioning, you can get an ABI-compatible change, for an API-compatible change that would otherwise break the ABI.

A classical example is moving from uint32_t to uint64_t for the parameters of a function: changing the function declaration is not going to break API because you’re increasing the integer size (and I explicitly referred to unsigned integers so you don’t have to worry about sign extension), so a simple rebuild of the consumer would be enough for the change to be integrated. At the same time, such a change in the C ABI would make the change incompatible, as the size of the parameters on the stack doubled, so calls to the previous API would crash on the new one.

This can be solved – if you used versioning to begin with (due to the bug in glibc I discussed in the article linked earlier) – by keeping a wrapper around the new API which still uses the old one, and making each of them use a new version for the symbol. At that point, the programs built against the old API will keep using the symbol with the original version (the wrapper), while the new ones will build straight to the new API. There you are: compatible API change leads to compatible ABI change.
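The shape of that wrapper can be sketched in plain C. Everything here is hypothetical: area_v1() and area_v2() are invented names, and the sketch deliberately uses two distinct function names instead of actual GNU symbol versioning, since the versioning itself needs linker support (a version script) that does not fit in a self-contained snippet. With real symbol versioning both functions would be exported under the same name, bound to different version nodes.

```c
#include <stdint.h>

/* New interface: the parameters grew from 32 to 64 bits. Source code
 * written against the old prototype still compiles against this one,
 * because uint32_t arguments convert implicitly to uint64_t. */
static uint64_t area_v2(uint64_t width, uint64_t height)
{
    return width * height;
}

/* Old binary interface, kept for consumers that were built before
 * the change: it still takes 32-bit parameters, exactly as those
 * binaries pass them, and simply forwards to the new implementation. */
static uint64_t area_v1(uint32_t width, uint32_t height)
{
    return area_v2(width, height);
}
```

An already-built consumer keeps calling the entry point with the 32-bit signature, a freshly built one gets the 64-bit one, and both end up in the same implementation.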

Yes I know what you’re thinking: you can just add a suffix to the function and use macros to switch consumers to the new interface, without using versioning at all; that’s absolutely true, but I’m not trying to discuss the merits of symbol versioning here, just explaining how it connects to FFI trouble.

Okay, why is all this relevant then? Well, what the FFI techniques use to load the libraries they wrap around is the dlopen() and dlsym() interfaces; the latter in particular is going to follow the steps of the link editor when a symbol with multiple versions is encountered: it will use the one that is declared to be the “default symbol”, that is, the latest added (usually).

Now return to the example above: you have wrapped through FFI the function to require two parameters as uint32_t, but now dlsym() is loading in its place a function that expects two uint64_t parameters… there you are, your code has just crashed.

Of course it is possible to override this through the use of dlvsym() but that’s not optimal because, as far as I can tell, it’s a GNU extension, and most libraries wouldn’t be caring about that at all. At the same time, symbol versioning, or at least this complex and messed up version of it, is mostly exclusive to GNU operating systems, and its use is discouraged for libraries that are supposed to be portable… the blade there is two-sided.

Since these methods are usually just employed by Linux-specific libraries, there aren’t so many that are susceptible to this kind of crash; on the other hand, since most non-Linux systems don’t offer this choice, most Ruby developers (who seem to use OS X or Windows, seeing how often we encountered case-sensitivity issues compared to any other class of projects) would be unaware of its very existence…

Oh and by the way, if your FFI-based extension is loading without any soversion, you’re not really understanding shared objects, and you should learn a bit more about them before wrapping them.

What’s the moral? Be sure about what you want to do: wrapping C-based libraries is often a good choice to avoid reimplementing everything, but consider whether it might be a better idea to write the whole thing in Ruby; it might not be as time-critical as you think it is.

Writing C-based extensions moves the compatibility issues to build time, which is a bit safer: even if you write tests for each and every function you wrap (which you should be doing), the ABI can change dynamically when you update packages, making install-time tests not very reliable for this kind of usage.

Tinderbox summary for May 2010: GCC 4.5, Berkeley DB 5.0; libpng 1.4

I’m honestly a bit surprised, and not exactly in a good way: I started with GCC 4.5 just over a month ago, and the tinderbox has almost caught up with its queue already.

Now, admittedly part of the reason might be related to my optimisation of the filesystems and partitions — especially after last week, since I moved all the stuff around to divide it into three pairs of disks: two 320GB WD RAID Edition disks for the RAID1 with the OS, my home and work stuff; two 500G Samsung disks for the “scratch” partitions (/var/tmp, the tinderbox’s filesystem), and finally two 1TB WD Caviar Green for storage space (multimedia files, including samples, and distfiles, 150GB of them!).

What makes me doubtful regarding the goodness of this situation is that a lot of packages were skipped because of dependencies failing. With GCC 4.5 we have no MySQL; with Berkeley DB 5.0 we have no Apache (because of apr-util). Without those, the tinderbox drops tons and tons of packages, a whole deptree of packages that will not be tested until the roots are fixed.

At any rate, now that I actually went through the packages, I can finally say what the most common problems with GCC 4.5 are. And surprisingly, it comes down to mostly two problems, a nasty runtime one, and one “usual” boring one.

First of all, the nasty one: GCC now seems to provide runtime-based overflow protection, not totally unlike the Stack Smashing Protection that the Gentoo Hardened project used to provide (and thanks to Magnus might come back to providing); this is a good thing on one side, because overflow protection is a nice safety feature (if not a security one), but it also means that we’re going to find a lot of packages failing at runtime because of this, and that stuff is much harder to deal with; one such package is the TCL interpreter, which is overflowing at runtime for so many packages that it’s boring. The problems tied to these features have their own tracker, which was started back in the 4.3 series already.

The boring problem is, once again, related to C++ (can you see why Luca is so worried now?): for some reason, up until now GCC supported one very strange syntax for it:

Foo::Bar x = Foo::Bar::Bar();

This basically consists of explicitly calling the constructor function of the class, rather than using the constructor through conversion. I would have always considered this syntax invalid, since I started learning the language, but I can tell how it could be typed wrongly; what surprised me is that it was allowed before. Sigh. The fact that Free Software has become a strict GCC monoculture does not help here, it means that instead of actually being tested, the code is just accepted if GCC supports it. It sucks now that LLVM seems to become more interesting.

The Berkeley DB situation is much worse, I’m afraid, in terms of the time needed to solve it; the main problem there is that a lot of the packages that fail with it fall into the mail software categories, and the net-mail team has been nearly non-existent for way too long now. This can be seen from the fact that we have stuff like mailx failing, continuous file collisions that are not being solved (and the attempts with the “mailwrapper” stuff resulted in a total revert), and generally broken and out-of-date packages.

We could use a few more “mailmen” working in Gentoo, since I most definitely have just barely enough clue to manage my postfix installs with the help of The Definitive Guide (that’s one of the best things I could buy from O’Reilly, without that I’d be seriously screwed).

The importance of opaque types

I sincerely don’t remember whether I already discussed this before or not; if I didn’t, I’ll try to explain here. When developing in C, C++ and other languages that support some kind of objects as types, you usually have two choices for a composite type: transparent or opaque. If the code using the type is able to see the content of the type, it’s a transparent type; if it cannot, it’s an opaque type.

There are though different grades of transparent and opaque types depending on the language and the way they would get implemented; to simplify the topic for this post, I’ll limit myself to the C language (not the C++ language, be warned) and comment about the practises connected to opaque types.

In C, an opaque type is a structure whose content is unknown; this usually is declared in ways such as the following code, in a header:

struct MyOpaqueType;
typedef struct MyOpaqueType MyOpaqueType;

At that point, the code including the header will have some limitations compared to transparent types; not knowing the object size, you cannot declare objects with that type directly, but you can only deal with pointers, which also means you cannot dereference them or allocate new objects. For this reason, you need to provide functions to access and handle the type itself, including allocation and deallocation of them, and these functions cannot simply be inline functions since they would need to access the content of the type to work.
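As a minimal sketch of what that looks like in practice, here is a made-up Counter type (all the names are hypothetical, chosen just for illustration). The first half is what would go in the public header, the second half is what stays private in the library source; callers only ever see the pointer and the functions.

```c
#include <stdlib.h>

/* ---- what would live in the public header (say, counter.h) ---- */
struct Counter;                     /* content unknown to the callers */
typedef struct Counter Counter;

Counter *counter_new(void);
void counter_free(Counter *c);
void counter_increment(Counter *c);
unsigned long counter_value(const Counter *c);

/* ---- what would live in the library source (say, counter.c) ---- */
struct Counter {
    unsigned long hits;             /* free to be reordered or extended later */
};

/* Allocation has to happen here: only the library knows the size. */
Counter *counter_new(void)
{
    return calloc(1, sizeof(Counter));
}

void counter_free(Counter *c)
{
    free(c);
}

/* Accessors are real function calls, not inline dereferences. */
void counter_increment(Counter *c)
{
    c->hits++;
}

unsigned long counter_value(const Counter *c)
{
    return c->hits;
}
```

Code that includes only the header part can hold a Counter pointer and pass it around, but writing `Counter c;` or `c->hits` there would not compile, which is exactly the limitation described above.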

All in all you can see that the use of opaque types tends to be a big hit as far as performance is concerned; instead of a direct memory dereference you always need to pass through a function call (note that this seems the same as accessor functions in C++, but those are usually inline functions that will be replaced at compile-time with the dereference anyway); and you might even have to pass through the PLT (Procedure Linkage Table), which means a further complication to get to the type.

So why should you ever use opaque types? Well, they are very useful when you need to export the interface of a library: since the users don’t know either the size or the internal ordering of an opaque type, the library can change the opaque type without changing the ABI, and thus without requiring a rebuild of the software using it. Repeat with me: changing the size of a transparent type, or the order of its content, will break ABI.
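To see why a transparent type is so fragile, consider this small sketch with two hypothetical revisions of the same structure (the names are invented for the example). A program built against the first layout has the offset of height baked into its machine code, so it would read the wrong bytes from a structure laid out by a library built with the second.

```c
#include <stddef.h>
#include <stdint.h>

/* Revision 1 of a (hypothetical) transparent type from a public header. */
struct sample_v1 {
    uint32_t width;
    uint32_t height;
};

/* Revision 2: the first field was widened, so everything after it moves. */
struct sample_v2 {
    uint64_t width;
    uint32_t height;
};

/* Offsets that end up hard-coded in any code compiled against each revision. */
size_t height_offset_v1(void)
{
    return offsetof(struct sample_v1, height);
}

size_t height_offset_v2(void)
{
    return offsetof(struct sample_v2, height);
}
```

The two functions return different values (4 and 8), and the total size grows as well, so both direct field access and caller-side allocation break until the consumer is rebuilt.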

This also becomes particularly important when you’d like to reorder some structures, so that you can remove padding holes (with tools like pahole from the dwarves package, see this as well if you want to understand what I mean). For this reason, sometimes you might prefer having slower, opaque types in the interface, instead of faster but riskier transparent types.
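For those who have never looked at padding holes, here is a rough sketch of the kind of reordering I mean, using two made-up structures with the same fields in a different order. The exact sizes depend on the ABI, so take the numbers as an expectation rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Small fields on either side of a large one: the compiler has to insert
 * padding after 'flag' and after 'kind' to keep 'size' aligned. */
struct with_holes {
    uint8_t  flag;
    uint64_t size;
    uint8_t  kind;
};

/* Same fields, largest first: the two small fields share one padded tail. */
struct reordered {
    uint64_t size;
    uint8_t  flag;
    uint8_t  kind;
};

/* How many bytes per object the reordering saves on this ABI. */
size_t bytes_saved(void)
{
    return sizeof(struct with_holes) - sizeof(struct reordered);
}
```

On a typical x86-64 Linux system I would expect the first structure to take 24 bytes and the second 16. If the structure were transparent, this reordering would be an ABI break; behind an opaque type it is invisible to the users.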

Another place where opaque types are definitely helpful is when designing a plugin interface especially for software that was never designed as a library and has, thus, had an in-flux API. Which is one more reason why I don’t think feng is ready for plugins just yet.

Another C++ piece hits the dust

You might remember my problem with C++ especially on servers; yes, that’s a post from almost two years ago. Well, today I was able to go one step further, and kill another piece of C++ from at least my main system (it’s going to take a little more for it to apply to the critical systems). And that piece is nothing less than groff.

Indeed, last night we were talking in #gentoo-it about FreeBSD’s base system and the fact that, for them similarly to us, their only piece of C++ code in base system is groff; a user pointed out that they were considering switching to something else, so a Google run later I came up with the heirloom project website.

The heirloom project contains some tools ported from the OpenSolaris code base, but working fine on Linux and other OSes; indeed, they work quite well in Gentoo, after I created an ebuild for them, removed groff from the profiles, and fixed the dependencies of man and zsh.

A few notes though:

  • the work is not complete yet so please don’t start complaining already if something doesn’t look right;
  • man is not configured out of the box; I’m currently wondering what’s the best way to do this;
  • after configuring (manually) man, you should be able to read most man pages without glitches;
  • for some reason, we currently install man pages in different encodings (for instance man’s own man page in Italian is written in latin-1); heirloom-doctools use UTF-8 by default, which is a good thing, I guess; groff does seem to have a lot of problems with UTF-8 (and man as well, since the localised Italian output often has broken encoding!);
  • groff (and man) both have special handling of Japanese for some reason, I don’t know whether the heirloom utilities are better or worse for Japanese, somebody should look into it.

C libraries galore

Seems like it’s that time of the year again when the new glibc is released that will break a bunch of packages (or rather make them fail, given that half of those are actually broken from the start). So this makes it the perfect time to explain what’s going on and why, even though Drepper already explained the changes in glibc 2.10 (and I got to say that while they don’t look tremendously cool, it’s quite interesting to me).

There are quite a few interesting changes for which glibc 2.10 is quite desired, for instance the new (more scalable) malloc implementation, or the new faster string functions for x86-64 (amd64), or for ELF geeks like me the STT_GNU_IFUNC symbol type (that I have yet to understand fully, for now Ruby-Elf is recognizing it but not acting on it). But all this is something that you can read directly on Ulrich’s blog so I won’t write extensively about it.

Instead, as usual, I’m going to write about the problematic changes that can prevent software from building properly; this time there are mainly two changes that cause problems (plus a few more exotic and rare ones): the new POSIX functions, and the extended C++ compliance (given that this implements something that has been specified in the C++ 1998 standard, it should really give a hint of how well supported C++ is: 11 years to implement a proper disambiguation!).

Many of the new XPG7/POSIX 2008 functions are actually “backports” of GNU extensions; being GNU extensions there is nothing new to the C library, but they are now declared for the C compiler to know about even when the _GNU_SOURCE feature macro is not defined. Since some of these have pretty generic names, when a project defines its own version of the function, especially with a different interface, the build is going to error out. I would say that the most common function that is now visible that wasn’t is getline() (which is pretty neat actually to finally have available in POSIX). Since the name is pretty generic, lots of software has implemented one way or another a getline() function, which almost never does the same as the GNU/POSIX function (which reminds me that symbol collisions are bad and that I really should go back to that work one day).

To solve the problem with naming collisions (which, contrary to symbol collisions, are caught by the C compiler and cause the compiler to stop) the solution is to rename the functions themselves; one has to mind, though, that adding an underscore prefix is bad because those are reserved for system symbols and should thus not be used!
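As a sketch of the kind of rename I mean: suppose a project had its own getline() that reads into a fixed buffer, which is nothing like the GNU/POSIX interface. The myproj_ prefix and the function itself are invented for this example; the point is only that the new name carries a project prefix and does not start with an underscore.

```c
#include <stdio.h>
#include <string.h>

/* Used to be called getline(), which now collides with the declaration
 * in <stdio.h>. Hypothetical project-local helper: reads one line into
 * buf, strips the trailing newline, and returns the line length, or -1
 * on end of file or error. */
int myproj_getline(char *buf, size_t size, FILE *fp)
{
    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';

    return (int)len;
}
```

Every call site has to be updated along with the definition, of course, but that is a mechanical search and replace.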

The second problem, the one with C++, is actually an error in the code (given that this is actually mandated by the C++ language in 1998), that now gets finally caught by the compiler; while in C casting away the const-ness of a variable only issues a warning by default, in C++ it’s a real mistake. Unfortunately, to be useful the C string functions (like strchr() or strstr()) accept pointers to constants as parameters and return pointers to variables, because in C you cannot easily disambiguate between the two. On the other hand, you can disambiguate the two functions in C++, so that they return the same type of pointer as they are given in input, and this is what the new glibc does: returning constants to constants and variables to variables.

Unfortunately there is code out there in the wild, actually pretty common too, that uses pointers to variables to collect the returns of string functions called with pointers to constants. This does not technically mean that the code is entirely broken – often, as in the case of mediatomb, it’s just a matter of not using the proper const modifier on the variable declaration – but sometimes the code really is broken, and for reasons I don’t know it does not crash, even though it attempts to modify data that lives in the read-only memory of the process.
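The trivial kind of fix looks like the following sketch, which is valid both as C and as C++. The function name value_of() is made up for the example; the only thing that matters is that the variable collecting the result of strchr() is declared const, matching the string that was passed in.

```c
#include <string.h>

/* Hypothetical helper: given "key=value", return a pointer to "value",
 * or NULL if there is no '=' in the string. */
const char *value_of(const char *pair)
{
    /* const in, const out: declaring this as plain 'char *' is what the
     * C++ declarations in the new glibc reject. */
    const char *sep = strchr(pair, '=');

    return sep != NULL ? sep + 1 : NULL;
}
```

If the code then actually needs to modify what the pointer refers to, the const was hiding a real bug, and the string has to be copied first.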

Up to this point, the problems are mostly trivial and easy to fix; unfortunately it doesn’t stop here; at least in the case of libmp4v2 the problem was an incompatible declaration of the strcasestr() function (which changed prototype for the above-described C++ changes). Turned out the headers always declared strcasestr() to have it visible (if the C library isn’t providing it, they provide their own copy), and obviously that declaration is incompatible with the new ones used by glibc 2.10. Autotools mojo fixed the issue.

So hopefully this post will keep the mind fresh for those who are looking into fixing build failures with glibc-2.10. Have fun!

A “new” C library

Debian announced they are going to move away from the GNU C library toward eglibc, a derivative designed to work on embedded systems; not even a few hours after I shared it on Google Reader myself, I was contacted regarding a Gentoo bug for it. Since I don’t like repeating myself too much, I’m just going to write here what I think.

First of all, the idea is interesting, especially for the embedded developer in me (who is still waiting for the time to go buy the pins to solder on a serial port on my WRT54GL), but also for the “alternative” developer in me. I have worked on Gentoo/FreeBSD and I always hoped to find a way to handle an uclibc chroot to test my own stuff. Testing eglibc is going to be interesting too (if only I had time to finish analysing the tinderbox logs, that is).

What I do find quite unfunny, and a bit discomforting, is that half the “features” that Aurélien Jarno lists are “we don’t need to deal with Drepper”. Now, I agree that Ulrich is not the best person to deal with (although I’d sooner deal with him than with Ciaran, but that’s another problem *edit: since it wasn’t really a good example here, I wish to explain it; it is public knowledge that me and Ciaran don’t get along too well, I don’t like his solutions and his methods; on the other hand, while I dislike Ulrich’s methods too, I have fewer problems with his solutions, and I never had a personal quarrel with him, there goes my “I’d sooner deal with him” comment*), and I also agree that his ideas of “good” sometimes are difficult to share (especially when it comes to the implementation of versioning for ELF symbols, which gave me such a headache to replicate in Ruby-Elf). On the other hand, I wonder how much of that choice is warranted.

What most people seem to compare this to is the move from XFree to Xorg, or the fork of cdrecord into cdrkit. I disagree with comparing this case with those two. Both of those were due to license issues, which, for people caring about the freedom of their software, are one of the most important issues (unless, of course, you just don’t care and go on forth with piracy, which is what actually brought me to a bit of a nervous point with ALTlinux in the past). While not having assholes around is probably as important (and I’d point to this book which I remember Donnie describing before; unfortunately I haven’t had the pleasure to read it yet), I still don’t see this as the brightest move Debian could have made.

More to the point, the cdrkit fork doesn’t look like one of the shiniest things in Free Software; while cdrecord is no longer the massively single point of failure for CD/DVD burning in Linux, one has to note that this is also due to other projects, like libburn, having had an injection of development and race toward support, once the idea that cdrecord wasn’t good to keep started flowing around. And the XFree to Xorg move was extremely helped (and made successful) by the fact that the developers for XFree itself moved out of the project toward Xorg.

I’m not criticising Debian’s move, I’m actually thrilled to see the results; I’m not criticising eglibc, I’m very interested in the project. I’m just trying to throw a water bucket over the people who seem to be on fire about eglibc now. I don’t see this like a huge paradigm change for now. Once eglibc has huge advantages (which for now I don’t see), we can probably get passionate about moving. Right now I don’t see this huge change, there are neat ideas and certainly it’s a good thing not to have assholes blocking the project, but is that enough?

Now, more to the Gentoo side of the thing; I’m not part of the toolchain team, so I don’t know if they have any particular, special plan about this. I would expect them not to for now at least; adding support for a new C library is not impossible, but it’s not easy either. I might be worrying for nothing, but I don’t trust the “100% compatibility” that Debian seems to ensure about EGLIBC versus GLIBC, even if it’s just bugfixes over a given GLIBC version (which would bring me to my hate of forking projects); I wrote some pieces about the difficulty of ABI compatibility and while I did also show how to keep backward compatibility, I said nothing about forward compatibility.

Also, I don’t trust Debian’s assurances that it works with all the software; open-source or not. The reason is not only to be found in Debian not being that reliable but also in the fact that we’re not Debian: we have different patchsets, Debian tends not to send stuff upstream, and so on. We also leave the user with full access to tinker with features, which would mean being able to disable certain stuff from eglibc (I’d welcome that, there are a few things that I’d like to disable on my server for instance); in turn this means that either some USE flag configurations will get unsupported (or unavailable), or we’re going to need special dependencies to ensure that certain pieces of the C library are enabled for eglibc. Bottom line: we’re going to need a long test period either way (to be noted that we have big problems even between minor bumps of the same C library – think glibc 2.8 or 2.9, or FreeBSD 6.3 – which means that we’re going to need lots of testing to move to a new C library altogether).

Adding to this, is the fact that adding a new C library will mean new profiles (new profiles to test too), and new stages. Which means more work for everybody. Doesn’t mean that it’s not going to happen, just that you can’t expect that to happen tomorrow.

The Mono problem

In the past I have said that I find C# a very nice language; I still maintain this position, the C# language, taken alone, is probably one of my favourites, this does not mean that I intend to rewrite my ruby-elf suite in C#, but I like it and if I were to have to write a GUI application for the GNOME desktop, not relying on any particular library, I’d probably choose C# and Mono.

Why do I say “not relying on any particular library”? Well, today I wanted to try writing some C# bindings for PulseAudio to make use of it for a project that was proposed to me (and that I now don’t think is really feasible), and I went to read the documentation from the Mono project. Took me a while to digest it; even though I had some experience with writing bindings for Ruby with my “Rust”, and a long time ago I worked on implementing Python extensions to a program, the C# method of binding software really escaped me for quite a while.

In all the interpreted languages I know, to write bindings you start from the language the interpreter is written in (usually, C) and then define an interface to the language that calls into “native” functions that in turn can call the library you want to bind. This is how Ruby bindings are written, and how Rust works: it tells the Ruby interpreter that there is a class called in a certain way and it defines its methods through C functions that are called back; then it takes care of marshalling and translating parameters around.

The C++ bindings work in a slightly different way: since C++ can be described as a superset of C from one point of view, and the design of the language allows a very high type compatibility between the two, including the calling conventions, you usually write C++ class interfaces that wrap around C function calls. It’s a completely different paradigm from Ruby and Python, but it works because there is really no interpreter or library barrier between C and C++ after all.

Considering how Mono is implemented, I sincerely expected the bindings writing to be a mix of these two methods; it seems instead that it’s almost entirely a C++-style bindings interface. But with a nasty twist: in C++ you get direct access to the interface of the library you’re writing bindings for (its headers) through direct inclusion and possibly an extern "C" block; with C# you don’t have that at all, as far as I can see.

This means that you have to describe all the interfaces inside the C# code, and then write the marshalling code that can translate the parameters from C# objects to C types. The way the functions are loaded is similar to the standard dynamic linker interface (dlopen()) with all the problems connected to that: C++ interfaces are almost impossible to load, and you have to get the parameters just right; if you don’t, it’s a catastrophe waiting to happen. And this is even trickier than linking libraries with pure dynamic linking.

The most obvious problem for those who had to deal with dlopen() idiosyncrasies, is that C# has fixed-sized basic integer types. This is good, but not all software uses fixed-sized parameters; off_t, size_t, long are all types that change their size depending on the operating system and the hardware architecture; off_t is not even of the same size on the same system because it depends on whether large file support is enabled or not, at least on glibc-based systems (most BSDs should have it fixed-sized but that’s about it). Since the C# code is generic and is supposed to be built just once, it’s not easy to identify the right interface for the system. You cannot just #ifdef it around like you would with C++ bindings.
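To give an idea of what the C# side has to guess, here is a small C sketch that reports the sizes in question for whatever system it is compiled on. The struct and function names are made up for the example; the sizes it reports are not something a build-once C# assembly can know in advance.

```c
#include <stddef.h>
#include <sys/types.h>

/* Sizes, in bytes, of the types a fixed-size binding would have to guess. */
struct abi_sizes {
    size_t long_size;
    size_t size_t_size;
    size_t off_t_size;
};

/* Hypothetical probe: report what this particular build ended up with. */
struct abi_sizes probe_abi_sizes(void)
{
    struct abi_sizes sizes;

    sizes.long_size   = sizeof(long);
    sizes.size_t_size = sizeof(size_t);
    /* On glibc this one also depends on how the code was built: 32-bit
     * targets get a wider off_t when _FILE_OFFSET_BITS is defined to 64. */
    sizes.off_t_size  = sizeof(off_t);

    return sizes;
}
```

C and C++ code picks up the right answer for free at build time; the C# bindings would need to either hardcode one answer per platform or go through a native helper like this one.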

But this is not the only problem; the one problem I noticed first is, again because of the lack of access to the #include headers, that constants might not be constant. Since I wanted to write bindings for PulseAudio, I started first with the “simple” interface, and I started finding the problem right away with the pa_stream_direction_t enumeration. While I could create my own C# enumeration for it, I have no guarantee that Lennart does not decide to change the values; while that is going to change the ABI of the package, for both native implementations and Ruby-style bindings a rebuild is enough, as there is usually no need to change the sources when this kind of change is made; for the C# bindings, you’d have to adapt the bindings every time.

This is probably why there aren’t many C# bindings for libraries that don’t use GObject (for that, you have the gapi tool that takes care of most of the work for you), and why the banshee project preferred to reimplement TagLib in C# rather than bind it (indeed, binding TagLib is far from an easy thing, as I can testify first hand). I’m afraid this is the “Achilles’ heel” of C# and Mono. While this makes it less likely to produce the “java crap effect” that I wrote about almost four years ago by now (jeez, has so much time passed?), it does reduce the ability of Mono to blend in with the rest of the modern Linux systems.

The effort required to maintain proper bindings for C projects in C# is even higher than the effort to reimplement the same code, and that is really a big problem for blending; the only thing that it works well for is portability, especially when it comes to portability to Microsoft platforms. This is all fine and dandy if you need your software to bend that way, and I have to say I do know a couple of cases where that might be one of the important factors, but it comes at a pretty high price. On the other hand, Ruby, Python, Vala and Java seem to have better chances for integration. All in all, I’m sincerely unimpressed. I like the language, I just don’t like the runtime or the way that’s going.