The importance of portability

As anybody who has read my blog before will have noticed, one of the things I care about most is portability: not only between different OSes (Linux and FreeBSD) but also between different architectures (AMD64).
Portability is actually a simple process. You just need to take a bit more care with the size of datatypes and with their internal structure, and avoid relying on byte order (little endian or big endian). As many developers pointed out to me in #gentoo-dev, we can’t be sure of the endianness even on architectures that usually use a given one, like PPC, where it can differ from implementation to implementation.

What I’m trying to say here is that portability is going to become more and more important from now on: not only do Linux and the *BSDs run on a wide variety of hardware platforms, but now Apple is also going to change platform for their MacOS. What does that mean for us? Well, most multi-platform projects will need to restructure their code if they were assuming that OSX == PPC == Big Endian. But how can they do that? There are some development kits out there, but they aren’t guaranteed to match the final product. We can’t even be sure what will happen in the market two months from now, so how can we predict which final hardware platform Apple will choose? We do know that it will be an Intel chip and that it won’t be compatible with PPC, but what its internal structures will be is something we don’t know yet.

So what’s the deal then? What can software producers do now to avoid portability hell when the new platform arrives?
The first problem is the size of integers and of printf arguments. Fortunately there are the standard integer types intX_t and uintX_t, which can usually be trusted without problems, and there are macros like PRId64 that select the right format for the output. Paying just a bit more attention to these avoids having to mess around with #ifdef conditionals to get things working.
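To make that concrete, here is a minimal sketch of printing fixed-width integers without caring whether int64_t happens to be a long or a long long on the current platform. The helper name format_record and its two fields are made up for illustration; the types and macros come from the standard C99 headers <stdint.h> and <inttypes.h>.

```c
#include <assert.h>
#include <inttypes.h>  /* PRId64, PRIu32; also pulls in <stdint.h> */
#include <stdio.h>

/* Hypothetical helper: format a 64-bit offset and a 32-bit count into buf.
 * The PRI* macros expand to the conversion specifier that matches the
 * fixed-width type on this platform, so no #ifdef is needed.
 * Returns what snprintf returns: the length of the formatted text. */
int format_record(char *buf, size_t len, int64_t offset, uint32_t count)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "offset=%" PRId64 " count=%" PRIu32,
                    offset, count);
}
```

The macros are plain string literals, so they are simply concatenated with the rest of the format string by the compiler. The same header also provides SCNd64 and friends for the scanf family.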

The other problem is the endianness of types, and more specifically of data structures written in binary form to disk (in files). The solution is to specify their structure and endianness completely, and to make sure they are read and written through dedicated functions that preserve that endianness. Even if the most widespread hardware platform currently available (Intel IA-32 and derivatives) and the one that is taking over the 64-bit desktop market (AMD x86_64 and derivatives) are little endian, the most natural way to define data structures on disk is probably big endian. I know this can seem stupid, but using big-endian data structures on disk means that they can be seamlessly streamed over the network: almost all binary protocols currently in use are big endian, which is why it’s sometimes called network endianness. It’s also the more natural way for humans to write numbers.

Not relying on the internal format of data structures can sometimes be a bit slower than accessing them directly, as you need shifts and bitwise AND operations to get at the part of the structure you want, but with today’s hardware that overhead is probably not significant for the overall performance of the system.

So just remember: developing something portable from scratch is going to be a lot less painful than having to port it to a new architecture when the mainstream OS you are using changes platform. This is true not only for Linux, *BSD, OS X and Windows programs, but also for programs that want to run on “strange” or new operating systems, or on every operating system in the world. That may seem too difficult, but if you look for the right operating system services, it’s not impossible to write something abstract enough to run everywhere, without needing Java, Mono/.NET or other virtual machines.

Anyway, I really hope that with the new 4.x series of GCC a lot of the old, messed-up, really badly styled code is going to become deprecated, and I’m going to do my part as much as I can.