ABI changes keeping backward compatibility

So I’ve written about what an ABI is, what breaks it, and how to version libraries properly, and I left off saying that I would write about some further details on how to improve compatibility between library versions. The first trick is using symbol versioning to maintain backward compatibility.

The original Unix symbol namespace (which is still the default one) is flat: a symbol (function, variable or constant) is only ever identified by its name. To make things more versatile, both Sun Solaris and the GNU C library implemented “symbol versioning”: a method to attach one extra “namespace” identifier to a symbol, so that the same name can exist in more than one version. The feature serves two purposes: avoiding symbol collisions and maintaining ABI compatibility.

In this post I’m focusing on using the feature to maintain ABI compatibility: I will change the ABI of a single function, simulating two versions of the same library, and yet allow software built against the old version to work just fine with the new one. While it’s possible to achieve the same thing on many different systems, for ease of explanation I’m going to focus on glibc-based Linux systems, in particular on my usual Yamato Gentoo box.

While it does not sound like a real-life situation, I’m going to use a very clumsy API-compatible function whose ABI can be broken, and which simply reports the ABI version used:

/* lib1.c */
#include <stdio.h>

void my_symbol(const char *str) {
  printf("Using ABI v.1\n");
}

The ABI break will come in the form of the const specifier being dropped; since this un-restricts what the function can do with the parameter, it is a “silent” ABI break (in the sense that nothing would warn the user if we weren’t taking precautions). Indeed, if the new version were written this way:

/* lib2.c */
#include <stdio.h>

void my_symbol(char *str) {
  printf("Using ABI v.2\n");
}

the binaries would still execute “cleanly”, reporting the new ABI. (Of course, if we were changing the string, and the passed string was a static constant string, that would be a problem; but since I’m using x86-64 I cannot show that, as it still does not seem to produce a bus error or anything.) Let’s assume that, instead of reporting the new ABI version, the software would crash because something happened that shouldn’t have.
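To make the hazard a bit more concrete, here is a minimal sketch of what the const-less signature permits, and of a caller that stays on the safe side. The names clear_string and call_with_writable_copy are hypothetical and not part of the library in this post. Writing through a pointer to a string literal is undefined behaviour in C, so the sketch only ever modifies an array it owns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the v2 signature: without const, the
 * function is free to modify the caller's buffer. */
void clear_string(char *str) {
  assert(str != NULL);
  str[0] = '\0';  /* only valid if str points to writable storage */
}

/* Safe caller: a char array initialised from a literal is a writable
 * copy, unlike the literal itself. Returns the first byte afterwards. */
char call_with_writable_copy(void) {
  char buf[] = "ciao";
  clear_string(buf);
  return buf[0];
}
```

A caller that passes "ciao" directly, as a program written against the const prototype may well do, gives the function a pointer it must not write through; that is the kind of breakage the rest of this post assumes.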

The testing code we’re going to use is going to be pretty simple:

/* test.c */
#if defined(LIB1)
void my_symbol(const char *str);
#elif defined(LIB2)
void my_symbol(char *str);
#endif

int main() {
  my_symbol("ciao");
}

Now we have to make the library resistant to ABI breakage, which in our case involves using inline assembly and GNU binutils extensions (because I’m trying to simplify for now!). Instead of a single my_symbol function, we’re going to define two public my_symbol_* functions, one per ABI version, and then alias them:

/* lib2.c */
#include <stdio.h>

void my_symbol_v1(const char *str) {
  printf("Using ABI v.1\n");
}

void my_symbol_v2(char *str) {
  printf("Using ABI v.2\n");
  str[0] = '\0';
}

__asm__(".symver my_symbol_v1,my_symbol@");
__asm__(".symver my_symbol_v2,my_symbol@@LIB2");

The two aliases are the ones doing the magic: the first defines the v1 variant as having the base, unversioned name (nothing after the “at” character), while the second gives the v2 variant the LIB2 version and, through the two “at” characters, makes it the default used for linking when no version is requested. This does not work right away, though. First of all, the LIB2 version is defined nowhere; second, the my_symbol_v1 and my_symbol_v2 symbols are exported too, allowing other software direct access to the two variants. Since you cannot use the symbol visibility definitions here (you’d be hiding the aliases too), you have to provide the linker with a version script:

/* lib2.ldver */
LIB2 {
  local:
    my_symbol_*;
};

With this script you’re telling the linker that all the my_symbol_* variants are to be considered local to the object (hidden), and you’re also creating the LIB2 version to assign to the v2 variant.
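If the difference between the two aliases is hard to picture, the following toy model may help. It is only a sketch under my own assumptions, and it is not how the dynamic linker is implemented: it pretends that every binary records, at link time, the version of each symbol it was linked against (an empty string for a binary built against the unversioned library), and that the lookup at run time matches both the name and that recorded version. The struct, the table and the resolve function are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: NOT the real ld.so data structures. */
struct versioned_symbol {
  const char *name;     /* public symbol name */
  const char *version;  /* "" stands for the unversioned my_symbol@ */
  const char *impl;     /* implementation the alias points to */
};

static const struct versioned_symbol exports[] = {
  { "my_symbol", "",     "my_symbol_v1" },  /* my_symbol@      */
  { "my_symbol", "LIB2", "my_symbol_v2" },  /* my_symbol@@LIB2 */
};

/* Return the implementation bound to (name, version), or NULL when the
 * library exports no such pair. */
const char *resolve(const char *name, const char *version) {
  assert(name != NULL && version != NULL);
  for (size_t i = 0; i < sizeof(exports) / sizeof(exports[0]); i++) {
    if (strcmp(exports[i].name, name) == 0 &&
        strcmp(exports[i].version, version) == 0)
      return exports[i].impl;
  }
  return NULL;
}
```

In this model a program built against the old library asks for ("my_symbol", "") and keeps getting the v1 code, while one built against the new library had LIB2 recorded for it, because of the double “at”, and gets the v2 code.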

But how well does it work? Let’s build two binaries against the two libraries, then execute them with the two:

flame@yamato versioning % gcc -fPIC -shared lib1.c -Wl,-soname,libfoo.so.0 -o abi1/libfoo.so.0.1
flame@yamato versioning % gcc -fPIC -shared lib2.c -Wl,-soname,libfoo.so.0 -Wl,-version-script=lib2.ldver -o abi2/libfoo.so.0.2
flame@yamato versioning % gcc test.c -o test-lib1 abi1/libfoo.so.0.1                              
flame@yamato versioning % gcc test.c -o test-lib2 abi2/libfoo.so.0.2                              
flame@yamato versioning % LD_LIBRARY_PATH=abi1 ./test-lib1       
Using ABI v.1
flame@yamato versioning % LD_LIBRARY_PATH=abi1 ./test-lib2
./test-lib2: abi1/libfoo.so.0: no version information available (required by ./test-lib2)
Using ABI v.1
flame@yamato versioning % LD_LIBRARY_PATH=abi2 ./test-lib1
Using ABI v.1
flame@yamato versioning % LD_LIBRARY_PATH=abi2 ./test-lib2
Using ABI v.2

As you see, the behaviour of the program linked against the original version does not change when moving to the new library. The opposite is true for the program linked against the new version and executed against the old one: it warns about the missing version information and falls back to the old behaviour (that would be forward compatibility of the library, which is rarely guaranteed).

This actually makes it possible to break ABI (but not API) and still be able to fix bugs. It shouldn’t be abused, since it does not always work that well and it does not work on all the systems out there, but it helps to deal with legacy software that needs to be kept backward-compatible and yet requires fixes.

For instance, if you didn’t follow the advice of always keeping alloc/free interfaces symmetrical, and you wanted to move a structure’s allocation from standard malloc() to g_malloc() and then to GSlice, you can do that by using something like this:

/* in the header */

my_struct *my_struct_alloc();
#define my_struct_free(x) g_slice_free(my_struct, x)

/* in the library sources */

my_struct *my_struct_alloc_v0() {
  return malloc(sizeof(my_struct));
}

my_struct *my_struct_alloc_v1() {
  return g_malloc(sizeof(my_struct));
}

my_struct *my_struct_alloc_v2() {
  return g_slice_new(my_struct);
}

__asm__(".symver my_struct_alloc_v0,my_struct_alloc@");
__asm__(".symver my_struct_alloc_v1,my_struct_alloc@VER1");
__asm__(".symver my_struct_alloc_v2,my_struct_alloc@@VER2");

Of course this is just a proof of concept: if you only allocated the structure, two macros would be fine; expect something to be done to the newly allocated structure as well. If versioning weren’t used, software built against the first version, the one using malloc(), would crash as soon as its deallocator was handed a memory area that came from the GSlice allocator. With this method, on the other hand, each program gets the allocator that matches the deallocator it was built with, even in future releases.
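For comparison, this is roughly what the symmetrical alloc/free advice looks like when it is followed from the start. It is a minimal sketch: the single value field and the my_struct_roundtrip helper are assumptions made for illustration, and plain calloc()/free() stand in for whatever allocator the library really uses. Since both functions live inside the library, the allocator can change between releases without any program noticing, and no symbol versioning is needed for that.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct my_struct {
  int value;  /* placeholder field, assumed for illustration */
} my_struct;

/* Both halves of the pair are real functions exported by the library,
 * so the caller never has an allocator choice compiled into it. */
my_struct *my_struct_alloc(void) {
  return calloc(1, sizeof(my_struct));
}

void my_struct_free(my_struct *ptr) {
  free(ptr);  /* free(NULL) is a no-op */
}

/* Usage sketch: allocate and release through the matching pair. */
int my_struct_roundtrip(void) {
  my_struct *s = my_struct_alloc();
  assert(s != NULL);
  s->value = 42;
  int seen = s->value;
  my_struct_free(s);
  return seen;
}
```

The versioned aliases shown earlier are the fallback for when the free side has already escaped into the header as a macro.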
