Parameters handling and COWs.

One common problem while fixing copy-on-write pages is finding an array of structures that hold parameter names and their values, usually loaded either from the network or from configuration files. One of the classic declarations of such a structure is:

struct {
    char name[16];
    int value;
} parameters[20] = {
  { "foo", 34 },
  { "bar", 50 }
};

I put a character array in the struct to show that using one does not magically make everything work.

What is wrong with this structure? Well, it is not constant, so it gets written to .data, which is a COW section. You can’t really make it constant either, as the values of the parameters have to change if you’re using it to set the parameters for the program to work with. Luckily, as long as the user does not change the default parameters, the array uses no extra memory, but if the user changes even just one parameter, COW is triggered.

This is actually quite common; it’s even used by Xorg drivers like radeon, often with extra fields such as type flags, unions, and so on.

Most people would react to this by using pointers to strings, so that the strings go to .rodata and just their pointers are added to the parameters object. In the best case, those pointers will end up in .data; when using PIC, they’ll be added to .data.rel. The difference between the two is, again, that the former is only copied when the values are changed from the defaults, while the latter is copied every time the executable or shared library is loaded, because the pointers have to be relocated (unless using prelink, if that works at all actually).

My proposed solution to that is simply to use two different objects. One limitation of C shows up here: being unable to write in columns 😉

static const char param_names[20][16] = { "foo", "bar" };
int param_values[20] = { 34, 50 };
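
As a minimal sketch of how the two objects might be used together, here is a lookup by name followed by a write to the values array. The helpers param_index and param_set, and the PARAM_COUNT and PARAM_NAME_LEN macros, are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define PARAM_COUNT 20
#define PARAM_NAME_LEN 16

/* Read-only metadata: eligible for .rodata, shared between processes. */
static const char param_names[PARAM_COUNT][PARAM_NAME_LEN] = { "foo", "bar" };

/* Mutable values: live in .data, copied only once something writes to them. */
static int param_values[PARAM_COUNT] = { 34, 50 };

/* Hypothetical helper: index of the named parameter, or -1 if unknown. */
static int param_index(const char *name)
{
    for (size_t i = 0; i < PARAM_COUNT; i++) {
        /* Unused slots are zero-filled, so an empty name ends the table. */
        if (param_names[i][0] == '\0')
            break;
        if (strncmp(param_names[i], name, PARAM_NAME_LEN) == 0)
            return (int)i;
    }
    return -1;
}

/* Hypothetical helper: set a parameter by name; 0 on success, -1 if unknown. */
static int param_set(const char *name, int value)
{
    int i = param_index(name);
    if (i < 0)
        return -1;
    param_values[i] = value; /* the only write; it touches .data, not the names */
    return 0;
}
```

Only param_set ever writes, and it writes to param_values alone, so the names stay on shared read-only pages no matter what the user changes.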

If there is more metadata than just the names, one might use a structure instead. Think, for instance, of this example, where the parameter can be either an integer, a float or a string:

enum param_type { PARAM_INT, PARAM_FLOAT, PARAM_STRING };

static const struct {
    char name[16];
    enum param_type type;
} parameters[20] = {
  { "foo", PARAM_INT },
  { "bar", PARAM_FLOAT },
  { "baz", PARAM_STRING }
};

union {
    int intval;
    float floatval;
    char *stringval;
} param_values[20] = {
  { .intval = 34 },
  { .floatval = 50.0 },
  { .stringval = "default" }
};

Now in these last two examples the metadata about the parameters (name and type) can be written to .rodata, while the values are saved in .data; at the same time, except for the last example with the string pointer (which needs a relocation under PIC), the parameter values will be shared as long as the defaults are not changed. Nice, huh?
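
To show how the type stored in the constant table tells the code which union member to read, here is a small sketch. The param_format helper is a hypothetical name made up for this illustration, and the arrays are sized by their initialisers rather than fixed at 20 to keep it short.

```c
#include <stdio.h>

enum param_type { PARAM_INT, PARAM_FLOAT, PARAM_STRING };

/* Constant metadata: name and type only. */
static const struct {
    char name[16];
    enum param_type type;
} parameters[] = {
    { "foo", PARAM_INT },
    { "bar", PARAM_FLOAT },
    { "baz", PARAM_STRING }
};

/* Writable values, kept at the same indexes as the metadata. */
static union {
    int intval;
    float floatval;
    const char *stringval;
} param_values[] = {
    { .intval = 34 },
    { .floatval = 50.0f },
    { .stringval = "default" }
};

/* Hypothetical helper: write "name=value" for parameter i into buf.
 * Returns what snprintf returns, or -1 for an unknown type. */
static int param_format(size_t i, char *buf, size_t len)
{
    switch (parameters[i].type) {
    case PARAM_INT:
        return snprintf(buf, len, "%s=%d",
                        parameters[i].name, param_values[i].intval);
    case PARAM_FLOAT:
        return snprintf(buf, len, "%s=%.1f",
                        parameters[i].name, (double)param_values[i].floatval);
    case PARAM_STRING:
        return snprintf(buf, len, "%s=%s",
                        parameters[i].name, param_values[i].stringval);
    }
    return -1;
}
```

The type never changes at runtime, which is exactly why it belongs with the name in the constant table rather than next to the value.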

The only thing to remember is to keep the indexes of the two arrays in sync. This is why I think it’s a pity that there is no way to write in columns in C; it would make maintaining this code much easier.
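
C has no real column syntax, but the preprocessor can get close. What follows is a sketch of the so-called X-macro idiom, which is not something proposed above, just one possible way to keep both arrays generated from a single table. All the macro names (PARAM_TABLE, PARAM_NAME, PARAM_DEFAULT, PARAM_ENUM) and the PARAM_IDX_ prefix are assumptions made for this example, and _Static_assert requires C11.

```c
/* One row per parameter: X(name, default value). */
#define PARAM_TABLE(X) \
    X(foo, 34)         \
    X(bar, 50)

/* Expand the name column into the read-only array. */
#define PARAM_NAME(name, def) #name,
static const char param_names[][16] = { PARAM_TABLE(PARAM_NAME) };

/* Expand the default column into the writable array. */
#define PARAM_DEFAULT(name, def) def,
static int param_values[] = { PARAM_TABLE(PARAM_DEFAULT) };

/* Expand the names once more into index constants, plus a final count. */
#define PARAM_ENUM(name, def) PARAM_IDX_##name,
enum { PARAM_TABLE(PARAM_ENUM) PARAM_COUNT };

/* Both arrays come from the same rows, so these checks cannot fail
 * unless someone edits one of the arrays by hand. */
_Static_assert(sizeof param_names / sizeof param_names[0] == PARAM_COUNT,
               "param_names out of sync with PARAM_TABLE");
_Static_assert(sizeof param_values / sizeof param_values[0] == PARAM_COUNT,
               "param_values out of sync with PARAM_TABLE");
```

Adding a parameter then means adding one row to PARAM_TABLE; the price is that the macro expansion is harder to read and to debug than two plain arrays.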

When the list of parameters to maintain is quite big, the best way to tackle the problem is to write a script that translates a table of metadata and default values into C code. It should be quite easy to do with tools like Perl, Ruby or AWK. This way there is little to no overhead in maintaining the two arrays separately, and the memory use will be optimised.

As for the reason why I used a union in the last example: I have seen more than once structures defined like that one, but carrying all three kinds of value at once. Those structures tend to waste a lot of space. Once you’re certain that only one of the three entries is used for a given parameter, using a union also limits the amount of memory the table uses. This is one thing unions were designed for, after all.
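
To make the space difference concrete, here is a small sketch comparing the two layouts. The type names param_value_wide and param_value_packed are invented for this illustration. The exact sizes depend on the ABI, so the code only computes them rather than assuming numbers.

```c
#include <stddef.h>

/* All three kinds carried at once: the size is at least the sum of the members. */
struct param_value_wide {
    int intval;
    float floatval;
    char *stringval;
};

/* Only one kind is live at a time: the size is that of the largest member. */
union param_value_packed {
    int intval;
    float floatval;
    char *stringval;
};

enum { PARAM_COUNT = 20 };

/* Size of a whole table of 20 entries in each layout. */
static const size_t wide_table_size =
    sizeof(struct param_value_wide[PARAM_COUNT]);
static const size_t packed_table_size =
    sizeof(union param_value_packed[PARAM_COUNT]);
```

On a typical 64-bit Linux system I would expect 16 bytes per entry for the struct and 8 for the union, so the table shrinks by half, but check with sizeof on your own target.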