This Time Self-Hosted
dark mode light mode Search

Stating that something is harmful, sometimes is harmful

Pierre, albeit it certainly is annoying to find silly vulnerabilities in software, and trust me I seen very shitty code since I joined Gentoo, and no I’m not referring to Brian’s obfuscated Python 😉 (sorry Brian, you know I trust your code, but I can’t read it!), stating that strcpy() is always wrong is not a good idea.

As like any other function, it has its uses, limited, but there are. Of course functions like strcpy() are commonly misused, but then it’s a mistake on the user part, rather than on the existence of the function itself. For instance the following code is safe, and the use of strncpy() would make it slower:

struct token_s {
  char token_string[10];
};

void set_token_string(struct token_s *tok, int idx) {
  static const char deftokens[][] = {
    "foo", "bar, "ciao"
  };
  if ( idx >= sizeof(deftokens) / sizeof(deftokens[0]) ) {
    tok->token_string[0] = '';
    return;
  }
  strcpy(tok->token_string, deftokens[idx]);
}Code language: C++ (cpp)

Of course, beside the code being totally abstract and not part of an actual solution, attention has to be paid when you change token_s or when you change deftokens, but then, strcpy is of safe use as long as you make sure that the token structure has enough space to accept all the tokens that could be inserted. And no, it’s not always possible to use strdup, when you are crafting a padded structure that has to be saved to file or sent via network, you most likely want the string inline, and with a fixed size, most likely zeroed out for the remaining part. The token example can easily be applied to a protocol, too.

Why using strncpy in this situation is not good? Well the answer is in the way it is usually implemented: the n-variant checks both the current character not to be NULL ('\0') and that the index is not higher than the provided parameter; two checks where one might be good enough, repeated on a long string, might be quite a waste of performance.

Let’s ignore the problems with strncpy versus strlcpy, I know enough people unsure whether to use the size of the destination buffer versus the size minus one.

And what about the following code, that can easily be written if you tell people not to use strcpy, and use strncpy instead, or strndup rather than strdup:

size_t size = strlen(str);
char *dst = malloc(size);
strncpy(dst, str, size);Code language: C++ (cpp)

What is wrong in that? well, if str is an user-supplied string, and you didn’t handle it correctly, it might not be NULL-terminated, so strlen might return the wrong value, then you’d be calling malloc with an improper value.

When does the difference between strcpy and strncpy become important? Well, strcpy is fine when you are using it on static strings, or strings that you know are properly NULL-terminated, either because you loaded them carefully or because you created them yourself.

For instance, when you load for instance a 256 bytes buffer, you can either always use the n-variant of the functions (which executes two tests for every iteration), or you can make the buffer 257 bytes long, and put a safe '\0' at the end of it. Then you don’t really need to worry about the n-variants: even if there is no NULL in the buffer, there will be one at the end to protect you from going over the size. If you are to suppose that the string you’re loading, for instance from a binary file, is at maximum 255 bytes long and terminated by NULL, you don’t even need to make the buffer 257 bytes long, you just need to make sure that str[255] is set to '\0' by forcing it: if it was already no problem, if the string was shorter, no problem, if the string was longer and that wasn’t allowed, you covered your ass.

On the other hand, strncpy has to be used when you are not sure about the size of the source string, as that is where it is totally unsafe to use strcpy. Other n-variants of string functions can be used more for performance rather than safety, for instance, strnlen (which I doubted the use of an year and a half ago) can be used if you need to get the first part of a string, that might be quite long; it’s something like the following code:

size_t length = strlen(foo);
if ( length > MAXSIZE ) length = MAXSIZE;Code language: C++ (cpp)

Although this might be faster than using strnlen for a short string (just one check during iteration, and one afterward), if MAXSIZE is for instance 10, and the length of the string can be 50K, having two checks will stop the strnlen call after 10 iterations, rather than having to run through 50K iterations to get the full length of the string.

Anyway, the bottomline of this post is that there is no perfect function, every function has to be used in the proper context and considering its limitation and strengths. For sure strcpy shouldn’t be just killed off because it can easily be misused and add security issues to a software, we just need to make sure that the important software gets audited, be it by professionals, distributors, or simply power users. In the hands of a good programmer, strcpy can even be a noticeable performance improvement compared to strncpy; of course in the hand of a poor programmer, or used in a poorly designed software, it can be a risk. If the software is xine, it will be a risk, considering the way it’s ““designed””.

Comments 1
  1. Actually I could not agree more with that :)Of course using strcpy when you have the complete control on the arguments is a good idea. What I wanted to say is strcpy is quite always misused, so unless you really know what you do, just use strncpy.One could argue that someone who develops in C should always know what he’s doing, but that’s another debate.Oh and for the record, my first name is Pierre-Yves, not just Pierre 😉

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.