Background

Anyone who has ever programmed for UNIX in C has probably used strerror or strerror_r. These functions take a numerical UNIX error code and translate it into a string which can be printed out.

strerror seems like a pretty simple function, but there is a major gotcha: it isn't thread-safe. The obvious thing to do is to use strerror_r, the re-entrant version of the function mandated by POSIX.

The Problem with strerror_r

However, strerror_r has a different, non-standard prototype in glibc than it has in POSIX. The POSIX prototype is:

int strerror_r(int errnum, char *buf, size_t buflen);

whereas the GNU prototype is:

char *strerror_r(int errnum, char *buf, size_t buflen);

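Which one you get on Linux depends on the glibc version and on the feature test macros in effect, and you can end up with the non-POSIX definition even if you never define _GNU_SOURCE yourself. g++, for example, defines it for you.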

Ok, you say, no big problem. I don't really expect strerror_r to fail, so I should be able to get by with code like this:

strerror_r(errnum, buf, sizeof(buf)); // do NOT do this!
printf("foobar failed: %s\n", buf);

But, as the comment indicates, you must not do this. The GNU function doesn't usually modify the provided buffer; it only modifies it sometimes. If you end up with the GNU version, you must use the return value of the function as the string to print.

The Implementation

To figure out why the GNU version's behavior is so strange, we have to take a peek at the implementation.

UNIX errors are small positive integers starting at 1. The obvious thing to do is to use the error number as the index into a static array of strings, and simply return the matching entry. This implementation is simple, fast, and thread-safe.

So why is strerror not thread-safe, then? Well, in glibc at least, if you pass in an unknown error number, you'll get back a message like this:

Unknown error 3542

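These error messages clearly can't be stored in an array. They have to be generated dynamically. In strerror, that is done by using a small per-process buffer shared between all threads, which is why two threads calling it at the same time can overwrite each other's message. In strerror_r, that is done by using the user-supplied buffer, which is why the GNU version only writes to your buffer when the error number is unknown.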
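
To make the behavior concrete, here is a toy model of a GNU-style strerror_r. It is not glibc's actual code; the table toy_errlist, the count toy_nerr, and the function toy_strerror_r are invented for illustration, and the table holds only three entries.

```c
#include <stdio.h>   /* snprintf */
#include <stddef.h>  /* size_t */

/* Toy table for illustration only; real systems have many more entries. */
static const char *const toy_errlist[] = {
    "Success",
    "Operation not permitted",
    "No such file or directory",
};
static const int toy_nerr = sizeof(toy_errlist) / sizeof(toy_errlist[0]);

/* GNU-style behavior: known codes return a static string and leave buf
 * alone; unknown codes are formatted into the caller's buffer. */
char *toy_strerror_r(int errnum, char *buf, size_t buflen)
{
    if (errnum >= 0 && errnum < toy_nerr)
        return (char *)toy_errlist[errnum];  /* buf is never written */
    snprintf(buf, buflen, "Unknown error %d", errnum);
    return buf;
}
```

With this model, toy_strerror_r(2, buf, sizeof(buf)) returns a pointer into the static table and leaves buf exactly as it was, while toy_strerror_r(3542, buf, sizeof(buf)) returns buf itself. Code that prints buf instead of the return value only works in the second case.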

A Solution

So how do we write strerror_r_improved, the function which does all of this in a cross-platform way? My version would be something like this:

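#include <stdio.h>   /* snprintf; many systems also declare sys_nerr and sys_errlist here */

void strerror_r_improved(int err, char *str, size_t str_len)
{
    /* Check both ends of the range, and guard against gaps (NULL entries) in the table. */
    if (err >= 0 && err < sys_nerr && sys_errlist[err] != NULL)
        snprintf(str, str_len, "%s", sys_errlist[err]);
    else
        snprintf(str, str_len, "Unknown error %d", err);
}

This works on every UNIX that I know of that still provides sys_nerr and sys_errlist; be aware that glibc 2.32 and later no longer expose them to newly built programs. It exploits an older API that gives you direct access to the static strings which strerror sometimes hands out. As you might guess, sys_nerr is the size of the array, and sys_errlist is the array itself. The extra checks keep a negative error number or an empty slot in the table from causing an invalid read. In my opinion, this is a case where the old, simple API beats the new, over-complicated one hands down.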

There are other, more complex solutions involving #ifdefs, but they seem more brittle to me, and also, I don't like "skid marks" in my code.

Hopefully this helps someone. Good luck, and avoid the pitfalls of error handling.