My fascinating question for today: What should malloc(0) return?
Composing a critique of somebody else's crappy half-baked stdlib implementation caused me to take another look at my own crappy half-baked stdlib implementation today. In part, my critique puts the memory allocation functions malloc(), free(), calloc(), and realloc() under the microscope. One of the four is easy: calloc() is really just malloc() in disguise, and the only wrinkle is to remember to call memset() afterwards (which the somebody else in question forgot :-( ). The other three, however, engage in an interesting dance.
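To make the "malloc() in disguise" remark concrete, here is a minimal sketch of calloc() layered on malloc(). The name my_calloc is made up for illustration, and the overflow check on the multiplication is my own addition rather than something discussed above; treat it as one plausible shape, not as anybody's actual library code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: a sketch of calloc() written in terms of malloc(). */
void *my_calloc(size_t nmemb, size_t size)
{
    /* Assumption (not from the text above): refuse requests whose
       total size would overflow size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);    /* the step that is easy to forget */
    return p;
}
```

Note that for a zero total this simply inherits whatever malloc(0) does, which is the question the rest of this post worries about.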
The C99 standard has the following to say about malloc(0) and friends, and the text in C90 is similar:
If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
I've always preferred the latter behaviour for malloc(): it avoids overloading the meaning of a NULL return, so, especially if your size was a variable rather than a constant, you can say "if malloc returned NULL, then abort due to out-of-memory" without having to fudge about and check that it wasn't really a returned-NULL-due-to-zero-size situation. (Of course, you could say that people who actually call malloc() with a size of zero deserve what they get!)
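As an illustration of the fudging I mean, here is a rough sketch of the usual abort-on-failure wrapper. The name xmalloc is a common convention but hypothetical here; the point is only the extra "size != 0" test, which is needed when malloc(0) is allowed to return a null pointer and is dead weight when it is not.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: abort the program on out-of-memory. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    /* If malloc(0) may return NULL, a NULL result is ambiguous, so a
       zero-sized request has to be excluded before declaring failure. */
    if (p == NULL && size != 0) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}
```

With a malloc() that never returns NULL for a zero size, the condition collapses to a plain "p == NULL".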
That (the latter) is also the behaviour required of a C++ allocation function. So if you're writing operator new() in terms of malloc(), you have to know that your malloc() has the same preference as I do, or take extra care instead of just calling malloc().
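One common way of taking that extra care, sketched here in C rather than as a real operator new(), is to bump a zero size up to one before calling malloc(). The helper name alloc_nonzero is made up, and a real C++ allocation function would also have to deal with failure reporting, which this sketch ignores.

```c
#include <stdlib.h>

/* Hypothetical helper: never hands a zero size to malloc(), so a NULL
   return can only mean that the allocation failed. */
void *alloc_nonzero(size_t size)
{
    return malloc(size ? size : 1);
}
```

The cost is that every "empty" allocation occupies at least one byte plus whatever bookkeeping the allocator adds.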
Now let's look at some of the other requirements that C90 and C99 place on these functions:
    free (NULL)       = ({})
    realloc (NULL, n) = malloc (n)
    realloc (p, 0)    = ({ free (p); return NULL; })    when p != NULL
The first is just saying that free(NULL) does nothing, which is, of course, very convenient -- you get to avoid some tedious checking. The other two are parts of the definition of realloc(). They are spelt out in the C90 text; the C99 text as pertains to the last one is quite different, but it can still be derived from the text.
Now, the "when p != NULL" condition on the last rule is not very pleasant: it makes for an ugly algebraic rule. Furthermore, I would contend that programmers who actually write code that's literally "realloc (p, 0)", with a variable p but a constant 0, expect realloc() to Do The Right Thing even when p is NULL, just as free() does. In other words, programmers expect to be working with a stronger set of rules, in which the third rule holds regardless of the value of p. (In particular, this is what the Linux man page says realloc() does; yes, the man page is way stronger here than the C Standard!)
In that case, we can calculate the value of malloc(0):
malloc (0) = realloc (NULL, 0) = ({ free (NULL); return NULL; }) = NULL
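The same calculation can be written out as code. This is only a sketch: strong_realloc and strong_malloc are hypothetical names for wrappers that impose the stronger rule set on top of whatever the underlying library does.

```c
#include <stdlib.h>

/* Hypothetical wrapper: realloc() under the stronger rules, where a
   size of zero always means "free and return NULL". */
void *strong_realloc(void *p, size_t n)
{
    if (n == 0) {
        free(p);            /* free(NULL) does nothing, so p may be NULL */
        return NULL;
    }
    return realloc(p, n);   /* realloc(NULL, n) behaves like malloc(n) */
}

/* malloc(n) defined as realloc(NULL, n), as in the second rule. */
void *strong_malloc(size_t n)
{
    return strong_realloc(NULL, n);
}
```

Once malloc() is defined through realloc() like this, strong_malloc(0) has no option but to return NULL.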
So that means that the choice is really between:
- either a realloc() that Does The Right Thing for size=0
- or a malloc() that does what I've always preferred for size=0
In retrospect, this was all obvious. The specification of implementation-defined behaviour that I quoted at the beginning applies to both malloc() and realloc(). Wanting realloc(p, 0) to be useful in cleanup functions (by just freeing through p, and not also allocating some dummy "zero-sized" object that the caller will ignore and that will become garbage) means that you've already made the choice: realloc() given a size of zero will return a null pointer. Either that means you've already made the choice for malloc() too, or you've decided to make the choice differently for the two functions, which would be really ugly, and which I'm not even sure the Standard allows.
But playing with the algebra was fun while it lasted!
(Looks like I'm going to have to put that "malloc (size? size : 1)" stuff back in my library's default C++ allocation functions.)