8 Sep 2002 johnm   » (Journeyer)

My fascinating question for today: What should malloc(0) return?

Composing a critique of somebody else's crappy half-baked stdlib implementation today caused me to take another look at my own crappy half-baked stdlib implementation. In part, my critique puts the memory allocation functions malloc(), free(), calloc(), and realloc() under the microscope. One of the four is easy: calloc() is really just malloc() in disguise (the only wrinkle is remembering to call memset() afterwards, which the somebody else in question forgot :-( ). The other three, however, engage in an interesting dance.
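To make the calloc() wrinkle concrete, here is a minimal sketch of a calloc() layered on malloc(). The name my_calloc is hypothetical and does not come from either library under discussion, and the overflow check on nmemb * size is an extra precaution that the text above does not mention.

```c
#include <assert.h>  /* assert */
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memset */

/* Hypothetical name: a calloc() look-alike built on malloc(). */
void *my_calloc(size_t nmemb, size_t size)
{
    /* Extra precaution (an assumption, not from the post): refuse
       requests whose total byte count would overflow size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;
    assert(size == 0 || total / size == nmemb);  /* product did not wrap */

    void *p = malloc(total);

    /* The easily forgotten step: calloc() must hand back zeroed memory. */
    if (p != NULL)
        memset(p, 0, total);
    return p;
}
```

Whatever malloc() does for a size of zero, this sketch simply passes it along: my_calloc(0, n) returns whatever malloc(0) returns.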

The C99 standard has the following to say about malloc(0) and friends, and the text in C90 is similar:

If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

I've always preferred the latter behaviour for malloc(): it avoids overloading the meaning of a NULL return, so, especially if your size was a variable rather than a constant, you can say "if malloc returned NULL, then abort due to out-of-memory" without having to fudge about and check that it wasn't really a returned-NULL-due-to-zero-size situation. (Of course, you could say that people who actually call malloc() with a size of zero deserve what they get!)
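The "fudge" in question might look something like the following sketch of an allocate-or-die helper. The name xmalloc is a common convention but is hypothetical here; the point is only that, when malloc(0) is allowed to return NULL, the size has to be consulted before declaring out-of-memory.

```c
#include <assert.h>  /* assert */
#include <stdio.h>   /* fputs, stderr */
#include <stdlib.h>  /* malloc, abort */

/* Hypothetical helper: allocate or die. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    /* If malloc(0) may return NULL, a null result alone does not prove
       exhaustion, so the size has to be checked as well. */
    if (p == NULL && size != 0) {
        fputs("out of memory\n", stderr);
        abort();
    }

    /* Postcondition: NULL can only mean a zero-sized request. */
    assert(p != NULL || size == 0);
    return p;
}
```

With a malloc() that never returns NULL for a zero size, the `size != 0` test could be dropped and the check reduces to `p == NULL`.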

That (the latter) is also the behaviour required of a C++ allocation function. So if you're writing operator new() in terms of malloc(), you have to know that your malloc() has the same preference as I do, or take extra care instead of just calling malloc().

Now let's look at some of the other requirements that C90 and C99 place on these functions:

   free (NULL)    = ({})
realloc (NULL, n) = malloc (n)
realloc (p, 0)    = ({ free (p); return NULL; })    when p != NULL

The first is just saying that free(NULL) does nothing, which is, of course, very convenient -- you get to avoid some tedious checking. The other two are parts of the definition of realloc(). They are spelt out in the C90 text; the C99 text pertaining to the last one is quite different, but the rule can still be derived from it.

Now, the "when p != NULL" condition on the last rule is not very pleasant: it makes for an ugly algebraic rule. Furthermore, I would contend that programmers who actually write code that's literally "realloc (p, 0)", with p variable but a constant 0, expect realloc() to Do The Right Thing even when p is NULL, just as free() does. Thus I would contend that programmers actually expect that they are using a stronger set of rules, in which the third rule holds regardless of the value of p. (In particular, this is what the Linux man page says realloc() does; yes, the man page is way stronger here than the C Standard!)

In that case, we can calculate the value of malloc(0):

malloc (0) = realloc (NULL, 0)
           = ({ free (NULL); return NULL; })
           = NULL

So that means that the choice is really between:

  • either a realloc() that Does The Right Thing for size=0
  • or a malloc() that does what I've always preferred for size=0

You can't have both! So I've changed my mind: being able to write realloc(p, 0) and have it Just Work, like free(p) does, is more important to me than having malloc(0), which I never actually use, be easily distinguishable from memory exhaustion.
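The stronger set of rules can be sketched as thin wrappers over the C library's own functions. The names strong_realloc and strong_malloc are hypothetical, and this is only an illustration of the algebra, not the code of any particular library.

```c
#include <assert.h>  /* assert */
#include <stdlib.h>  /* realloc, free */

/* Hypothetical wrapper: realloc (p, 0) = ({ free (p); return NULL; })
   for every p, including p == NULL. */
void *strong_realloc(void *p, size_t size)
{
    if (size == 0) {
        free(p);              /* free (NULL) does nothing, so NULL is fine */
        return NULL;
    }
    return realloc(p, size);  /* realloc (NULL, n) behaves as malloc (n) */
}

/* malloc (n) defined as realloc (NULL, n), so strong_malloc (0) is NULL. */
void *strong_malloc(size_t size)
{
    void *p = strong_realloc(NULL, size);
    assert(size != 0 || p == NULL);  /* the calculated value of malloc (0) */
    return p;
}
```

Under these wrappers, a NULL from strong_malloc(n) signals memory exhaustion only when n is nonzero, which is exactly the trade-off described above.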

In retrospect, this was all obvious. The specification of implementation-defined behaviour that I quoted at the beginning applies to both malloc() and realloc(). Wanting realloc(p, 0) to be useful in cleanup functions (by just freeing p, and not also allocating some dummy "zero-sized" object that the caller will ignore and that will become garbage) means that you've already made the choice: realloc() given a size of zero will return a null pointer. Either that means you've already made the choice for malloc() too, or you've decided to make the choice differently for the two functions, which would be really ugly and which I'm not even sure the Standard allows.

But playing with the algebra was fun while it lasted!

(Looks like I'm going to have to put that "malloc (size? size : 1)" stuff back in my library's default C++ allocation functions.)
