Why does malloc initialize the values to 0 in gcc?

SHH · Nov 6, 2011 · Viewed 43.4k times

Maybe it differs from platform to platform, but when I compile the code below with gcc and run it on Ubuntu 11.10, I get 0 every time.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* malloc does not initialize this block; its contents are indeterminate */
    double *a = malloc(sizeof(double) * 100);
    printf("%f", *a);
    free(a);
    return 0;
}

Why does malloc behave like this even though there is calloc?

Doesn't this mean there is an unwanted performance overhead from initializing the values to 0, even when you don't want that?


EDIT: Oh, my previous example was not initializing anything; it just happened to use a "fresh" block.

What I was really asking is why the memory is zeroed when malloc allocates a large block:

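#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *a = malloc(sizeof(int) * 200000);
    a[10] = 3;
    printf("%d\n", a[10]);

    free(a);

    /* a is an int*, so allocate ints again (the original cast the result to double*) */
    a = malloc(sizeof(int) * 200000);
    printf("%d\n", a[10]);   /* reads memory that was never written by this block's owner */

    free(a);
    return 0;
}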

OUTPUT: 3
        0 (initialized)

But thanks for pointing out that there is a SECURITY reason behind this! (I never thought about it.) Of course the memory has to be zeroed when malloc hands out a fresh block or a large block from the OS.

Answer

Mysticial · Nov 6, 2011

Short Answer:

It doesn't. It just happens to be zero in your case.
(Also, your test case doesn't show that the data is zero. It only shows that one element is zero.)


Long Answer:

When you call malloc(), one of two things will happen:

  1. It recycles memory that was previously allocated and freed by the same process.
  2. It requests new page(s) from the operating system.

In the first case, the memory contains data left over from previous allocations, so it generally won't be zero. This is the usual case for small allocations.

In the second case, the memory comes from the OS. This happens when the allocator has no free block large enough to reuse, or when you request a very large allocation (as in your example).

Here's the catch: Memory coming from the OS will be zeroed for security reasons.*

When the OS gives you memory, that memory could have been freed by a different process, so it could contain sensitive information such as a password. To prevent you from reading such data, the OS zeroes it before handing it to you.

*I note that the C standard says nothing about this. This is strictly an OS behavior. So this zeroing may or may not be present on systems where security is not a concern.


To give more of a performance background to this:

As @R. mentions in the comments, this zeroing is why you should always use calloc() instead of malloc() + memset(). calloc() can take advantage of this fact to avoid a separate memset().


On the other hand, this zeroing is sometimes a performance bottleneck. In some numerical applications (such as the out-of-place FFT), you need to allocate a huge chunk of scratch memory, use it to run the algorithm, and then free it.

In these cases, the zeroing is unnecessary and amounts to pure overhead.

The most extreme example I've seen is a 20-second zeroing overhead for a 70-second operation with a 48 GB scratch buffer. (Roughly 30% overhead.) (Granted: the machine did have a lack of memory bandwidth.)

The obvious solution is to simply reuse the memory manually. But that often requires breaking through established interfaces, especially if the allocation happens inside a library routine.