Reputation: 1039
malloc is not guaranteed to return 0'ed memory. The conventional wisdom is not only that, but that the contents of the memory malloc returns are actually non-deterministic, e.g. openssl used them for extra randomness.
However, as far as I know, malloc is built on top of brk/sbrk, which do "return" 0'ed memory. I can see why the contents of what malloc returns may be non-0, e.g. from previously free'd memory, but why would they be non-deterministic in "normal" single-threaded software?
Edit Several people answered explaining why the memory can be non-0, which I already explained in the question above. What I'm asking is why the program using the contents of what malloc returns may be non-deterministic, i.e. why it could have different behavior every time it's run (assuming the same binary and libraries). Non-deterministic behavior is not implied by non-0's. To put it differently: why it could have different contents every time the binary is run.
Upvotes: 9
Views: 1699
Reputation: 78953
I think that the assumption that it is non-deterministic is plain wrong, particularly as you ask for a non-threaded context. (In a threaded context due to scheduling alea you could have some non-determinism).
Just try it out. Create a sequential, deterministic application that
run this program twice and register the result in two different files. My idea is that these files will be identical.
Upvotes: 2
Reputation: 1854
There is an whole ecosystem of programs living inside a computer memmory and you cannot control the order in which mallocs and frees are happening.
Imagine that the first time you run your application and malloc() something, it gives you an address with some garbage. Then your program shuts down, your OS marks that area as free. Another program takes it with another malloc(), writes a lot of stuff and then leaves. You run your program again, it might happen that malloc() gives you the same address, but now there's different garbage there, that the previous program might have written.
I don't actually know the implementation of malloc() in any system and I don't know if it implements any kind of security measure (like randomizing the returned address), but I don't think so.
It is very deterministic.
Upvotes: -1
Reputation: 715
The simplest way I can think of putting the answer is like this:
If I am looking for wall space to paint a mural, I don't care whether it is white or covered with old graffiti, since I'm going to prime it and paint over it. I only care whether I have enough square footage to accommodate the picture, and I care that I'm not painting over an area that belongs to someone else.
That is how malloc thinks. Zeroing memory every time a process ends would be wasted computational effort. It would be like re-priming the wall every time you finish painting.
Upvotes: 0
Reputation: 210725
Malloc does not guarantee unpredictability... it just doesn't guarantee predictability.
E.g. Consider that
return 0;
Is a valid implementation of malloc.
Upvotes: 11
Reputation: 70206
Memory returned by malloc
is not zeroed (or rather, is not guaranteed to be zeroed) because it does not need to. There is no security risk in reusing uninitialized memory pulled from your own process' address space or page pool. You already know it's there, and you already know the contents. There is also no issue with the contents in a practical sense, because you're going to overwrite it anyway.
Incidentially, the memory returned by malloc
is zeroed upon first allocation, because an operating system kernel cannot afford the risk of giving one process data that another process owned previously. Therefore, when the OS faults in a new page, it only ever provides one that has been zeroed. However, this is totally unrelated to malloc
.
(Slightly off-topic: The Debian security thing you mentioned had a few more implications than using uninitialized memory for randomness. A packager who was not familiar with the inner workings of the code and did not know the precise implications patched out a couple of places that Valgrind had reported, presumably with good intent but to desastrous effect. Among these was the "random from uninitilized memory", but it was by far not the most severe one.)
Upvotes: 3
Reputation: 373052
The initial values of memory returned by malloc
are unspecified, which means that the specifications of the C and C++ languages put no restrictions on what values can be handed back. This makes the language easier to implement on a variety of platforms. While it might be true that in Linux malloc
is implemented with brk
and sbrk
and the memory should be zeroed (I'm not even sure that this is necessarily true, by the way), on other platforms, perhaps an embedded platform, there's no reason that this would have to be the case. For example, an embedded device might not want to zero the memory, since doing so costs CPU cycles and thus power and time. Also, in the interest of efficiency, for example, the memory allocator could recycle blocks that had previously been freed without zeroing them out first. This means that even if the memory from the OS is initially zeroed out, the memory from malloc
needn't be.
The conventional wisdom that the values are nondeterministic is probably a good one because it forces you to realize that any memory you get back might have garbage data in it that could crash your program. That said, you should not assume that the values are truly random. You should, however, realize that the values handed back are not magically going to be what you want. You are responsible for setting them up correctly. Assuming the values are truly random is a Really Bad Idea, since there is nothing at all to suggest that they would be.
If you want memory that is guaranteed to be zeroed out, use calloc
instead.
Hope this helps!
Upvotes: 4
Reputation: 5314
There is no guarantee that brk
/sbrk
return 0ed-out data; this is an implementation detail. It is generally a good idea for an OS to do that to reduce the possibility that sensitive information from one process finds its way into another process, but nothing in the specification says that it will be the case.
Also, the fact that malloc
is implemented on top of brk
/sbrk
is also implementation-dependent, and can even vary based on the size of the allocation; for example, large allocations on Linux have traditionally used mmap
on /dev/zero instead.
Basically, you can neither rely on malloc()
ed regions containing garbage nor on it being all-0, and no program should assume one way or the other about it.
Upvotes: 1
Reputation: 726929
malloc
is defined on many systems that can be programmed in C/C++, including many non-UNIX systems, and many systems that lack operating system altogether. Requiring malloc
to zero out the memory goes against C's philosophy of saving CPU as much as possible.
The standard provides a zeroing cal calloc
that can be used if you need to zero out the memory. But in cases when you are planning to initialize the memory yourself as soon as you get it, the CPU cycles spent making sure the block is zeroed out are a waste; C standard aims to avoid this waste as much as possible, often at the expense of predictability.
Upvotes: 3
Reputation: 18974
Even single-threaded code may do malloc then free then malloc and get back previously used, non-zero memory.
Upvotes: 1
Reputation: 375854
Even in "normal" single-threaded programs, memory is freed and reallocated many times. Malloc will return to you memory that you had used before.
Upvotes: 1