Reputation: 459
How does a compiler know if something is allocated on the heap or stack, for instance if I made a variable in a function and returned the address of the variable, the compiler warns me that "function returns address of a local variable":
#include <stdio.h>
int* something() {
int z = 21;
return &z;
}
int main() {
int *d = something();
return 0;
}
I understand why this is a warning because when the function exits, the stack frame is no more and if you have a pointer to that memory and you change it's value you will cause a segmentation fault. What I wonder is how the compiler will know if that variable is allocating memory via. malloc, or how it can tell if it's a local variable on the stack?
Upvotes: 4
Views: 1298
Reputation: 133557
A compiler builds a syntax tree from which it is able to analyze each part of the source code.
It builds a symbol table which associates to each symbol defined some information. This is required for many aspects:
Once you have this symbol table it is quite easy to know if you are trying to return the address of a local variable since you end up having a structure like
ReturnStatement
+ UnaryOperator (&)
+ Identifier (z)
So the compiler can easily check if the identifier is a local stack variable or not.
Mind that this information could in theory propagate along assignments but in practice I don't think many compilers do it, for example if you do
int* something() {
int z = 21;
int* pz = &z;
return pz;
}
The warning goes away. With static code flow analysis you could be able to prove that pz
could only refer to a local variable but in practice that doesn't happen.
Upvotes: 3
Reputation: 11406
What I wonder is how the compiler will know if that variable is allocating memory via. malloc, or how it can tell if it's a local variable on the stack?
The compiler has to analyse all the code and generate machine code from it.
When functions need to be called, the compiler has to push the parameters on the stack (or reserve registers for them), update the stack pointer, look if there are local variables, initialize those on the stack too and update the stack pointer again.
So obviously the compiler knows about local variables being pushed on the stack.
Upvotes: 1
Reputation: 1525
The compiler knows what chunk of memory is holding the current stack. Every time a function is called it creates a new stack and moves the previous frame and stack pointers appropriately which effectively give it a beginning and endpoint for the current stack in memory. Checking to see if you're trying to return a pointer to memory that's about to get freed is relatively simple given that setup.
Upvotes: 1
Reputation: 25409
The example in your question is really easy to figure out.
int* something() {
int z = 21;
return &z;
}
return
statement. It takes the address of the identifier z
.z
is declared. Oh, it is a local variable.Not all cases will be as easy as this one and it's likely that you can trick the compiler into giving false positives or negatives if you write sufficiently weird code.
If you're interested in this kind of stuff, you might enjoy watching some of the talks given at CppCon'15 where static analysis of C++ code was a big deal. Some remarkable talks:
Upvotes: 2