Reputation: 796

can't understand the output of the simple c code about function call in linux

I wrote some simple code while trying to understand function calls, but I can't understand its output.

#include <stdio.h>

int* foo(int n)
{
    int *p = &n;
    return p;
}

int f(int m)
{
    int n = 1;
    return 999;
}

int main(int argc, char *argv[])
{
    int num = 1;
    int *p = foo(num);
    int q = f(999);
    printf("[%d]\n[%d]\n", *p, q);
    /* printf("[%d]\n", *q); */
}

Output:

[999]
[999]

Why is *p 999?

Then I modified my code as follows:

#include <stdio.h>

int* foo(int n)
{
    int *p = &n;
    return p;
}

int f()
{
    int n = 1;
    return 999;
}

int main(int argc, char *argv[])
{
    int num = 1;
    int *p = foo(num);
    int q = f();
    printf("[%d]\n[%d]\n", *p, q);
    /* printf("[%d]\n", *q); */
}

Output:

[1]
[999]

Why is *p 1 here? I'm on Linux using gcc, and Clang gives the same output.

Upvotes: 3

Answers (6)

sr01853

Reputation: 6121

This undefined behaviour comes from the way the stack is reused between function calls.

int *p = foo(num);
int q = f(999);

In the first case, when foo takes &n, it gets the address of the stack slot where the parameter n (a copy of num) is stored. Then foo(num) completes its execution and f(999) comes into action with the parameter 999. Since the same stack is used, the slot that held n now holds the parameter 999. The stack is contiguous, so consecutive calls from main reuse the same region.

This is the reason both lines print 999. Both actually try to print the contents of the same location on the stack.

In the second case, no parameter is passed to f(), so that slot is not overwritten and the old value 1 is still there. This is still undefined behaviour; it only happens to print the value you expected.

Upvotes: 0

junix

Reputation: 3211

Aside from the fact that your code provokes undefined behaviour by returning a pointer to a stack variable, you asked why the behaviour changes when you change the signature of f().

The reason why

The reason lies in the way the compiler builds the stack frames for the functions. Assume the compiler builds the stack frame for foo() as follows:

Address Contents  
0x199   local variable p
0x200   Saved register A that gets overwritten in this function
0x201   parameter n
0x202   return value
0x203   return address

And for f(int m) the stack looks quite similar:

Address Contents  
0x199   local variable n
0x200   Saved register A that gets overwritten in this function
0x201   parameter m
0x202   return value
0x203   return address

Now, what happens if you return a pointer to 'n' in foo? The resulting pointer will be 0x201. After foo returns, the top of the stack is at 0x204. The memory remains unchanged and you can still read the value '1'. This works until you call another function (in your case 'f'). After calling f, the location 0x201 is overwritten with the value of parameter m.

If you access this location (and you do with your printf statement) it reads '999'. If you had copied the value of this location before invoking f() you would have found the value '1'.

In your second variant f() takes no parameters, so, sticking to our example, its stack frame would look like this:

Address Contents  
0x200   local variable n
0x201   Saved register A that gets overwritten in this function
0x202   return value
0x203   return address

As you are initializing the local variable with '1' you can read '1' at location 0x200 after invoking f(). If you now read the value from location 0x201 you'll get the contents of a saved register.

Some further statements

It is crucial to understand that the above explanation only illustrates the mechanism behind what you observe.
The real behaviour depends on the toolchain you are using and on the so-called calling conventions.
One can easily imagine that it is sometimes hard to predict what will happen. The situation is quite similar to accessing memory after freeing it. That's why what happens is in general unpredictable.
This behaviour can even change with the optimization level. For example, I can imagine that if you turn on -O3 the observation will be different, because the unused variable n will no longer appear in the binary.
Once you understand the mechanism, it should be clear why write accesses through the address returned from foo could lead to serious problems.

For the brave trying to prove this explanation through experiments

First of all, it's important to see that the above explanation does not rely on a real stack frame layout. I only introduced the layout to have an illustration that is easy to understand.

If you want to test the behaviour on your own machine, I suggest you take your favourite debugger and look at the addresses where the local variables and the parameters are placed to see what really happens. Keep in mind: changing the signature of f changes the information placed on the stack. So the only real "portable" test is to change the argument passed to f() and observe the value that p points to.

In the case of calling f(void) the information put on the stack differs massively and the value written at the position p is pointing to does not necessarily depend on the parameters or locals anymore. It can also depend on stack variables from the main-function.

On my machine, for example, reproducing this showed that the '1' you read in the second variant comes from a saved register: the register that was used to store '1' into "num" also seems to be used for loading n.

I hope this gives you some insight. Leave a comment if you have further questions. (I know this is somewhat weird to understand)

Upvotes: 4

alok

Reputation: 1340

In your first sample, when you do

int num = 1;
int *p = foo(num);

where foo() is

int* foo(int n)
{
    int *p = &n;
    return p;
}

When the variable num from main() is passed, it is passed by value to foo. In other words, a copy of the variable num, called n, is created on the stack. Both num and n have the same value, but they are different variables and therefore have different addresses.

When you return p from foo(), main() gets an address that is different from the address of num declared in main().

The same explanation applies to your modified program.

Let's look at another example to clarify:

int i = 2;

int * foo()
{
return &i;
}

int main() {

i = 1;
int *p = foo();
return 0;

}

In this case, i is a global variable with static storage (not on the stack), and the same i is referred to in both main() and foo(). Same address and same value.

Let's look at a third example:

int i = 2;

int * foo(int i)
{
return &i;
}

int main() {

int i = 1;
int *p = foo(i);
return 0;

}

Here, even though there is a global i, it is hidden by the local variable i in main(), and that is what gets passed to foo(). So &i returned from foo, i.e. the value of p in main(), will be different from the address of the variable i declared in main().

Hope this clarifies variable scope and passing by value.

Upvotes: 0

king_nak

Reputation: 11513

It's not easy without the assembler output, but this is my guess:

Locals and parameters are stored on the stack. So when calling foo, it will return the address of the first parameter, which is on the stack.

In the first example, you pass a parameter to your second function, which will also be pushed on the stack, exactly where p points to. Therefore it overwrites the value of *p.

In the second example, the second call does not write to that stack location. The old value (the copy of num) remains there.

Upvotes: 0

Dariusz

Reputation: 22271

A local variable, like n in your code here:

int* foo(int n)
{
    int *p = &n;
    return p;
}

"Disappears" as soon as the foo function finishes.

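You cannot use it, because accessing that variable might give you unpredictable results. You can write something like this, though:

#include <stdio.h>

int* foo(int* n)
{
    *n = 999;   /* writes through the pointer into main's num */
    return n;   /* n still points at num, which outlives this call */
}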

int main(int argc, char *argv[])
{
    int num = 1;
    int *p = foo(&num);
    printf("[%d]\n", *p);
}

because your variable num still exists at the point of printing.

Upvotes: 2

James M

Reputation: 16718

You're invoking undefined behaviour. You can't return the address of a local variable (in this case, the argument int n) and expect it to be useful later.

Upvotes: 2
