Reputation: 2190
I have a small program, written in C, echo():
/* Read input line and write it back */
void echo() {
    char buf[8]; /* Way too small! */
    gets(buf);
    puts(buf);
}
The corresponding assembly code:
1 echo:
2 pushl %ebp //Save %ebp on stack
3 movl %esp, %ebp
4 pushl %ebx //Save %ebx
5 subl $20, %esp //Allocate 20 bytes on stack
6 leal -12(%ebp), %ebx //Compute buf as %ebp-12
7 movl %ebx, (%esp) //Store buf at top of stack
8 call gets //Call gets
9 movl %ebx, (%esp) //Store buf at top of stack
10 call puts //Call puts
11 addl $20, %esp //Deallocate stack space
12 popl %ebx //Restore %ebx
13 popl %ebp //Restore %ebp
14 ret //Return
I have a few questions.
1. Why does the code allocate 20 bytes on the stack? buf is only 8 bytes, so why the extra 12?
2. The return address is right above where we pushed %ebp, right? (Assuming we draw the stack upside down, so that it grows downward.) What is the purpose of the old %ebp (which the current %ebp points at, as a result of line 3)?
3. If I want to change the return address (by inputting anything more than 12 bytes), it would change where echo() returns to. What is the consequence of changing the old %ebp (i.e. the 4 bytes before the return address)? Is it possible to change the return address, or where echo() returns to, just by changing the old %ebp?
4. What is the purpose of %ebp? I know it's the frame pointer, but what is that?
5. Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? For example, if we declare buf[8] but the compiler stores it at -16(%ebp) instead of -12(%ebp) on line 6?
* C code and assembly copied from Computer Systems: A Programmer's Perspective, 2nd ed.
** Using gets() because I am doing buffer overflows.
Upvotes: 4
Views: 637
Reputation: 25855
The reason 20 bytes are allocated is stack alignment. GCC 4.5+ generates code that ensures that the callee's local stack space is aligned to a 16-byte boundary, so that compiled code can do aligned SSE loads and stores on the stack in a well-defined manner. For that reason, the compiler in this case needs to throw away some stack space in order to ensure that gets/puts get a properly aligned frame.
In essence, this is how the stack will look, where each line is a 4-byte word, except for the --- lines, which denote 16-byte address boundaries:
...
Saved EIP from caller
Saved EBP
---
Saved EBX # This is where echo's frame starts
buf
buf
Unused
---
Unused
Parameter to gets/puts
Saved EIP
Saved EBP
---
... # This is where gets'/puts' frame starts
As you can hopefully see from my fantastic ASCII graphics, if it weren't for the "unused" portions, gets/puts would get an unaligned frame. Do note also, however, that not all 12 of the extra bytes are unused; 4 of them are reserved for the parameter.
Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?
Certainly. The compiler is free to organize the stack however it feels like. In order to do buffer overflows predictably, you have to be looking at a specific compiled binary of a program.
As for what the purpose of EBP is (and thus to answer your questions 2, 3 and 5), please see any introductory text to how the call stack is organized, such as the Wikipedia article.
Upvotes: 6