Memory Space Layout / strange memory (stack) behaviour C/ASM?

Question

When playing around with memory to get a better understanding of the process memory layout and the behind the scenes in general I failed to comprehend it entirely. Imagine the following code:

#include <stdio.h>
#include <string.h>

int main(int argc,char **argv) {
    char buf[32];
    strcpy(buf,argv[1]);
    return 0;
}

Dump from IDA (dec not hex):

Added

var_30= dword ptr -30h
var_2C= dword ptr -2Ch
var_20= dword ptr -20h
arg_4= dword ptr  0Ch

End

push    ebp
mov     ebp, esp
and     esp, 4294967280
sub     esp, 48
call    sub_401920
mov     eax, [ebp+12]
add     eax, 4
mov     eax, [eax]
mov     [esp+4], eax
lea     eax, [esp+16]
mov     [esp], eax
call    strcpy
mov     eax, 0
leave
retn

My interpretation:

1) Pushes EBP onto the stack
2) Aligns ESP with EBP
3) and esp, 4294967280 compiler pattern can probably be ignored (?)
4) Subtract 48 bytes from ESP, allocating 48 bytes on the stack
N) The compiler I used inefficiently allocates memory by blocks of 16 bytes, i.e. if you have a single integer it will allocate 16 bytes, if you go over 16 it will use 32, 48, 64 and so on
5) Call to function related to the compiler pattern in (3) can probably be ignored (?)

Since EBP+0x0-0x3 stores the EBP pointer and EBP+0x4-0x7 the return address we can sort of see what's going on here.

1) Move the pointer argv into EAX
2) Add 4 bytes to EAX (now points to EBP+12+4)
3) Move the pointer EBP+12+4 into EAX
N) EBP+12+4 would be equal to argv[1]
4) Moves the pointer of argv[1] onto the stack ESP+4
5) ???
6) Stores content of buf[32] on ESP+0 (?)

The question (although an answer to the above would be much appreciated) is not so much about the ASM but rather:

To my understanding the stack frame of this function should look like this:

[   ] < ESP+0x0-0x3 
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ] < ESP+0x2C-0x2F
[EBP] < EBP+0x0-0x3
[RET] < EBP+0x4-0x7
[ARG]

Where ARG (EBP+0x8+) contains the function's argument(s).

Confusion

When I passed 44 bytes of A's as user input it caused a stack overflow, although there are only 4 other bytes on the stack for the argv[1] pointer. So where does the single byte come from?
EBP is 4 bytes as it is a pointer; however, when I used 45 bytes of data the entire EBP was already overwritten with A's.
EIP (afaik) is controlled by overwriting EBP+0x4-0x7 (RET) and is also 4 bytes in size. However, 46 bytes of data resulted in EIP being halfway overwritten by A's, 47 bytes overwrote 3/4 of it, and 48 bytes fully overwrote EIP with A's.

Finally, is there any reason why EIP is no longer overwritten when sending a much too big buffer? Is that because it starts overwriting the previous stack frame, resulting in a much earlier crash?

user781847 · Accepted Answer

Let's analyze that code

push    ebp
mov     ebp, esp

This is a standard prolog.

and     esp, 0ffffffff0h

The low nibble (4 bits) of ESP is cleared. In the worst case this lowers the stack pointer by 15 bytes, and in the best case by nothing.
However, this operation aligns the stack on a 16-byte boundary. This behavior has become more common recently (when I started looking at disassembled binaries, no compiler aligned the stack), and it is due to the increasing use of SSE and AVX instructions.

sub     esp, 30h

Here the space for local variables is allocated. In theory you have 32 bytes of locals, so 20h bytes. Here the compiler does something really clever: it notices that strcpy takes two 4-byte parameters, so instead of using two push instructions it allocates space for those parameters directly here. To keep the stack aligned it needs to reach a multiple of 16, so it cannot simply reserve 28h bytes; it reserves 30h bytes instead. Wasting 8 bytes is not a great loss for the sake of an aligned stack pointer.
So the space allocated is

EBP         <-- Old Frame Pointer (Saved EBP)
...
EBP - 20h   <-- Start of 32 byte array (Up to EBP-01h included)
EBP - 24h   <-- Unused
EBP - 28h   <-- Unused
EBP - 2ch   <-- strcpy source ptr
EBP - 30h   <-- strcpy destination ptr

In this picture I intentionally left out the stack alignment operation at the beginning of the prolog in order to have definite offsets and for the sake of clarity.
Next instruction is

call    sub_401920

It is hard to tell without symbols or the full disassembly, but this is likely the CRT initialization, which is called __main in GCC-compiled sources.

main takes two params: argc and argv. The memory layout above EBP is:

...
EBP + 0ch    <-- argv
EBP + 08h    <-- argc
EBP + 04h    <-- Return address
EBP          <-- Previous Frame Pointer (Saved EBP)
EBP - 04h    <-- Locals (Array)
EBP - 08h    <-- Locals (Array)
...

The next instructions just load argv[1]

mov     eax, [ebp+0ch]        ;<-- argv ptr
add     eax, 4                ;<-- &argv[1]
mov     eax, [eax]            ;<-- argv[1]
mov     [esp+4], eax          ;<-- Like a push

Remember that when the last instruction is executed the stack pointer is just below the local area

...
EBP - 20h   <-- Start of 32 byte array (Up to EBP-01h included)
EBP - 24h   <-- Unused
EBP - 28h   <-- Unused
EBP - 2ch   <-- ESP+04h (strcpy source ptr)
EBP - 30h   <-- STACK POINTER (strcpy destination ptr)

The same thing for the destination

lea     eax, [esp+10h]    ;Pointer to ebp-20h (EAX = ebp-20h)
mov     [esp], eax        ;Like a push
call    strcpy

And finally the standard epilog

mov     eax, 0
leave
retn

Now it's time to get the full picture with the stack alignment. Aligning the stack lowers the stack pointer after the EBP register has been saved. Local variables are referenced mostly through ESP.

      +---------+
      |  argv   |  EBP + 0ch
      +---------+
      |  argc   |  EBP + 08h
      +---------+
      | ret adr |  EBP + 04h
      +---------+
EBP ->| Old EBP |  EBP
      +---------+
      | Unused  |  EBP - 04h    \
          ...                    > Variable length (min: 0, max = 0fh)
      | Unused  |  ESP + 30h    /
      +---------+
      |  Array  |  
          ...
      |  Array  |  ESP + 10h 
      +---------+
      | Unused  |  ESP + 0ch
      +---------+
      | Unused  |  ESP + 08h
      +---------+
      | src ptr |  ESP + 04h
      +---------+
ESP ->| dst ptr |  ESP
      +---------+

Regarding your final questions, it is not possible to answer deterministically.
If the compiler were not aligning the stack the answers would be:

When you input 44 bytes, you start writing at EBP-20h, so there are 12 excess bytes. You first overwrite the old frame pointer, then the return address, and then the argc value.
EBP is 4 bytes because it is a 32-bit register. With 45 bytes you overwrite the old frame pointer (EBP) saved on the stack. See above.
You start overwriting the return address with 37 bytes of data (40 bytes are required to fully overwrite it).

However, by aligning the stack pointer you are effectively lowering ESP by a variable (in theory) amount, so a number between 0 and 15 must be added to the figures above. For example, it seems that the alignment in your case lowered the stack pointer by 8 bytes: you needed 48 bytes rather than 40 to fully overwrite the return address.
