Memory Space Layout / strange memory (stack) behaviour C/ASM?

While experimenting with memory to get a better understanding of the process memory layout and of what happens behind the scenes, I found that I could not fully make sense of it. Consider the following code:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char buf[32];
    strcpy(buf, argv[1]);  /* unchecked copy; this is the strcpy call seen in the dump */
    return 0;
}

Dump from IDA (dec not hex):


var_30= dword ptr -30h
var_2C= dword ptr -2Ch
var_20= dword ptr -20h
arg_4= dword ptr  0Ch


push    ebp
mov     ebp, esp
and     esp, 4294967280
sub     esp, 48
call    sub_401920
mov     eax, [ebp+12]
add     eax, 4
mov     eax, [eax]
mov     [esp+4], eax
lea     eax, [esp+16]
mov     [esp], eax
call    strcpy
mov     eax, 0

My interpretation:

Since EBP+0x0-0x3 holds the saved EBP and EBP+0x4-0x7 holds the return address, we can more or less see what is going on here.

My question, although an explanation of the ASM would be much appreciated, is not so much about the ASM but rather:

To my understanding the stack frame of this function should look like this:

[   ] < ESP+0x0-0x3 
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ]
[   ] < ESP+0x2C-0x2F
[EBP] < EBP+0x0-0x3
[RET] < EBP+0x4-0x7

Where ARG (EBP+0x8+) contains the function's argument(s).


Finally, is there any reason why EIP is no longer overwritten when I send a much larger buffer? Is that because the copy starts overwriting the previous stack frame, resulting in a much earlier crash?

Upvotes: 0

Answers (1)



Let's analyze that code

push    ebp
mov     ebp, esp

This is a standard prolog.

and     esp, 0ffffffff0h 

The low nibble (4 bits) of ESP is cleared. In the worst case this lowers the stack pointer by 15 bytes, and in the best case not at all.
The effect is to align the stack on a 16-byte boundary. This has become more common recently (when I started looking at disassembled binaries, no compiler aligned the stack), and it is due to the increasing use of SSE and AVX instructions.
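As a rough illustration of the masking step, here is a minimal C sketch of what the and instruction computes. The function names are made up for this example; only the mask itself comes from the disassembly.

```c
#include <stdint.h>

/* Mirrors `and esp, 0FFFFFFF0h`: clear the low 4 bits of a 32-bit
   stack pointer value. 0xFFFFFFF0 is 4294967280 in decimal, which is
   how IDA printed the mask in the question. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & 0xFFFFFFF0u;
}

/* Number of bytes by which the alignment lowers the stack pointer.
   Always in the range 0..15. */
uint32_t alignment_padding(uint32_t esp)
{
    return esp - align_down_16(esp);
}
```

For instance, an ESP value of 0x0028FF28 (chosen arbitrarily for illustration) would become 0x0028FF20, so the stack pointer is lowered by 8 bytes.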

sub     esp, 30h

Here the space for local variables is allocated. In theory you have 32 bytes of locals, so 20h bytes. But the compiler does something really clever: it notices that strcpy takes two 4-byte parameters, so instead of using two push instructions it allocates the space for those parameters directly here. To keep the stack aligned it needs to reach a multiple of 16, so it cannot simply reserve 28h bytes; it reserves 30h bytes instead. Wasting 8 bytes is not a great loss for the sake of an aligned stack pointer.
So the space allocated is

EBP         <-- Old Frame Pointer (Saved EBP)
EBP - 20h   <-- Start of 32 byte array (Up to EBP-01h included)
EBP - 24h   <-- Unused
EBP - 28h   <-- Unused
EBP - 2ch   <-- strcpy source ptr
EBP - 30h   <-- strcpy destination ptr

In this picture I intentionally left out the stack alignment operation at the beginning of the prolog in order to have definite offsets and for the sake of clarity.
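The 30h figure can be reproduced with a little arithmetic. The sketch below only models the reasoning given here (locals plus outgoing arguments, rounded up to 16); it is not meant to describe GCC's actual allocation algorithm, and the function names are invented for the example.

```c
#include <stddef.h>

/* Round n up to the next multiple of 16. */
size_t round_up_16(size_t n)
{
    return (n + 15u) & ~(size_t)15u;
}

/* Assumed model: the frame holds the locals plus the outgoing call
   arguments, padded so the total stays a multiple of 16. */
size_t frame_size(size_t locals, size_t outgoing_args)
{
    return round_up_16(locals + outgoing_args);
}
```

With 32 bytes of locals and 8 bytes of strcpy arguments, frame_size(32, 8) gives 48, which is the 30h seen in sub esp, 30h.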
Next instruction is

call    sub_401920

Hard to tell without symbols or the full disassembly, but this is likely the CRT initialization, which is called __main in GCC-generated code.

main takes two params: argc and argv. The memory layout above EBP is:

EBP + 0ch    <-- argv
EBP + 08h    <-- argc
EBP + 04h    <-- Return address
EBP          <-- Previous Frame Pointer (Saved EBP)
EBP - 04h    <-- Locals (Array)
EBP - 08h    <-- Locals (Array)

The next instructions just load argv[1]

mov     eax, [ebp+0ch]        ;<-- argv ptr
add     eax, 4                ;<-- &argv[1]
mov     eax, [eax]            ;<-- argv[1]
mov     [esp+4], eax          ;<-- Like a push

Remember that when the last instruction executes, the stack pointer is already below the local area:

EBP - 20h   <-- Start of 32 byte array (Up to EBP-01h included)
EBP - 24h   <-- Unused
EBP - 28h   <-- Unused
EBP - 2ch   <-- ESP+04h (strcpy source ptr)
EBP - 30h   <-- STACK POINTER (strcpy destination ptr)

The same thing for the destination

lea     eax, [esp+10h]    ;Pointer to ebp-20h (EAX = ebp-20h)
mov     [esp], eax        ;Like a push
call    strcpy
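Taken together, the instructions from the argv load through call strcpy correspond roughly to the C below. This is a sketch for illustration: copy_first_arg is a hypothetical helper, and the byte-wise pointer arithmetic is written out only to mirror add eax, 4 (the pointer size on the 32-bit target is 4; the code uses sizeof so that it also runs elsewhere).

```c
#include <string.h>

/* Hypothetical helper that mirrors the disassembled call sequence. */
char *copy_first_arg(char *dst, char **argv)
{
    /* mov eax, [ebp+0ch] / add eax, 4 / mov eax, [eax]:
       step one pointer past argv[0], then load what is stored there.
       This is just argv[1] spelled out as address arithmetic. */
    char *src = *(char **)((char *)argv + sizeof(char *));

    /* mov [esp+4], eax / mov [esp], eax / call strcpy:
       destination first, source second, and no bounds check at all. */
    return strcpy(dst, src);
}
```

As in the original program, nothing stops src from being longer than the destination buffer, which is what makes the overflow possible.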

And finally the return value is set (return 0). The actual epilog (typically leave followed by ret) is not part of your dump.

mov     eax, 0

Now it's time to get the full picture, including the stack alignment. Aligning the stack lowers the stack pointer after the EBP register has been saved. Local variables are mostly referenced through ESP.

      |  argv   |  EBP + 0ch
      |  argc   |  EBP + 08h
      | ret adr |  EBP + 04h
EBP ->| Old EBP |  EBP
      | Unused  |  EBP - 04h    \
          ...                    > Variable length (min: 0, max = 0fh)
      | Unused  |  ESP + 30h    /
      |  Array  |  
      |  Array  |  ESP + 10h 
      | Unused  |  ESP + 0ch
      | Unused  |  ESP + 08h
      | src ptr |  ESP + 04h
ESP ->| dst ptr |  ESP

Regarding your final questions, it is not possible to answer deterministically.
If the compiler were not aligning the stack the answers would be:

  1. When you input 44 bytes, you start writing at EBP-20h, so 12 bytes fall outside the array. You overwrite first the old frame pointer, then the return address, and then the argc value.

  2. EBP is 4 bytes because it is a 32-bit register. With 45 bytes you overwrite the old frame pointer (EBP) saved on the stack. See above.

  3. You start overwriting the return address with 37 bytes of data (40 bytes are required to overwrite it fully).

However, by aligning the stack pointer the compiler lowers ESP by an amount that is, in theory, variable, so a number between 0 and 15 must be added to the figures above. For example, it seems that in your case the alignment lowered the stack pointer by 7 bytes in the last question.
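The byte counts above can be checked with a small model of the frame. This is a sketch under the layout described in this answer (32-byte array, then 0 to 15 bytes of alignment padding, then saved EBP, then the return address); the names are made up for the example, and n is the total number of bytes written from the start of the array. Note that strcpy also writes the terminating NUL, so a string of length k writes k + 1 bytes.

```c
#include <stddef.h>

enum { BUF_SIZE = 32, SLOT_SIZE = 4 };

/* How many bytes of the region [start, start + len) are covered by a
   write of n bytes that begins at offset 0 (the start of the array). */
size_t covered(size_t n, size_t start, size_t len)
{
    if (n <= start)
        return 0;
    return (n - start < len) ? n - start : len;
}

/* Bytes of the saved EBP that get overwritten; pad is the alignment
   padding (0..15) sitting between the array and the saved EBP. */
size_t saved_ebp_bytes_hit(size_t n, size_t pad)
{
    return covered(n, BUF_SIZE + pad, SLOT_SIZE);
}

/* Bytes of the return address that get overwritten. */
size_t ret_addr_bytes_hit(size_t n, size_t pad)
{
    return covered(n, BUF_SIZE + pad + SLOT_SIZE, SLOT_SIZE);
}
```

Without padding, ret_addr_bytes_hit(37, 0) is 1 and ret_addr_bytes_hit(40, 0) is 4, matching point 3. With a hypothetical padding of 8, a 44-byte write reaches the saved EBP but leaves the return address untouched.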

Upvotes: 3
