narayanpatra

Reputation: 5691

Can anybody explain some simple assembly code?

I have just started to learn assembly. This is the dump from gdb for a simple program which prints hello ranjit.

Dump of assembler code for function main:
   0x080483b4 <+0>: push   %ebp
   0x080483b5 <+1>: mov    %esp,%ebp
   0x080483b7 <+3>: sub    $0x4,%esp
=> 0x080483ba <+6>: movl   $0x8048490,(%esp)
   0x080483c1 <+13>:    call   0x80482f0 <puts@plt>
   0x080483c6 <+18>:    leave  
   0x080483c7 <+19>:    ret    

My questions are:

  1. Why is ebp pushed onto the stack at the start of the program every time? What is in ebp that is necessary to run this program?
  2. In the second line, why is ebp copied to esp?
  3. I can't understand the third line at all. What I know about the SUB syntax is "sub dest,source", but how can esp be subtracted from 4 and the result stored in 4?
  4. What is this value "$0x8048490"? Why is it moved to esp, and why is esp enclosed in brackets this time? Does that denote something different from esp without brackets?
  5. The next line is the call to a function, but what is this "0x80482f0"?
  6. What are leave and ret (maybe ret means returning to libc)?

Operating system: Ubuntu 10, compiler: gcc

Upvotes: 1

Views: 1436

Answers (2)

Emile

Reputation: 271

Over my career I have learned several assembly languages. You didn't mention which one this is, but it appears to be Intel x86 (written in AT&T notation, as paxdiablo pointed out). However, I have not used assembly since last century (lucky me!). Here are some answers to your questions:

  1. The EBP register is pushed onto the stack at the beginning because the routine is about to overwrite it for its own use, while the caller still depends on its original value. Simply discarding that value would corrupt the state of the rest of the application.
  2. It is the other way around: %esp is moved INTO %ebp. Remember that we saved the old %ebp on the previous line, so we can now store a new value in it without destroying the original one.
  3. The instruction SUBtracts the value four (4) FROM the contents of the %esp register. The result is not stored in "four" but in %esp. If %esp held 0xFFF8 before the SUB, it contains 0xFFF4 afterwards. The $0x4 is what is called an "immediate" operand, a constant encoded in the instruction itself. Because the stack grows toward lower addresses, lowering %esp by 4 reserves 4 bytes on the stack.
  4. I can't tell from the dump alone what the value $0x8048490 is. However, it is NOT being moved INTO %esp but rather INTO THE MEMORY AT THE ADDRESS HELD IN %esp. That is why the notation is (%esp) rather than %esp. This kind of notation is common in the assembly languages I have come across. If, on the other hand, the right operand were simply %esp, the value would be moved INTO the %esp register itself. In short, the contents of %esp are being used as an address.
  5. It is a fixed value, and the string to its right makes me think it is the address of the puts() (Put String) library routine.
  6. "leave" is an instruction equivalent to "mov %ebp,%esp" followed by "pop %ebp". It first resets the stack pointer to the frame pointer, discarding the space reserved for locals, and then restores the %ebp value we saved at the beginning, so that the caller gets its context back. The "ret" instruction is the final instruction of the routine; it "returns" to the caller.
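Points 3 and 4 map closely onto pointer operations in C. The following is only a minimal sketch under stated assumptions: the stack is modeled as an array of 32-bit slots, and the names `stack_slots` and `reserve_and_store` are hypothetical, made up for illustration. It shows the difference between changing the register itself (`%esp`) and writing to the memory it points at (`(%esp)`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the stack is an array of 32-bit slots and esp is a
   pointer into it. A real stack lives in process memory, not in an array. */
static uint32_t stack_slots[8];

/* Mimics:  sub $0x4,%esp  followed by  movl $value,(%esp)
   Returns the new value of esp. */
static uint32_t *reserve_and_store(uint32_t *esp, uint32_t value)
{
    assert(esp > stack_slots);  /* there must be room left below esp */
    esp -= 1;      /* sub $0x4,%esp: one uint32_t is 4 bytes, so esp drops by 4 */
    *esp = value;  /* movl $value,(%esp): write to the memory esp points at */
    return esp;
}
```

Starting with `esp` one past the end of the array (an "empty" stack), calling `reserve_and_store(stack_slots + 8, 0x8048490)` leaves the returned pointer 4 bytes lower and the value in the last slot. The pointer changed in the first step; only memory changed in the second.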

Upvotes: 0

paxdiablo

Reputation: 882566

ebp is used as a frame pointer in Intel processors (assuming you're using a calling convention that uses frames).

It provides a known point of reference for locating passed-in parameters (on one side) and local variables (on the other) no matter what you do with the stack pointer while your function is active.

The sequence:

push   %ebp       # save caller's frame pointer
mov    %esp,%ebp  # create a new frame pointer
sub    $N,%esp    # make space for locals

saves the frame pointer for the previous stack frame (the caller), loads up a new frame pointer, then adjusts the stack to hold things for the current "stack level".

Since parameters would have been pushed before setting up the frame, they can be accessed with [ebp+N] (written N(%ebp) in AT&T notation) where N is a suitable offset.

Similarly, because locals are created "under" the frame pointer, they can be accessed with [ebp-N] (written -N(%ebp) in AT&T notation).
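To make the offsets concrete, here is a rough C model of that layout. It is a sketch, not how a real CPU is programmed: memory is a small array with one slot per 4-byte word (so index `ebp + 2` stands for byte offset 8 from ebp), the return address is a fake constant, and the function name `read_param_via_frame` is hypothetical. It assumes the common 32-bit convention in which the caller pushes the argument and `call` pushes the return address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit stack: one array slot per 4-byte word,
   with esp and ebp held as slot indices instead of byte addresses. */
static uint32_t read_param_via_frame(uint32_t param, uint32_t local)
{
    uint32_t mem[16] = {0};
    uint32_t esp = 16, ebp = 0;

    mem[--esp] = param;        /* caller pushes the argument             */
    mem[--esp] = 0xdeadbeefu;  /* call pushes a (fake) return address    */
    mem[--esp] = ebp;          /* push %ebp                              */
    ebp = esp;                 /* mov %esp,%ebp                          */
    esp -= 1;                  /* sub $4,%esp: room for one local        */
    mem[ebp - 1] = local;      /* the local lives one word below ebp     */

    assert(mem[ebp + 1] == 0xdeadbeefu);  /* return address: one word above ebp */
    assert(mem[ebp - 1] == local);
    return mem[ebp + 2];       /* the argument: two words above ebp      */
}
```

Whatever happens to `esp` after the frame is set up, the argument stays two words above `ebp` and the local one word below it, which is the point of having a frame pointer.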

The leave instruction is a single instruction that undoes that stack frame. You used to have to do it manually, but Intel introduced a shorter way of getting it done. It's functionally equivalent to:

mov  %ebp, %esp   # restore the old stack pointer
pop  %ebp         # and frame pointer

(the old, manual way).
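The fact that this sequence exactly reverses the frame setup can be checked with a toy model in C. This is only a sketch under stated assumptions: the `Machine` struct, its 16-word memory, and all function names are hypothetical, and `esp`/`ebp` are word indices rather than real byte addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: memory is an array of 32-bit words, and
   esp/ebp hold word indices rather than byte addresses. */
typedef struct {
    uint32_t mem[16];
    uint32_t esp;
    uint32_t ebp;
} Machine;

static void push(Machine *m, uint32_t v)
{
    assert(m->esp > 0);
    m->esp -= 1;
    m->mem[m->esp] = v;
}

static uint32_t pop(Machine *m)
{
    assert(m->esp < 16);
    uint32_t v = m->mem[m->esp];
    m->esp += 1;
    return v;
}

/* push %ebp ; mov %esp,%ebp ; sub $N,%esp   (N counted in words here) */
static void prologue(Machine *m, uint32_t local_words)
{
    push(m, m->ebp);
    m->ebp = m->esp;
    assert(m->esp >= local_words);
    m->esp -= local_words;
}

/* leave, modeled as: mov %ebp,%esp ; pop %ebp */
static void leave(Machine *m)
{
    m->esp = m->ebp;
    m->ebp = pop(m);
}

/* Runs a prologue followed by leave and reports whether the caller's
   esp and ebp came back unchanged. */
static int frame_round_trip_ok(uint32_t local_words)
{
    Machine m = { .esp = 12, .ebp = 14 };  /* arbitrary caller state */
    uint32_t old_esp = m.esp, old_ebp = m.ebp;
    prologue(&m, local_words);
    leave(&m);
    return m.esp == old_esp && m.ebp == old_ebp;
}
```

In this model `frame_round_trip_ok` returns 1 however many words of locals were reserved, because `leave` resets `esp` from `ebp` instead of adding the size back.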

Answering the questions one by one in case I've missed something:

  1. To start a new frame. See above.

  2. It isn't. esp is copied to ebp. This is AT&T notation (the %reg is a dead giveaway) where (among other things) source and destination operands are swapped relative to Intel notation.

  3. See answer to (2) above. You're subtracting 4 from esp, not the other way around.

  4. It's a parameter being passed to the function at 0x80482f0. It's not being loaded into esp but into the memory pointed at by esp. In other words, it's being placed on the stack, in the slot the preceding sub reserved. Since the function being called is puts (see (5) below), it will be the address of the string you want putsed.

  5. It's the address of the function being called. The function name is in the <> after the address: it's calling the puts function (probably the one in the standard library though that's not guaranteed). For a description of what the PLT is, see here.

  6. I've already explained leave above as unwinding the current stack frame before exiting. The ret simply returns from the current function. If the current function is main, it's going back to the C startup code.

Upvotes: 5
