Big_t boy

Reputation: 329

How to find the starting value in a for loop in assembly?

I am having trouble understanding what the assembly code below does as I convert it to C. I know it is a loop, but I don't know where to start in converting it.

I kinda understand that the input has to be 6 numbers, and that inside the loop it will add 5 and compare.

I'm mostly stuck on how we know the starting value?

   0x0000000000400f15 <+9>:     callq  0x4016e5 <read_six_numbers>
   0x0000000000400f1a <+14>:    lea    0x4(%rsp),%rbx
   0x0000000000400f1f <+19>:    lea    0x18(%rsp),%rbp
   0x0000000000400f24 <+24>:    mov    -0x4(%rbx),%eax
   0x0000000000400f27 <+27>:    add    $0x5,%eax
   0x0000000000400f2a <+30>:    cmp    %eax,(%rbx)
   0x0000000000400f2c <+32>:    je     0x400f33 <phase_2+39>
   0x0000000000400f2e <+34>:    callq  0x4016c3 <explode_bomb>
   0x0000000000400f33 <+39>:    add    $0x4,%rbx
   0x0000000000400f37 <+43>:    cmp    %rbp,%rbx
   0x0000000000400f3a <+46>:    jne    0x400f24 <phase_2+24>

Upvotes: 3

Views: 1200

Answers (2)

dho

Reputation: 2370

There are a few things to consider here that aren't specified in your question (ABI, processor architecture, executable file format, etc). Not all of these are necessary to answer your question, but it's likely that understanding this will improve your general understanding of how functions, methods, or procedures are called in a wide range of executable contexts.

ABI

Different CPU architectures, operating systems, and even executable binary formats may use different conventions for passing arguments to functions. Because it is apparent that you are using an AMD64-architecture CPU, you may find this Wikipedia page useful. In particular, it seems you are using the "System V x86-64 ABI" based on some context in your snippet. (We'll do a full analysis of your snippet later.)

Stacks

The C programming language does not have any concept of a stack, so although that is relevant in terms of your snippet, it is not a requirement for C programs and it is likely that a portable version of your program may not use the stack at all. Indeed, although introductory compiler courses still seem to tend to use a stack to pass state between call frames, the stack is not generally used to pass the first few arguments in the SysV ABI on AMD64; registers are used instead.

(It was much more common to do this on x86 as the 32-bit architecture is register-constrained. The overhead of using registers to pass state on such an architecture is likely to be higher as it's likely the registers will need to be copied to the stack so that they can be reused and because it's likely additional function calls will need them preserved.)

Your Snippet

The SysV ABI in particular passes the first six integer or pointer arguments in %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order, and floating-point arguments in %xmm0-7.

0x0000000000400f0c <+0>:     push   %rbp
0x0000000000400f0d <+1>:     push   %rbx

This saves the caller's %rbp and %rbx by pushing them onto the stack. Both are "callee-save" registers, which means that the called function must preserve their values, because the caller relies on them to keep its own state. This function is about to use both of them as scratch pointers, so it saves them first.

0x0000000000400f0e <+2>:     sub    $0x28,%rsp

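This allocates 40 bytes of space on the stack, in addition to the 16 bytes already pushed to preserve %rbp and %rbx. Why 40 bytes? The array filled in by read_six_numbers needs only 24 bytes (six 4-byte integers). The remaining 16 bytes are not used by any instruction shown here and are most likely compiler padding. With the 8-byte return address and the two 8-byte pushes, 8 + 16 + 40 == 64, so %rsp is 16-byte aligned at the upcoming call, as the SysV ABI requires.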

0x0000000000400f12 <+6>:     mov    %rsp,%rsi

This copies the stack pointer, i.e. the address of the space we just allocated, into %rsi. Now, because I'm assuming SysV ABI, this means that the address is actually the second argument to the function we're about to call. At this point the contents of this space are undefined and are likely to be random values. This is the buffer that read_six_numbers fills in.

0x0000000000400f15 <+9>:     callq  0x4016e5 <read_six_numbers>

This calls the function read_six_numbers. Since our scratch space is the second argument (as per SysV ABI), this means our calling function has a value in %rdi that is passed in to read_six_numbers without modification. If I had to guess, I would say that this value answers your question, so we'd need to see the caller of this phase_2 function to gain any further insight.
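In C terms, the call probably looks something like the sketch below. The function names, parameter names, and the `sscanf`-based body are all assumptions for illustration, since the disassembly of read_six_numbers isn't shown in the question. What the snippet does show is that the second argument (%rsi) is the address of a stack buffer, and that whatever arrived in %rdi is forwarded unchanged as the first argument.

```c
#include <stdio.h>

/* Hypothetical stand-in for read_six_numbers; the real body is not shown.
 * Assumes argument 1 (%rdi) is an input string and argument 2 (%rsi) is
 * where the six ints are stored. Returns how many ints were parsed. */
int read_six_numbers_sketch(const char *input, int *nums)
{
    return sscanf(input, "%d %d %d %d %d %d",
                  &nums[0], &nums[1], &nums[2],
                  &nums[3], &nums[4], &nums[5]);
}

/* Mirrors the start of phase_2: a local array on the stack is passed as
 * the second argument, and `input` is forwarded untouched.
 * Returns 1 and stores the first number in *first if six were read,
 * otherwise returns 0. */
int first_number_sketch(const char *input, int *first)
{
    int nums[6];                      /* lives in the space reserved by sub $0x28,%rsp */

    if (read_six_numbers_sketch(input, nums) != 6)   /* mov %rsp,%rsi ; callq */
        return 0;

    *first = nums[0];                 /* 0x0(%rsp): first element of the array */
    return 1;
}
```

If that assumption holds, the first number is simply whatever the user typed first; the code in phase_2 never assigns it.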

0x0000000000400f1a <+14>:    lea    0x4(%rsp),%rbx

read_six_numbers reads six 32-bit numbers for a total of 24 bytes. The first number is at 0x0(%rsp), and lea computes an address rather than loading a value. This therefore gives us a pointer to the second value in the array and puts it into %rbx.

0x0000000000400f1f <+19>:    lea    0x18(%rsp),%rbp

The first value of the array is at 0x0(%rsp) and the 6th is at 0x14(%rsp); 0x18(%rsp) is the address one past the end of our array. It is stored in %rbp and serves as the loop's end marker.
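The two lea displacements map directly onto C pointer arithmetic on an int array. A small sketch, assuming a 4-byte int (which is what the SysV AMD64 ABI uses); the names `byte_offset` and `nums` are made up for illustration:

```c
#include <stddef.h>

/* Byte distance from the start of a six-int array to element `index`.
 * This is the displacement you would see in `lea disp(%rsp),reg`.
 * `index` may be 0..6; index 6 is the one-past-the-end address, which
 * C allows you to compute (but not dereference). */
size_t byte_offset(size_t index)
{
    int nums[6];   /* only addresses are taken; contents are never read */

    return (size_t)((char *)(nums + index) - (char *)nums);
}
```

With a 4-byte int, `byte_offset(1)` is 4 (the 0x4 loaded into %rbx) and `byte_offset(6)` is 24 (the 0x18 loaded into %rbp).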

0x0000000000400f24 <+24>:    mov    -0x4(%rbx),%eax
0x0000000000400f27 <+27>:    add    $0x5,%eax
0x0000000000400f2a <+30>:    cmp    %eax,(%rbx)
0x0000000000400f2c <+32>:    je     0x400f33 <phase_2+39>
0x0000000000400f2e <+34>:    callq  0x4016c3 <explode_bomb>
0x0000000000400f33 <+39>:    add    $0x4,%rbx
0x0000000000400f37 <+43>:    cmp    %rbp,%rbx
0x0000000000400f3a <+46>:    jne    0x400f24 <phase_2+24>

User chqrlie explained this loop sufficiently well. If the previous value (-0x4(%rbx)) plus 5 equals the current value ((%rbx)), we continue the loop. Otherwise we call explode_bomb. I'd add that although chqrlie says it takes no arguments, there is no guarantee that it doesn't. phase_2 itself never writes %rdi or %rsi after the call to read_six_numbers, so whatever that call left in them is still there for explode_bomb to use. To assert that explode_bomb takes no arguments, we'd have to see its disassembly; this context does not prove it takes no arguments.

However, the actual values being compared are not known from this snippet alone: they are whatever read_six_numbers stored in the array. We're just looping over memory here.
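For the conversion to C, the loop can be written almost instruction for instruction with two pointers. This is a sketch rather than the original source: the function name is made up, and it returns 0 where the original calls explode_bomb (the original would fall through and keep looping if explode_bomb ever returned).

```c
/* Returns 1 if every number is exactly 5 more than the one before it,
 * 0 at the first mismatch. `nums` must point to six ints. */
int check_six_numbers(const int *nums)
{
    const int *p   = nums + 1;   /* lea 0x4(%rsp),%rbx  : second element   */
    const int *end = nums + 6;   /* lea 0x18(%rsp),%rbp : one past the end */

    do {
        int expected = p[-1] + 5;   /* mov -0x4(%rbx),%eax ; add $0x5,%eax */
        if (*p != expected)         /* cmp %eax,(%rbx) ; je skips the call */
            return 0;               /* stand-in for explode_bomb()         */
        p++;                        /* add $0x4,%rbx                       */
    } while (p != end);             /* cmp %rbp,%rbx ; jne back to <+24>   */

    return 1;
}
```

The do/while shape matches the assembly, where the test sits at the bottom of the loop. An equivalent for loop would be `for (i = 1; i < 6; i++) if (nums[i] != nums[i - 1] + 5) ...`.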

0x0000000000400f3c <+48>:    add    $0x28,%rsp
0x0000000000400f40 <+52>:    pop    %rbx
0x0000000000400f41 <+53>:    pop    %rbp

This restores the caller context (remember we callee-saved the state of the caller's stack at the beginning) and...

0x0000000000400f42 <+54>:    retq

returns to the next IP of the caller.

Maybe there's something here you didn't already know. Otherwise, just a long-winded explanation to tell you what chqrlie already did: the loop's pointer starts 4 bytes past the base of the array filled in by read_six_numbers, i.e. at the second number, and each number is compared against the one before it.

Upvotes: 3

chqrlie

Reputation: 145297

Function read_six_numbers receives the address of the array where to store the numbers in register %rsi. %rsi is set to point at the current top of the stack (%rsp), where some space was allocated with sub $0x28,%rsp. The loop at 0x400f24 uses register %rbx as a pointer that points into the array, starting at the second element. It checks if the previous value + 5 is equal to the current value. If not, it calls explode_bomb() with no arguments. The loop iterates 5 times, until the pointer points to the end of the array.
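One consequence of the loop starting at the second element is that nothing in phase_2 itself constrains the first number. As a sketch (the function name is hypothetical, and this assumes read_six_numbers imposes no extra checks of its own), any starting value gives a sequence that satisfies the comparison, as long as each later number is 5 more than the one before:

```c
/* Fills out[0..5] so that out[i] == out[i - 1] + 5 for i = 1..5.
 * `first` is arbitrary: the loop never compares out[0] against
 * anything that comes before it. */
void make_passing_sequence(int first, int out[6])
{
    out[0] = first;
    for (int i = 1; i < 6; i++)      /* five iterations, one per comparison */
        out[i] = out[i - 1] + 5;
}
```

For example, a first value of 1 gives 1 6 11 16 21 26, and a first value of 0 gives 0 5 10 15 20 25.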

Upvotes: 5
