Reputation: 329
I am having trouble understanding what the assembly code below does as I convert it to C. I know it is a loop, but I don't know where to start in converting it.
I kinda understand that the input has to be 6 numbers, and that inside the loop it will add 5 and compare.
I'm mostly stuck on how we know the starting value?
0x0000000000400f15 <+9>: callq 0x4016e5 <read_six_numbers>
0x0000000000400f1a <+14>: lea 0x4(%rsp),%rbx
0x0000000000400f1f <+19>: lea 0x18(%rsp),%rbp
0x0000000000400f24 <+24>: mov -0x4(%rbx),%eax
0x0000000000400f27 <+27>: add $0x5,%eax
0x0000000000400f2a <+30>: cmp %eax,(%rbx)
0x0000000000400f2c <+32>: je 0x400f33 <phase_2+39>
0x0000000000400f2e <+34>: callq 0x4016c3 <explode_bomb>
0x0000000000400f33 <+39>: add $0x4,%rbx
0x0000000000400f37 <+43>: cmp %rbp,%rbx
0x0000000000400f3a <+46>: jne 0x400f24 <phase_2+24>
Upvotes: 3
Views: 1200
Reputation: 2370
There are a few things to consider here that aren't specified in your question (ABI, processor architecture, executable file format, etc). Not all of these are necessary to answer your question, but it's likely that understanding this will improve your general understanding of how functions, methods, or procedures are called in a wide range of executable contexts.
In particular, different CPU architectures, operating systems, and even executable binary formats may have different conventions for passing program inputs. Because it is apparent that you are using an AMD64-architecture CPU, you may find this Wikipedia page useful. In particular, it seems you are using the "System V x86-64 ABI" based on some context in your snippet. (We'll do a full analysis of your snippet later.)
The C programming language does not have any concept of a stack, so although the stack is relevant to your snippet, it is not a requirement for C programs, and a portable version of your program may not use a stack at all. Indeed, although introductory compiler courses still tend to use a stack to pass state between call frames, the stack is not generally used to pass the first few arguments in the SysV ABI on AMD64.
(It was much more common to do this on x86 as the 32-bit architecture is register-constrained. The overhead of using registers to pass state on such an architecture is likely to be higher as it's likely the registers will need to be copied to the stack so that they can be reused and because it's likely additional function calls will need them preserved.)
The SysV ABI in particular passes integer and pointer arguments in %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order, and floating-point arguments in %xmm0-7.
0x0000000000400f0c <+0>: push %rbp
0x0000000000400f0d <+1>: push %rbx
This saves the caller's values of %rbp and %rbx by pushing them onto the stack. %rbp and %rbx are "callee-save" registers, which means that the called function must preserve their values, because the caller relies on them to keep its own state.
0x0000000000400f0e <+2>: sub $0x28,%rsp
This allocates 40 bytes of space on the stack, in addition to the 16 bytes already pushed to preserve %rbp and %rbx. Why 40 bytes? We need 24 bytes of scratch space for read_six_numbers (six 4-byte integers). The remaining 16 bytes are not used anywhere in this snippet and are probably padding chosen by the compiler.
0x0000000000400f12 <+6>: mov %rsp,%rsi
This copies the stack pointer, which now points at the lowest address of the space just allocated, into %rsi. Now, because I'm assuming SysV ABI, this means that the address is actually the second argument to the function we're about to call. At this point the contents of this space are uninitialized and are likely to be random values. This is scratch space used by read_six_numbers.
0x0000000000400f15 <+9>: callq 0x4016e5 <read_six_numbers>
This calls the function read_six_numbers. Since our scratch space is the second argument (as per SysV ABI), this means our calling function has a value in %rdi that is passed in to read_six_numbers without modification. If I had to guess, I would say that this value answers your question, so we'd need to see the caller of this phase_2 function to gain any further insight.
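To make the register roles concrete, here is a rough C sketch of what a callee with this shape could look like. The signature, the sscanf-based body, and the name read_six_numbers_sketch are all assumptions on my part, since the real read_six_numbers is not shown in the question. Under the SysV ABI the first parameter would arrive in %rdi and the second in %rsi.

```c
#include <stdio.h>

/* Hypothetical stand-in for read_six_numbers; the real body is not shown
 * in the question. Under the SysV AMD64 ABI, `input` would arrive in %rdi
 * and `numbers` (the caller's stack scratch space) in %rsi. */
int read_six_numbers_sketch(const char *input, int *numbers)
{
    /* Assumption: the input is six whitespace-separated decimal integers. */
    int count = sscanf(input, "%d %d %d %d %d %d",
                       &numbers[0], &numbers[1], &numbers[2],
                       &numbers[3], &numbers[4], &numbers[5]);
    return count == 6; /* 1 on success, 0 if fewer than six were parsed */
}
```

If the real function works anything like this, the value passed through in %rdi would be the text you typed, and the six integers would land in the 24 bytes at the stack pointer.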
0x0000000000400f1a <+14>: lea 0x4(%rsp),%rbx
read_six_numbers reads six 32-bit numbers, for a total of 24 bytes. The first number is at 0x0(%rsp), and lea gives us an address rather than the value stored there. This therefore puts a pointer to the second value in the array into %rbx.
0x0000000000400f1f <+19>: lea 0x18(%rsp),%rbp
The first value of the array is at 0x0(%rsp) and the 6th is at 0x14(%rsp); 0x18(%rsp) is the address one past the end of our array, which the loop uses as its end marker in %rbp.
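The two lea instructions amount to ordinary pointer arithmetic on an int array. A small sketch that computes the same byte offsets (byte_offset is a hypothetical helper for illustration, and it assumes a 4-byte int, as on x86-64 Linux):

```c
#include <stddef.h>

/* Byte distance from the start of an int array to element `index`.
 * Assumes sizeof(int) == 4, as on x86-64 Linux. Forming a pointer one
 * past the last element is allowed in C as long as it is not dereferenced. */
size_t byte_offset(const int *base, size_t index)
{
    return (size_t)((const char *)(base + index) - (const char *)base);
}
```

For a six-element array, index 1 gives 4 (the 0x4 loaded into %rbx), index 5 gives 0x14, and index 6 gives 0x18 (the end marker loaded into %rbp).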
0x0000000000400f24 <+24>: mov -0x4(%rbx),%eax
0x0000000000400f27 <+27>: add $0x5,%eax
0x0000000000400f2a <+30>: cmp %eax,(%rbx)
0x0000000000400f2c <+32>: je 0x400f33 <phase_2+39>
0x0000000000400f2e <+34>: callq 0x4016c3 <explode_bomb>
0x0000000000400f33 <+39>: add $0x4,%rbx
0x0000000000400f37 <+43>: cmp %rbp,%rbx
0x0000000000400f3a <+46>: jne 0x400f24 <phase_2+24>
User chqrlie explained this loop sufficiently well. If the previous value (-0x4(%rbx)) plus 5 equals the current value ((%rbx)), we continue the loop. Otherwise we call explode_bomb. I'd add that although chqrlie says it takes no arguments, there is no guarantee that it doesn't. phase_2 does not set up %rdi or %rsi just before the call, but whatever those registers hold at that point is still available for it to use. To assert that explode_bomb takes no arguments, we'd have to see its disassembly; this context does not prove it takes no arguments.
However, the actual values being compared cannot be determined from this snippet alone; they are whatever read_six_numbers stored in the array. We're just looping over memory here.
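Put together, the loop could be written in C roughly as follows. This is a sketch, not the original source: sequence_ok is a hypothetical name, and it returns 0 at the point where the assembly calls explode_bomb (which presumably never returns) instead of calling anything.

```c
/* Returns 1 if every element equals the previous one plus 5,
 * 0 where the assembly would call explode_bomb. */
int sequence_ok(const int *numbers)
{
    const int *cur = numbers + 1;   /* lea 0x4(%rsp),%rbx  */
    const int *end = numbers + 6;   /* lea 0x18(%rsp),%rbp */

    do {
        /* mov -0x4(%rbx),%eax; add $0x5,%eax; cmp %eax,(%rbx); je.
         * Unsigned arithmetic mirrors the wrap-around of the 32-bit add. */
        if ((unsigned)*cur != (unsigned)cur[-1] + 5u)
            return 0;               /* explode_bomb path */
        cur++;                      /* add $0x4,%rbx */
    } while (cur != end);           /* cmp %rbp,%rbx; jne */
    return 1;
}
```

The do/while form matches the assembly, which tests the exit condition at the bottom of the loop and therefore always runs the body at least once.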
0x0000000000400f3c <+48>: add $0x28,%rsp
0x0000000000400f40 <+52>: pop %rbx
0x0000000000400f41 <+53>: pop %rbp
This restores the caller context (remember we saved the caller's %rbp and %rbx at the beginning) and...
0x0000000000400f42 <+54>: retq
returns to the next IP of the caller.
Maybe there's something here you didn't already know. Otherwise, this is just a long-winded explanation of what chqrlie already told you: the loop starts 4 bytes past the base of the array filled in by read_six_numbers, that is, at its second element.
Upvotes: 3
Reputation: 145297
Function read_six_numbers receives the address of the array in which to store the numbers in register %rsi. %rsi is set to point at the current top of the stack (%rsp), where some space was allocated with sub $0x28,%rsp. The loop at 0x400f24 uses register %rbx as a pointer that points into the array, starting one past the beginning. It checks if the previous value + 5 is equal to the current. If not, it calls explode_bomb() with no arguments. The loop iterates 5 times, until the pointer points to the end of the array.
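As for the starting value: the loop only compares each element with its predecessor, so in the code shown the first number is never checked against anything. Any first value appears to work as long as each later number is 5 larger. A small sketch that builds such a sequence (make_sequence is a hypothetical helper, not something in the binary, and whether read_six_numbers imposes extra checks of its own cannot be told from this snippet):

```c
/* Fills out[0..5] with first, first+5, ..., first+25.
 * Assumes first + 25 does not overflow int. */
void make_sequence(int first, int out[6])
{
    out[0] = first;                 /* unconstrained by the loop shown */
    for (int i = 1; i < 6; i++)
        out[i] = out[i - 1] + 5;    /* the relation the loop enforces */
}
```

For example, a first value of 1 gives 1 6 11 16 21 26.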
Upvotes: 5