M42

Reputation: 9

Reading x86 assembly code

I am working through a lab where I have to defuse a "bomb" by providing the correct input for each phase. I do not have access to the source code, so I have to step through the assembly code with GDB. Right now, I'm stuck on phase 2 and would really appreciate some help. Here is the x86 assembly code - I've added some comments that describe what I think is happening, but these could be horribly wrong because we only started learning assembly code a few days ago and I'm still quite shaky. As far as I can tell right now, this phase reads in 6 numbers from the user (that's what read_six_numbers does) and seems to go through some type of loop.

0000000000400f03 <phase_2>:
400f03: 41 55                   push   %r13                         // save values
400f05: 41 54                   push   %r12
400f07: 55                      push   %rbp
400f08: 53                      push   %rbx
400f09: 48 83 ec 28             sub    $0x28,%rsp                  // decrease stack pointer
400f0d: 48 89 e6                mov    %rsp,%rsi                   // move rsp to rsi
400f10: e8 5a 07 00 00          callq  40166f <read_six_numbers>   // read in six numbers from the user
400f15: 48 89 e3                mov    %rsp,%rbx                   // move rsp to rbx
400f18: 4c 8d 64 24 0c          lea    0xc(%rsp),%r12              // ?
400f1d: bd 00 00 00 00          mov    $0x0,%ebp                   // set ebp to 0?
400f22: 49 89 dd                mov    %rbx,%r13                   // move rbx to r13
400f25: 8b 43 0c                mov    0xc(%rbx),%eax              // ?
400f28: 39 03                   cmp    %eax,(%rbx)                 // compare eax and rbx
400f2a: 74 05                   je     400f31 <phase_2+0x2e>       // if equal, skip explode 
400f2c: e8 1c 07 00 00          callq  40164d <explode_bomb>       // bomb detonates (fail)
400f31: 41 03 6d 00             add    0x0(%r13),%ebp              // add r13 and ebp (?)
400f35: 48 83 c3 04             add    $0x4,%rbx                   // add 4 to rbx
400f39: 4c 39 e3                cmp    %r12,%rbx                   // compare r12 and rbx
400f3c: 75 e4                   jne    400f22 <phase_2+0x1f>       // loop? if not equal, jump to 400f22 
400f3e: 85 ed                   test   %ebp,%ebp                   // compare ebp with itself?
400f40: 75 05                   jne    400f47 <phase_2+0x44>       // skip explosion if not equal 
400f42: e8 06 07 00 00          callq  40164d <explode_bomb>       // bomb detonates (fail)
400f47: 48 83 c4 28             add    $0x28,%rsp
400f4b: 5b                      pop    %rbx
400f4c: 5d                      pop    %rbp
400f4d: 41 5c                   pop    %r12
400f4f: 41 5d                   pop    %r13
400f51: c3                      retq  

Any help is greatly appreciated - especially advice on how I would go about translating something like this into C code. Thanks in advance!

Upvotes: 0

Views: 2076

Answers (1)

Peter Cordes

Reputation: 365577

"especially advice on how I would go about translating something like this into C code"

Don't literally translate it into C.

Learn to think about how algorithms are implemented as changes to registers and memory. C and asm are just different ways of expressing what you actually want the machine to do.

Every instruction makes a well-defined modification to the architectural state of the machine, so just follow that chain of steps and see the result. Any good debugger (e.g. gdb with layout reg) can show you which register was modified as you single-step. The instruction-set reference manual (linked from the tag wiki) documents exactly what every instruction does.

If you're ever surprised by something, look it up. There are many SO questions from people who didn't do that, and then asked about surprising div results when they hadn't set rdx first.
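To make the div example concrete, here is a rough C model of the 32-bit form of the instruction, which divides the 64-bit value edx:eax by its operand (the 64-bit form does the same with rdx:rax). The function name model_div32 is made up for illustration; this is a sketch of the documented behaviour, not code from the bomb.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of 32-bit `div src`.
 * The dividend is the 64-bit value edx:eax, not eax alone.
 * Returns false where the real instruction would raise a divide error. */
static bool model_div32(uint32_t *eax, uint32_t *edx, uint32_t src)
{
    if (src == 0)
        return false;                       /* divide by zero faults */

    uint64_t dividend = ((uint64_t)*edx << 32) | *eax;
    uint64_t quotient = dividend / src;

    if (quotient > UINT32_MAX)
        return false;                       /* quotient does not fit in eax */

    *edx = (uint32_t)(dividend % src);      /* remainder goes to edx */
    *eax = (uint32_t)quotient;              /* quotient goes to eax */
    return true;
}
```

With eax = 10 and edx = 0, dividing by 3 gives the expected quotient 3 and remainder 1. Leave a stale nonzero value in edx and the same call produces a very different quotient, or faults outright.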


You need to find the connections between instructions that modify or overwrite a register or memory location and the later instructions that read from that same register or memory location.


You can often get clues from how a register is being used, e.g. add $0x4,%rbx is probably a pointer increment to an int *. It's rare to increment a 64-bit integer by 4 if it isn't a pointer, or involved in memory addressing somehow.

If you look at the surrounding code and find mov 0xc(%rbx),%eax (loading 4 bytes from an offset from %rbx), that confirms the theory that it's a pointer.
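As a small sketch of how those two instructions might map onto C, assuming int is 4 bytes (as it is on x86-64 Linux): a byte offset of 0xc is three ints past the pointer, and adding 4 to the register is one pointer increment. The helper names below are invented for illustration.

```c
#include <stddef.h>

/* mov 0xc(%rbx),%eax : load 4 bytes from byte offset 12.
 * With a 4-byte int (an assumption), that is element index 3. */
static int load_at_0xc(const int *rbx)
{
    return rbx[3];
}

/* add $0x4,%rbx : advance the pointer by 4 bytes,
 * i.e. by one int under the same assumption. */
static const int *advance_4_bytes(const int *rbx)
{
    return rbx + 1;
}
```

So if %rbx points at the first of the six numbers, mov 0xc(%rbx),%eax reads the fourth one.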

The cmp %r12,%rbx / jcc tells you that it's also part of the loop condition, and that %r12 is the end pointer. The branch is a jne, so the loop keeps running while the two pointers differ. You check it's just a simple do{}while(p != end) loop by verifying that %r12 isn't modified in the loop, and that it's initialized to something sensible before the loop.
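Putting those observations together, one possible reading of the loop in the question's listing looks like the C below. Treat it as a sketch under assumptions, not the lab's actual source: the name phase_2_sketch is invented, nums stands for the six ints that read_six_numbers stores at %rsp, and returning false stands in for a call to explode_bomb.

```c
#include <stdbool.h>

/* Hypothetical reconstruction of the phase_2 loop.
 * Returns true if the bomb would NOT explode for this input. */
static bool phase_2_sketch(const int nums[6])
{
    const int *p = nums;          /* mov %rsp,%rbx                    */
    const int *end = nums + 3;    /* lea 0xc(%rsp),%r12 (12 bytes)    */
    int sum = 0;                  /* mov $0x0,%ebp                    */

    do {
        if (p[0] != p[3])         /* mov 0xc(%rbx),%eax; cmp; je      */
            return false;         /* stands in for explode_bomb       */
        sum += *p;                /* add 0x0(%r13),%ebp (r13 == rbx)  */
        p++;                      /* add $0x4,%rbx                    */
    } while (p != end);           /* cmp %r12,%rbx; jne               */

    return sum != 0;              /* test %ebp,%ebp; jne skips explode */
}
```

Under this reading the end pointer is never modified inside the loop, which is exactly the check described above.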


mov $0x0,%ebp tells you that this is compiler output from -O0 or -O1, because every x86 compiler knows the "peephole" optimization that xor %ebp,%ebp is the best way to zero a register, and a fully optimized build would normally use it. Fortunately this looks like -O1 output rather than -O0, so it doesn't store everything to memory after every C statement and reload it afterwards. -O0 output is hard to follow, because a value doesn't stay live in the same register for long.


If you have any specific questions about that binary bomb code, you should ask them. I just answered the "how to read asm" part.

Upvotes: 2
