Reputation: 12817
80483ed: b8 00 00 00 00 mov $0x0,%eax
80483f2: 83 c0 0f add $0xf,%eax
80483f5: 83 c0 0f add $0xf,%eax
80483f8: c1 e8 04 shr $0x4,%eax
80483fb: c1 e0 04 shl $0x4,%eax
80483fe: 29 c4 sub %eax,%esp
This is an assembly snippet from the start of the main function of a crackme binary that I disassembled with objdump -d. The eax manipulation looks very odd to me:
1. eax = 0
2. eax += 0xf
3. eax += 0xf // eax = 0x1e (30 decimal, 11110 in binary)
4. eax >>= 4 // eax = 1
5. eax <<= 4 // eax = 16 (0x10)
Is this some kind of fast way of manipulating eax that is useful for some reason? Or is this just confusing C code that was compiled without optimization in order to throw off the person trying to reverse engineer it?
Upvotes: 0
Views: 160
Reputation: 44046
You just fell victim to a mild form of obfuscation, done specifically to slow down the reverse engineering of the program.
Take this code for example:
It's from a real-world example: a VB6¹ packer used to deliver malware (I don't remember which one; I think it was Gootkit).
In this specific screenshot, all the instructions are useless, but in the whole code you'll find a push <constant> and pop <reg> here and there - a silly way of doing mov <reg>, <constant>.
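To make the push/pop trick concrete, here is a minimal sketch in C that models it on a toy machine. Everything in it (struct toy_cpu, toy_push, toy_pop, load_obfuscated, load_plain) is a hypothetical name invented for illustration and is not taken from the packer. It only shows that the register and the stack pointer end up in the same state either way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: one register and a small downward-growing stack.
   These names are illustrative assumptions, not code from the packer. */
enum { STACK_WORDS = 16 };

struct toy_cpu {
    uint32_t eax;                 /* the register being loaded */
    uint32_t stack[STACK_WORDS];  /* stack memory */
    int sp;                       /* index of the current top of stack */
};

/* Return a CPU with an empty stack (sp is one past the last slot). */
struct toy_cpu toy_cpu_new(void)
{
    struct toy_cpu cpu = {0};
    cpu.sp = STACK_WORDS;
    return cpu;
}

void toy_push(struct toy_cpu *cpu, uint32_t value)
{
    assert(cpu->sp > 0);              /* no room left would be a stack overflow */
    cpu->stack[--cpu->sp] = value;
}

uint32_t toy_pop(struct toy_cpu *cpu)
{
    assert(cpu->sp < STACK_WORDS);    /* popping an empty stack is a bug */
    return cpu->stack[cpu->sp++];
}

/* Obfuscated form: push <constant> ; pop eax */
void load_obfuscated(struct toy_cpu *cpu, uint32_t constant)
{
    toy_push(cpu, constant);
    cpu->eax = toy_pop(cpu);
}

/* Plain form: mov eax, <constant> */
void load_plain(struct toy_cpu *cpu, uint32_t constant)
{
    cpu->eax = constant;
}
```

The only difference left behind in this model is the stale copy of the constant in the stack slot below sp, which nothing reads afterwards.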
That's just to slow down the analyst (and possibly throw off beginners).
As long as it's easy, you can translate the code in your mind, but you may want to consider more sophisticated tools (like IDA or radare2) that allow you to comment and manipulate the code.
As the crackme difficulty increases, you should expect more obfuscation and tricks.
¹ This kind of packer ends up calling native code generated outside the VB6 compiler.
Upvotes: 4
Reputation: 364163
A compiler would never emit this with optimization enabled. It's clearly inefficient and was written by hand as an exercise.
There's no plausible way a compiler made this asm even without optimization. Multiple additions within one expression would collapse to a single add at compile time. Across separate statements, it would store/reload to memory (except with register unsigned tmp; for GCC).
Subtracting the result from the stack pointer means the value would have to be the size of an alloca or a C99 VLA like char buf[tmp].
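For readers who haven't met VLAs, here is a minimal sketch of the kind of C that makes a compiler subtract a run-time value from the stack pointer. The function name vla_bytes is a hypothetical helper made up for this illustration, and it assumes a compiler with C99 VLA support (GCC and clang both have it).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; not from the crackme. */
size_t vla_bytes(size_t n)
{
    assert(n > 0);                   /* a VLA must have a size greater than zero */

    /* C99 variable-length array: the size is only known at run time, so the
       compiler has to compute an allocation size and move the stack pointer by it. */
    volatile unsigned char buf[n];

    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)i;   /* touch the storage so it is really used */

    return sizeof buf;               /* for a VLA, sizeof is evaluated at run time */
}
```

Calling vla_bytes(30) returns 30, but the amount actually subtracted from the stack pointer is typically larger, because the compiler rounds it up to keep the stack aligned.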
>>=4 / <<=4 is not how GCC or clang make sure the alloca size is a multiple of 16. With optimization disabled, GCC uses an insane div and imul even though the divisor is a power of 2, while clang uses a normal (a + 15) & -16.
The 2nd add $0xf, %eax combined with the 2 shifts that knock off the low 4 bits does actually implement that (size+15) & -16 calculation, rounding the allocation size up to the next multiple of 16 (keeping the stack aligned, and thus also the allocation itself).
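The equivalence is easy to check in C. The sketch below writes the round-up three ways: with the two shifts from the question, with the mask clang uses, and with the divide/multiply GCC emits at -O0. The function names are hypothetical, chosen only for this illustration. It also replays the question's sequence, treating the first add as producing size = 15.

```c
#include <assert.h>
#include <stdint.h>

/* Shift form, as in the disassembly: add $0xf ; shr $4 ; shl $4 */
uint32_t round_up16_shifts(uint32_t size)
{
    assert(size <= UINT32_MAX - 15u);  /* avoid wrap-around in size + 15 */
    uint32_t eax = size;
    eax += 0xf;
    eax >>= 4;                         /* drop the low 4 bits ... */
    eax <<= 4;                         /* ... and shift zeros back in */
    return eax;
}

/* Mask form: (size + 15) & -16, i.e. and $0xfffffff0 */
uint32_t round_up16_mask(uint32_t size)
{
    assert(size <= UINT32_MAX - 15u);
    return (size + 15u) & ~(uint32_t)15u;
}

/* Divide/multiply form, the shape of GCC's -O0 output (div, then imul) */
uint32_t round_up16_divmul(uint32_t size)
{
    assert(size <= UINT32_MAX - 15u);
    return (size + 15u) / 16u * 16u;
}

/* The question's sequence: eax = 0, first add gives size = 15,
   then the second add and the two shifts round it up. */
uint32_t question_sequence(void)
{
    uint32_t eax = 0;
    eax += 0xf;                        /* size = 15 */
    return round_up16_shifts(eax);     /* (15 + 15) >> 4 << 4 = 16 */
}
```

All three forms agree for unsigned values; the mask is simply the cheapest, which is why a compiler would pick it over the shift pair.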
So it could be a correct implementation of the following source (with optimization disabled), but it's implausible because any sane compiler would know to use and $0xfffffff0, %eax to clear the low bits instead of 2 shifts.
int foo(void) {
    register unsigned a asm("eax") = 0;  // otherwise GCC picks a call-preserved reg, EBX
    //register unsigned a = 0;
    a += 0xf;
    a += 0xf;
    a >>= 4;  // include this manually instead of as part of alloca / VLA size calc
    a <<= 4;
    volatile char buf[a];
    buf[0] = 0;
    return buf[0];
}
This does get us two back-to-back add instructions, because GCC still compiles every statement to a separate block of asm (for consistent debugging, even if you use jump in GDB to jump between source lines). See Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?
https://godbolt.org/z/x8W6d9 - GCC10.2 -O0 -m32 -Wall output contains some of your sequence, but not the sub %eax, %esp right after the shift:
foo:
        push ebp
        mov ebp, esp
        push ebx
        sub esp, 20
        mov eax, esp
        mov ecx, eax
        mov eax, 0 # sequence starts here
        add eax, 15
        add eax, 15
        shr eax, 4
        sal eax, 4 # sal is a synonym for the same opcode as shl. Disassembly would normally show shl
        # but that's as far as we can get
        # VLA size calculation to align the VLA by 16, and the stack, not just sub from ESP.
        mov edx, eax
        sub edx, 1
        mov DWORD PTR [ebp-12], edx
        mov edx, 16
        sub edx, 1
        add eax, edx
        mov ebx, 16
        mov edx, 0
        div ebx # yes really, GCC -O0 emits a div for a constant 16
        imul eax, eax, 16
        sub esp, eax
        mov eax, esp
        add eax, 0
        mov DWORD PTR [ebp-16], eax
        mov eax, DWORD PTR [ebp-16]
        mov BYTE PTR [eax], 0
        mov eax, DWORD PTR [ebp-16]
        movzx eax, BYTE PTR [eax]
        movsx eax, al # should have done a movsx load in the first place
        mov esp, ecx # pointless; saved EBX addressed relative to EBP
        mov ebx, DWORD PTR [ebp-4]
        leave
        ret
clang chooses to just ignore the register keyword, keeping a in memory between statements.
Upvotes: 3