Reputation: 15
I'm practicing reverse engineering C object files. Suppose I have an object file of the C program:
#include <stdio.h>
#include <string.h>

int main (int argc, char ** argv) {
    char * input = argv[1];
    int result = strcmp(input, "text_to_compare");
    if (result == 0) {
        printf("%s\n", "text matches");
    }
    else {
        printf("%s\n", "text doeesn't match");
    }
    return 0;
}
How would I go about finding "text_to_compare" in the object file, given that it was compiled with the -g flag for the x86-64 architecture?
Upvotes: 0
Views: 1058
Reputation: 311516
Running the strings command on a binary file will print all sequences of four or more printable characters in the file. For a simple file this might be sufficient, but for a larger file you can end up with a lot of false positives. For example, compiling your code with gcc and running strings on the resulting binary returns 295 results.
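To make it clearer why strings is so noisy, here is a minimal sketch of the idea behind it: scan a buffer and report every run of printable bytes that is at least a given length. The function name print_printable_runs is made up for this illustration, and the real tool has more options and slightly different rules for what counts as printable.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of binutils: print every run of at least
 * min_len printable bytes in buf and return how many runs were found.
 * A run ends at the first non-printable byte or at the end of the buffer. */
size_t print_printable_runs(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t count = 0;
    size_t start = 0;   /* index where the current run began */

    for (size_t i = 0; i <= len; i++) {
        /* Treat the end of the buffer as a terminator so a trailing run is flushed. */
        int printable = (i < len) && isprint(buf[i]);
        if (printable)
            continue;
        if (i - start >= min_len) {
            printf("%.*s\n", (int)(i - start), (const char *)(buf + start));
            count++;
        }
        start = i + 1;
    }
    return count;
}
```

Nothing in this scan knows about sections or symbols, so any bytes that happen to look like text are reported, which is where the false positives come from.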
We can start by using the objdump command to disassemble the code in your sample file:
$ objdump --disassemble=main a.out
a.out: file format elf64-x86-64
Disassembly of section .init:
Disassembly of section .plt:
Disassembly of section .text:
0000000000401136 <main>:
401136: 55 push %rbp
401137: 48 89 e5 mov %rsp,%rbp
40113a: 48 83 ec 20 sub $0x20,%rsp
40113e: 89 7d ec mov %edi,-0x14(%rbp)
401141: 48 89 75 e0 mov %rsi,-0x20(%rbp)
401145: 48 8b 45 e0 mov -0x20(%rbp),%rax
401149: 48 8b 40 08 mov 0x8(%rax),%rax
40114d: 48 89 45 f8 mov %rax,-0x8(%rbp)
401151: 48 8b 45 f8 mov -0x8(%rbp),%rax
401155: be 10 20 40 00 mov $0x402010,%esi
40115a: 48 89 c7 mov %rax,%rdi
40115d: e8 de fe ff ff call 401040 <strcmp@plt>
401162: 89 45 f4 mov %eax,-0xc(%rbp)
401165: 83 7d f4 00 cmpl $0x0,-0xc(%rbp)
401169: 75 0c jne 401177 <main+0x41>
40116b: bf 20 20 40 00 mov $0x402020,%edi
401170: e8 bb fe ff ff call 401030 <puts@plt>
401175: eb 0a jmp 401181 <main+0x4b>
401177: bf 2d 20 40 00 mov $0x40202d,%edi
40117c: e8 af fe ff ff call 401030 <puts@plt>
401181: b8 00 00 00 00 mov $0x0,%eax
401186: c9 leave
401187: c3 ret
Disassembly of section .fini:
Looking at the disassembly, we can see a call to strcmp at address 0x40115d:
40115d: e8 de fe ff ff call 401040 <strcmp@plt>
If we look a couple of lines before that, we can see an instruction that loads an address outside of this section (0x402010) into %esi, the register that carries the second argument to strcmp:
401155: be 10 20 40 00 mov $0x402010,%esi
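The raw bytes in the listing encode these addresses directly, and it can help to decode them by hand once. The sketch below assumes the two encodings seen in this disassembly: a one-byte opcode followed by a 32-bit little-endian immediate (be for mov into %esi), and e8 followed by a 32-bit signed displacement relative to the next instruction (call). The function names read_le32 and call_rel32_target are invented for the example.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from four bytes. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Target of a 5-byte "call rel32" (opcode e8) located at insn_addr.
 * The signed displacement is relative to the address of the next
 * instruction, which is insn_addr + 5. */
uint64_t call_rel32_target(uint64_t insn_addr, const unsigned char *insn)
{
    int32_t disp = (int32_t)read_le32(insn + 1);
    return insn_addr + 5 + (uint64_t)(int64_t)disp;
}
```

Feeding it the bytes be 10 20 40 00 gives the immediate 0x402010, and e8 de fe ff ff at 0x40115d resolves to 0x401040, which matches the strcmp@plt target that objdump prints.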
If we look at the output of objdump -h a.out, we see that this address falls in the .rodata section. We are looking for the section whose address range, which starts at the value in the VMA column and spans the number of bytes in the Size column, contains the address:
$ objdump -h a.out
Idx Name Size VMA LMA File off Algn
[...]
15 .rodata 00000041 0000000000402000 0000000000402000 00002000 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
[...]
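The range check described above can be written down as a small sketch. The struct and function names here are made up for the illustration; the fields simply mirror the Size, VMA and File off columns of the objdump -h output.

```c
#include <stdbool.h>
#include <stdint.h>

/* One row of "objdump -h" output, reduced to the fields used here.
 * This struct is hypothetical and only exists for this example. */
struct section {
    const char *name;
    uint64_t vma;       /* VMA column */
    uint64_t size;      /* Size column */
    uint64_t file_off;  /* File off column */
};

/* True if addr lies in the half-open range [vma, vma + size). */
bool section_contains(const struct section *s, uint64_t addr)
{
    return addr >= s->vma && addr - s->vma < s->size;
}

/* Translate a virtual address inside the section to an offset in the file. */
uint64_t addr_to_file_offset(const struct section *s, uint64_t addr)
{
    return s->file_off + (addr - s->vma);
}
```

With the .rodata row shown above (VMA 0x402000, size 0x41, file offset 0x2000), the address 0x402010 is inside the section and corresponds to file offset 0x2010.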
We can extract the data in that section using the objcopy command (the >(...) process substitution used here needs a shell that supports it, such as bash):
$ objcopy -j .rodata -O binary a.out >(xxd -o 0x402000)
00402000: 0100 0200 0000 0000 0000 0000 0000 0000 ................
00402010: 7465 7874 5f74 6f5f 636f 6d70 6172 6500 text_to_compare.
00402020: 7465 7874 206d 6174 6368 6573 0074 6578 text matches.tex
00402030: 7420 646f 6565 736e 2774 206d 6174 6368 t doeesn't match
00402040: 00 .
And we can see that the string at address 0x402010 is text_to_compare.
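If you want to do that last lookup programmatically, a rough sketch follows. It assumes you already have the raw section bytes in memory (for example, read from the objcopy output) and know the address the section is loaded at; string_at is a hypothetical helper name, not something from binutils.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the NUL-terminated string stored at virtual
 * address addr inside a section dump whose first byte lives at address base.
 * Returns NULL if addr is outside the dump or if no terminator is found
 * before the end of the dump. */
const char *string_at(const unsigned char *dump, size_t dump_len,
                      uint64_t base, uint64_t addr)
{
    if (addr < base || addr - base >= dump_len)
        return NULL;
    size_t off = (size_t)(addr - base);
    if (memchr(dump + off, '\0', dump_len - off) == NULL)
        return NULL;
    return (const char *)(dump + off);
}
```

Called with the 0x41 bytes from the dump above, a base of 0x402000 and an address of 0x402010, this returns text_to_compare; the addresses 0x402020 and 0x40202d from the two puts calls return the other two strings.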
Upvotes: 4