Said Hamed

Reputation: 15

Reverse engineering C object files

I'm practicing reverse engineering C object files. Suppose I have an object file compiled from this C program:

#include <stdio.h>
#include <string.h>

int main (int argc, char ** argv) {
  char * input = argv[1];
  int result = strcmp(input, "text_to_compare");
  
  if (result == 0) {
      printf("%s\n", "text matches");
  }
  else {
      printf("%s\n", "text doeesn't match");
  }
  
  return 0;
}

How would I go about finding "text_to_compare" in the object file, given that it was compiled with the -g flag for the x86-64 architecture?

Upvotes: 0

Views: 1058

Answers (1)

larsks

Reputation: 311516

Running strings on a binary file will print all sequences of four or more printable characters in the file. For a simple file this might be sufficient, but for a larger file you can end up with a lot of false positives. For example, compiling your code with gcc and running strings on the resulting binary returns 295 results.
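To make the false-positive problem concrete, here is a minimal sketch of the kind of scan strings performs. It is not the binutils implementation; the function name scan_strings and its parameters are made up for illustration, and it only treats bytes accepted by isprint as printable.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Print every run of at least min_len printable bytes in buf, one per line,
 * and return how many runs were found. Hypothetical helper: a sketch of the
 * idea behind strings, not its actual code. */
size_t scan_strings(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t count = 0;
    size_t start = 0;   /* index where the current printable run began */

    for (size_t i = 0; i <= len; i++) {
        /* Treat the end of the buffer like a non-printable byte so that a
         * run touching the end is still reported. */
        int printable = (i < len) && isprint(buf[i]);
        if (printable)
            continue;
        if (i - start >= min_len) {
            printf("%.*s\n", (int)(i - start), (const char *)(buf + start));
            count++;
        }
        start = i + 1;
    }
    return count;
}
```

Any byte sequence that happens to look like text is reported, whether it is a string literal, a symbol name, or debug information, which is why the output grows so quickly on a real binary.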

For a more targeted approach, we can start by using the objdump command to disassemble the main function in your sample file:

$ objdump --disassemble=main a.out

a.out:     file format elf64-x86-64


Disassembly of section .init:

Disassembly of section .plt:

Disassembly of section .text:

0000000000401136 <main>:
  401136:       55                      push   %rbp
  401137:       48 89 e5                mov    %rsp,%rbp
  40113a:       48 83 ec 20             sub    $0x20,%rsp
  40113e:       89 7d ec                mov    %edi,-0x14(%rbp)
  401141:       48 89 75 e0             mov    %rsi,-0x20(%rbp)
  401145:       48 8b 45 e0             mov    -0x20(%rbp),%rax
  401149:       48 8b 40 08             mov    0x8(%rax),%rax
  40114d:       48 89 45 f8             mov    %rax,-0x8(%rbp)
  401151:       48 8b 45 f8             mov    -0x8(%rbp),%rax
  401155:       be 10 20 40 00          mov    $0x402010,%esi
  40115a:       48 89 c7                mov    %rax,%rdi
  40115d:       e8 de fe ff ff          call   401040 <strcmp@plt>
  401162:       89 45 f4                mov    %eax,-0xc(%rbp)
  401165:       83 7d f4 00             cmpl   $0x0,-0xc(%rbp)
  401169:       75 0c                   jne    401177 <main+0x41>
  40116b:       bf 20 20 40 00          mov    $0x402020,%edi
  401170:       e8 bb fe ff ff          call   401030 <puts@plt>
  401175:       eb 0a                   jmp    401181 <main+0x4b>
  401177:       bf 2d 20 40 00          mov    $0x40202d,%edi
  40117c:       e8 af fe ff ff          call   401030 <puts@plt>
  401181:       b8 00 00 00 00          mov    $0x0,%eax
  401186:       c9                      leave
  401187:       c3                      ret

Disassembly of section .fini:

Looking at the disassembly, we can see a call to strcmp at address 0x40115d:

40115d:       e8 de fe ff ff          call   401040 <strcmp@plt>

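If we look a couple of lines before that, we can see an instruction that loads an address outside of the .text section (0x402010) into %esi. Under the x86-64 System V calling convention, %rsi carries the second argument of a function call, so this is the pointer passed as the second argument to strcmp: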

401155:       be 10 20 40 00          mov    $0x402010,%esi
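The raw bytes in the middle column of the two instructions quoted above can be decoded by hand. The following is a small sketch of that arithmetic; read_le32 and call_target are hypothetical helper names, and the code only handles the two encodings seen here (a one-byte opcode followed by a 32-bit little-endian value).

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value, as stored after the opcode byte. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Target of a 5-byte "call rel32" (opcode 0xe8) located at insn_addr.
 * The displacement is signed and relative to the address of the next
 * instruction, which is insn_addr + 5. */
uint64_t call_target(uint64_t insn_addr, const unsigned char *insn)
{
    int32_t rel = (int32_t)read_le32(insn + 1);
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}
```

For the bytes be 10 20 40 00, read_le32 on the four bytes after the opcode gives 0x402010, the address objdump prints. For e8 de fe ff ff at 0x40115d, call_target gives 0x401040, the strcmp@plt entry.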

If we look at the output of objdump -h a.out, we see that this address falls in the .rodata section. A section starts at the address in the VMA column and is as long as the value in the Size column, so .rodata covers 0x402000 through 0x402040:

$ objdump -h a.out
Idx Name          Size      VMA               LMA               File off  Algn
[...]
 15 .rodata       00000041  0000000000402000  0000000000402000  00002000  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
[...]
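The range check described above, and the matching translation from a virtual address to an offset in the file, can be sketched as follows. The struct and function names are hypothetical; the fields simply mirror the Size, VMA and File off columns of the objdump -h output.

```c
#include <stdbool.h>
#include <stdint.h>

/* One row of objdump -h output, reduced to the fields needed here. */
struct section {
    const char *name;
    uint64_t size;      /* Size column */
    uint64_t vma;       /* VMA column */
    uint64_t file_off;  /* File off column */
};

/* True if addr lies in the half-open range [vma, vma + size). */
bool section_contains(const struct section *s, uint64_t addr)
{
    return addr >= s->vma && addr - s->vma < s->size;
}

/* Translate a virtual address inside the section to an offset in the file.
 * Only meaningful when section_contains(s, addr) is true. */
uint64_t addr_to_file_offset(const struct section *s, uint64_t addr)
{
    return s->file_off + (addr - s->vma);
}
```

With the .rodata row shown above (size 0x41, VMA 0x402000, file offset 0x2000), the address 0x402010 is inside the section and corresponds to file offset 0x2010.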

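We can extract the data in that section using the objcopy command. The >(...) process substitution needs bash or a similar shell, and xxd -o 0x402000 adds the section's start address to the displayed offsets so they line up with the addresses in the disassembly: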

$ objcopy -j .rodata -O binary a.out >(xxd -o 0x402000)
00402000: 0100 0200 0000 0000 0000 0000 0000 0000  ................
00402010: 7465 7874 5f74 6f5f 636f 6d70 6172 6500  text_to_compare.
00402020: 7465 7874 206d 6174 6368 6573 0074 6578  text matches.tex
00402030: 7420 646f 6565 736e 2774 206d 6174 6368  t doeesn't match
00402040: 00                                       .

And we can see that the string at address 0x402010 is text_to_compare.
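Reading the string out of the dump by eye amounts to starting at the right offset and stopping at the first zero byte. Below is a minimal sketch of that lookup, assuming the section bytes have already been loaded into memory; string_at and its parameters are hypothetical names, not part of any tool used above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return a pointer to the NUL-terminated string stored at virtual address
 * addr inside a section dump whose first byte lives at address base.
 * Returns NULL if addr is outside the dump or if no terminating zero byte
 * is found before the dump ends. */
const char *string_at(const unsigned char *dump, size_t dump_len,
                      uint64_t base, uint64_t addr)
{
    if (addr < base || addr - base >= dump_len)
        return NULL;

    size_t off = (size_t)(addr - base);
    if (memchr(dump + off, '\0', dump_len - off) == NULL)
        return NULL;

    return (const char *)(dump + off);
}
```

Called with the 0x41 bytes of .rodata, base 0x402000 and address 0x402010, this yields text_to_compare. The addresses 0x402020 and 0x40202d loaded into %edi before the two puts calls can be resolved in the same way.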

Upvotes: 4
