How is main() called? Call to main() inside __libc_start_main()

Question

I am trying to understand the call to main() inside __libc_start_main(). I know one of the parameters of __libc_start_main() is the address of main(). But, I am not able to figure out how is main() being called inside __libc_start_main() as there is no Opcode CALL or JMP. I see the following disassembly right before execution jumps to main().

   0x7ffff7ded08b <__libc_start_main+203>:  lea    rax,[rsp+0x20]
   0x7ffff7ded090 <__libc_start_main+208>:  mov    QWORD PTR fs:0x300,rax
=> 0x7ffff7ded099 <__libc_start_main+217>:  mov    rax,QWORD PTR [rip+0x1c3e10]        # 0x7ffff7fb0eb0

I wrote a simple "Hello, World!!" in C. In the assembly above:

The execution jumps to main() right after instruction at address 0x7ffff7ded099.
Why is the MOV (to RAX) instruction causing a jump to main()?

Marco Bonelli · Accepted Answer

Well, of course those instructions are not the ones that cause the call to main. I am not sure how you are stepping through those instructions, but if you are using GDB, you should use stepi instead of nexti.

I don't know why this happens precisely (some strange GDB or x86 quirk?) so I only speak from personal experience, but when reverse-engineering ELF binaries, I occasionally find that the nexti command executes several instructions before breaking. In your case, it misses a few movs before the actual call rax to call main().

What you can do to remediate this is to either use stepi, or to dump more code and then explicitly tell GDB to set breakpoints:

(gdb) x/20i
    0x7ffff7ded08b <__libc_start_main+203>: lea    rax,[rsp+0x20]
    0x7ffff7ded090 <__libc_start_main+208>: mov    QWORD PTR fs:0x300,rax
 => 0x7ffff7ded099 <__libc_start_main+217>: mov    rax,QWORD PTR [rip+0x1c3e10]        # 0x7ffff7fb0eb0
    ... more lines ...
    ... find call rax ...
(gdb) b *0x7ffff7dedXXX <= replace this
(gdb) continue

Here's what __libc_start_main() on my system does to call main():

21b6f:  48 8d 44 24 20          lea    rax,[rsp+0x20]               ; start preparing args
21b74:  64 48 89 04 25 00 03    mov    QWORD PTR fs:0x300,rax
21b7b:  00 00
21b7d:  48 8b 05 24 93 3c 00    mov    rax,QWORD PTR [rip+0x3c9324] 
21b84:  48 8b 74 24 08          mov    rsi,QWORD PTR [rsp+0x8]
21b89:  8b 7c 24 14             mov    edi,DWORD PTR [rsp+0x14]
21b8d:  48 8b 10                mov    rdx,QWORD PTR [rax]
21b90:  48 8b 44 24 18          mov    rax,QWORD PTR [rsp+0x18]     ; get address of main
21b95:  ff d0                   call   rax                          ; actual call to main()
21b97:  89 c7                   mov    edi,eax
21b99:  e8 32 16 02 00          call   431d0     ; exit(result of main)

The first three instructions are the same that you show. At the moment of call rax, rax will contain the address of main. After calling main, the result is moved into edi (first argument) and exit(result) is called.

Looking at glibc's source code for __libc_start_main(), we can see that this is exactly what happens:

/* ... */

#ifdef HAVE_CLEANUP_JMP_BUF
  int not_first_call;
  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
  if (__glibc_likely (! not_first_call))
    {
      /* ... a bunch of stuff ... */
      /* Run the program.  */
      result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
    }
  else
    {
      /* ... a bunch of stuff ... */
    }
#else
  /* Nothing fancy, just call the function.  */
  result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
#endif
  exit (result);
}

In my case I can see from the disassembly that HAVE_CLEANUP_JMP_BUF was defined when my glibc was compiled, so the actual call to main() is the one inside the if. I also suspect this is the case for your glibc.

How is main() called? Call to main() inside __libc_start_main()

Answers (1)

Related Questions