Michael

Reputation: 69

Confused about the implementation of shared libraries in Linux

I'm experimenting with shared libraries on Linux. After reading several papers, I think I understand what happens when a shared library function is called.
But when I try to trace memory to reach the binary code of a shared library function, I find something strange. As I understand it, after a shared library function has been called, its slot in .got.plt should hold the function's actual address. My experiment shows that the slot is unchanged, i.e. it still holds the address of the second instruction of the func@plt entry. I'm rather confused by this; could anyone help?
Here is my code and output:

#include <stdio.h>
#include <string.h>

typedef unsigned long u_l;

int main()
{    
    char *p_ch = strstr("abc", "b");
    printf("result = %s\n", p_ch);

    long long *p = (long long *) &strstr;

    printf("data = %llx\n", *(p));

    long long k = *p >> 16; 
    u_l *entry_addr = (u_l *)(k & 0x00000000ffffffff);

    printf("entry_addr = %lx\n", entry_addr);

    u_l *func_addr = (u_l *)*entry_addr;
    printf("func_addr = %lx\n", func_addr);
    printf("code = %llx\n", *func_addr);
    return 0;
}

output:

result = bc  
data = 680804a00c25ff  
entry_addr = 804a00c  
func_addr = 8048326  
code = 68080400000068  

Thanks in advance!
PS: Please don't ask why I need the code of a shared library function. I know that the source code and the binary are easy to obtain; this is just an experiment.
My GCC version is 4.7.3 and my kernel version is 3.8.0-35.

Upvotes: 3

Views: 286

Answers (2)

Armali

Reputation: 19375

One thing else, I couldn't find printf in my libc.so, …

This program prints the address and the containing library (using the glibc extension dladdr) for each function name given as an argument:

/* cc -ldl */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(int argc, char *argv[])
{
  while (*++argv)
  {
    void *handle = dlopen(NULL, RTLD_NOW);
    if (!handle) puts(dlerror()), exit(1);
    void *p = dlsym(handle, *argv);
    char *s = dlerror();
    if (s) puts(s), exit(1);
    printf("%s = %p\n", *argv, p);
    Dl_info info;
    if (dladdr(p, &info))
        printf("%s contains %s\n", info.dli_fname, info.dli_sname);
  }
}

Upvotes: 0

kestasx

Reputation: 1091

I'm not sure what the logic of your program is, but I'll try to show where the address changes.

$ gcc -Wall -g test.c
$ gdb a.out
(gdb) break main
Breakpoint 1 at 0x40054c: file test.c, line 8.
(gdb) run
(gdb) disassemble
Dump of assembler code for function main:
   0x0000000000400544 <+0>: push   %rbp
   0x0000000000400545 <+1>: mov    %rsp,%rbp
   0x0000000000400548 <+4>: sub    $0x30,%rsp
=> 0x000000000040054c <+8>: movq   $0x4006fd,-0x28(%rbp)
   0x0000000000400554 <+16>:    mov    $0x400700,%eax
   0x0000000000400559 <+21>:    mov    -0x28(%rbp),%rdx
   0x000000000040055d <+25>:    mov    %rdx,%rsi
   0x0000000000400560 <+28>:    mov    %rax,%rdi
   0x0000000000400563 <+31>:    mov    $0x0,%eax
   0x0000000000400568 <+36>:    callq  0x400430 <printf@plt>
   0x000000000040056d <+41>:    movq   $0x400450,-0x20(%rbp)
   0x0000000000400575 <+49>:    mov    -0x20(%rbp),%rax
   0x0000000000400579 <+53>:    mov    (%rax),%rdx
   0x000000000040057c <+56>:    mov    $0x40070d,%eax
   0x0000000000400581 <+61>:    mov    %rdx,%rsi
   0x0000000000400584 <+64>:    mov    %rax,%rdi
   0x0000000000400587 <+67>:    mov    $0x0,%eax
   0x000000000040058c <+72>:    callq  0x400430 <printf@plt>
   0x0000000000400591 <+77>:    mov    -0x20(%rbp),%rax
   0x0000000000400595 <+81>:    mov    (%rax),%rax
   0x0000000000400598 <+84>:    sar    $0x10,%rax
   0x000000000040059c <+88>:    mov    %rax,-0x18(%rbp)

Let's set a breakpoint at the printf entry in the PLT (0x400430) and continue:

(gdb) break *0x400430
Breakpoint 2 at 0x400430
(gdb) continue 
Continuing.

Breakpoint 2, 0x0000000000400430 in printf@plt ()
(gdb) disassemble 
Dump of assembler code for function printf@plt:
=> 0x0000000000400430 <+0>: jmpq   *0x200bca(%rip)        # 0x601000 <printf@got.plt>
   0x0000000000400436 <+6>: pushq  $0x0
   0x000000000040043b <+11>:    jmpq   0x400420
End of assembler dump.
(gdb) x/x 0x601000
0x601000 <printf@got.plt>:  0x00400436

In the PLT you can see an indirect jump through the address stored in the GOT at 0x601000 (0x200bca + 0x400430 + 6). On the first invocation of the function, that slot holds the address of the next instruction in the PLT entry (0x00400436: a pushq followed by a jump to the dynamic linker). The dynamic linker finds the real printf, updates its GOT entry, and jumps to it.

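The next time you call printf (and hit the breakpoint again), its GOT entry at 0x601000 has already been updated to 0xf7a6d840, so the jump goes directly to printf rather than to the dynamic linker. (x/x prints a 4-byte word here, so this is only the low half of the 8-byte slot; x/gx shows the full value.)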

(gdb) c
Continuing.
result = bc

Breakpoint 2, 0x0000000000400430 in printf@plt ()
(gdb) disassemble 
Dump of assembler code for function printf@plt:
=> 0x0000000000400430 <+0>: jmpq   *0x200bca(%rip)        # 0x601000 <printf@got.plt>
   0x0000000000400436 <+6>: pushq  $0x0
   0x000000000040043b <+11>:    jmpq   0x400420
End of assembler dump.
(gdb) x/x 0x601000
0x601000 <printf@got.plt>:  0xf7a6d840

This example is from 64-bit Linux. On other Unix-like systems the assembly and similar details may vary, but the idea remains the same.

Upvotes: 2
