midi

Reputation: 472

Difference between position dependent and position independent code?

I understand that current gcc compilers generate position-independent code by default. However, to get an understanding of what position-dependent code looks like, I compiled this

int Add(int x, int y) {
    return x+y;
}

int Subtract(int x, int y) {
    return x-y;
}

int main() {
    bool flag = false;
    int x=10,y=5,z;

    if (flag) {
        z = Add(x,y);
    }
    else {
        z = Subtract(x,y);
    }
}

with g++ -c check.cpp -no-pie. However, the generated code is identical with or without the -no-pie flag. <main+0x34> looks like a relative offset.

  26:   55                      push   %rbp
  27:   48 89 e5                mov    %rsp,%rbp
  2a:   48 83 ec 10             sub    $0x10,%rsp
  2e:   c6 45 f3 00             movb   $0x0,-0xd(%rbp)
  32:   c7 45 f4 0a 00 00 00    movl   $0xa,-0xc(%rbp)
  39:   c7 45 f8 05 00 00 00    movl   $0x5,-0x8(%rbp)
  40:   80 7d f3 00             cmpb   $0x0,-0xd(%rbp)
  44:   74 14                   je     5a <main+0x34>
  46:   8b 55 f8                mov    -0x8(%rbp),%edx
  49:   8b 45 f4                mov    -0xc(%rbp),%eax
  4c:   89 d6                   mov    %edx,%esi
  4e:   89 c7                   mov    %eax,%edi
  50:   e8 00 00 00 00          callq  55 <main+0x2f>
  55:   89 45 fc                mov    %eax,-0x4(%rbp)
  58:   eb 12                   jmp    6c <main+0x46>
  5a:   8b 55 f8                mov    -0x8(%rbp),%edx
  5d:   8b 45 f4                mov    -0xc(%rbp),%eax
  60:   89 d6                   mov    %edx,%esi
  62:   89 c7                   mov    %eax,%edi
  64:   e8 00 00 00 00          callq  69 <main+0x43>
  69:   89 45 fc                mov    %eax,-0x4(%rbp)
  6c:   b8 00 00 00 00          mov    $0x0,%eax
  71:   c9                      leaveq 
  72:   c3                      retq

This is the objdump output for just main in both cases. Am I using the wrong flag, or is the assembly supposed to be the same for PIC and non-PIC for this code? If it is supposed to be the same, could you please provide a snippet for which it isn't?

Upvotes: 3

Views: 695

Answers (1)

old_timer

Reputation: 71566

You have to access items that are outside the module or section to see a difference.

unsigned int x;
void fun ( void )
{
    x = 5;
}

This access crosses from .text into a data section (.bss here, since x is uninitialized).

Position dependent:

00000000 <fun>:
   0:   e3a02005    mov r2, #5
   4:   e59f3004    ldr r3, [pc, #4]    ; 10 <fun+0x10>
   8:   e5832000    str r2, [r3]
   c:   e12fff1e    bx  lr
  10:   00000000

Position independent:

00000000 <fun>:
   0:   e3a02005    mov r2, #5
   4:   e59f3010    ldr r3, [pc, #16]   ; 1c <fun+0x1c>
   8:   e59f1010    ldr r1, [pc, #16]   ; 20 <fun+0x20>
   c:   e08f3003    add r3, pc, r3
  10:   e7933001    ldr r3, [r3, r1]
  14:   e5832000    str r2, [r3]
  18:   e12fff1e    bx  lr
  1c:   00000008
  20:   00000000

In the first (position-dependent) case the linker will fill in the address of x in this word:

   8:   e5832000    str r2, [r3]
   c:   e12fff1e    bx  lr
  10:   00000000  <--- here

The pc-relative load from 4: to 10: stays within the .text section, so it works whether the code is position dependent or independent.

   4:   e59f3004    ldr r3, [pc, #4]    ; 10 <fun+0x10>
   8:   e5832000    str r2, [r3]
   c:   e12fff1e    bx  lr
  10:   00000000

It gets the address of the external item, filled in by the linker, and then accesses the item directly at that address.

   4:   e59f3010    ldr r3, [pc, #16]   ; 1c <fun+0x1c>
   8:   e59f1010    ldr r1, [pc, #16]   ; 20 <fun+0x20>
   c:   e08f3003    add r3, pc, r3
  10:   e7933001    ldr r3, [r3, r1]
  14:   e5832000    str r2, [r3]
  18:   e12fff1e    bx  lr
  1c:   00000008
  20:   00000000

The position-independent version above is easier to follow once linked (-Ttext=0x1000 -Tdata=0x2000):

00001000 <fun>:
    1000:   e3a02005    mov r2, #5
    1004:   e59f3010    ldr r3, [pc, #16]   ; 101c <fun+0x1c>
    1008:   e59f1010    ldr r1, [pc, #16]   ; 1020 <fun+0x20>
    100c:   e08f3003    add r3, pc, r3
    1010:   e7933001    ldr r3, [r3, r1]
    1014:   e5832000    str r2, [r3]
    1018:   e12fff1e    bx  lr
    101c:   00010010
    1020:   0000000c

Disassembly of section .got:

00011024 <_GLOBAL_OFFSET_TABLE_>:
    ...
   11030:   00002000

Disassembly of section .bss:

00002000 <x>:
    2000:   00000000

(Clearly I should have also specified where the GOT goes.)

Although the global offset table and .bss are different sections, once linked they are fixed relative to each other. What position independence gives you is the ability to move .bss (or .data, etc.) relative to .text. Think about the position-dependent solution: if .bss were to move and you had, say, 1000 references to it sprinkled all through the binary, you would have to patch every one of them.

Instead, the global offset table here provides a single location where the address of the variable x lives, and every access to x essentially uses double indirection. It may not be obvious, but a position-dependent way to get at a table like this would be for the linker to fill in its absolute address. That would not be position independent, and this was compiled to be independent, so pc-relative math has to be done to find the global offset table. For this instruction set, when the instruction at 0x100c executes, the program counter reads as 0x100c+8.

    100c:   e08f3003    add r3, pc, r3

So we add 0x100c+8+0x00010010 = 0x11024, then add 0x0000000c to that, giving 0x11030. That is, compute the address of the GOT, then add the offset within it, and the word stored THERE is the address of the item: 0x2000. The second indirection then goes through that address to get at the item.
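The arithmetic above can be double-checked with a small C sketch. This is only a model of what the two instructions compute, not anything the toolchain emits; the helper name got_slot_address and the constant ARM_PC_READ_OFFSET are made up for illustration.

```c
#include <stdint.h>

/* On classic 32-bit ARM, reading pc yields the instruction's own address + 8. */
#define ARM_PC_READ_OFFSET 8u

/* Hypothetical helper modelling
 *     add r3, pc, r3        ; r3 = pc + word stored at 101c
 *     ldr r3, [r3, r1]      ; r1 = word stored at 1020
 * It returns the address of the GOT slot that the ldr reads from. */
uint32_t got_slot_address(uint32_t add_insn_addr, uint32_t got_rel, uint32_t slot_off)
{
    uint32_t got_base = add_insn_addr + ARM_PC_READ_OFFSET + got_rel;
    return got_base + slot_off;
}
```

With the values from the linked listing, got_slot_address(0x100c, 0x00010010, 0x0000000c) comes out as 0x11030, the GOT entry holding 0x2000. If .text and the GOT both moved up by 0x4000, the same two constants would still land on the right slot, because only the address of the add instruction changes.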

If you were to place .text at an address other than 0x1000 but not move .bss, that is fine; this will all work so long as the GOT moves to the same relative offset from .text. If you were to leave .text alone but move .bss, then you have to update the GOT. Moving .bss from 0x2000 to 0x3000 is a difference of +0x1000, so you go through the GOT and add 0x1000 to each entry to cover that difference.
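A rough sketch of that GOT update in C, assuming every entry passed in points into the section that moved. The function name got_apply_delta is hypothetical, and a real loader would be driven by relocation records rather than blindly walking the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fix-up: the data section moved by delta bytes, so every
 * GOT entry that points into it has to move by the same amount.
 * Assumption: all count entries point into the moved section. */
void got_apply_delta(uint32_t *got, size_t count, uint32_t delta)
{
    for (size_t i = 0; i < count; i++) {
        got[i] += delta;
    }
}
```

For the example above, an entry holding 0x2000 becomes 0x3000 after got_apply_delta(got, count, 0x1000). The code in .text is not touched at all, which is the point of having the table.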

Position independence essentially has to do double indirection instead of single indirection (or one more level than position-dependent code would have needed) in order to access distant items, or items whose position is not fixed relative to .text. That means more code and more memory accesses, so it is bigger and slower.
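The extra level can be modelled in plain C. This is just an analogy for the two code sequences, with hypothetical function names, where an array of pointers stands in for the GOT.

```c
#include <stddef.h>

/* Single indirection: the code already holds the address of x
 * (the word the linker filled in for the position-dependent build). */
void store_direct(unsigned int *x_addr, unsigned int value)
{
    *x_addr = value;
}

/* Double indirection: the code only knows where the table is and which
 * slot to use; the address of x has to be loaded from the table first. */
void store_via_got(unsigned int **got, size_t slot, unsigned int value)
{
    unsigned int *x_addr = got[slot]; /* extra load: fetch the address of x */
    *x_addr = value;                  /* then the actual store */
}
```

In the linked listing the slot offset is 0xc bytes, which with 4-byte entries corresponds to slot 3 in this model.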

For it to work, .text reaching out to other .text items can't use fixed addresses; it has to use indirect or computed addresses. Likewise the GOT as used here (by GNU) has to be at a fixed position relative to .text. From there you can move data relative to code and still access it. So you have to have some rules. .text, being code and assumed read-only, can't hold this offset table, which needs to be in RAM, so the table can't simply be built into the .text section.

Upvotes: 2
