Araeos

Reputation: 171

Unrelocated address when linking with link.exe

Problem

When I assemble my code with as (binutils) and link it with link.exe (Visual Studio 2015), the program crashes because of an unrelocated address.

When I link with gcc instead (gcc hello-64-gas.obj -o hello-64-gas.exe), the program runs correctly without crashing. Am I right to assume that the object file generated by as should be compiler independent, since ABI compatibility is the responsibility of the assembly code writer? Since I am a beginner, any explanation of my mistakes or incorrect assumptions is appreciated.

Platform

Example

The following code does not link correctly:

# hello-64-gas.asm    print a string using printf
# Assemble:   as hello-64-gas.asm -o hello-64-gas.obj --64
# Link:       link -subsystem:CONSOLE hello-64-gas.obj -out:hello-64-gas.exe libcmt.lib libvcruntime.lib libucrt.lib legacy_stdio_definitions.lib
.intel_syntax noprefix

.global main

# Declare needed C  functions
.extern printf

.section .data
msg:       .asciz "Hello world"
fmt:       .asciz "%s(%d; %f)\n"
myDouble:   .double 2.33, -1.0

.text
main:
    sub rsp, 8*5
    mov rcx, offset flat: fmt
    mov rdx, offset flat: msg
    mov r8, 0xFF
    mov r9, offset flat: myDouble
    mov r9, [r9]
    movq xmm4, r9
    call printf
    add rsp, 8*5

    mov rax, 0
    ret

When debugging, it seems that mov r9, offset flat: myDouble is not relocated: the instruction remains mov r9,18h, where 18h would be correct only if the .data section were at address zero. Looking at the relocation table with objdump -dr hello-64-gas.obj yields:

...
19:   49 c7 c1 18 00 00 00    mov    $0x18,%r9
                      1c: R_X86_64_32S        .data
...

Variation (workaround?)

Replacing mov with movabs seems to work:

# hello-64-gas.asm    print a string using printf
# Assemble:       as hello-64-gas.asm -o hello-64-gas.obj --64
# Link:           link -subsystem:CONSOLE hello-64-gas.obj -out:hello-64-gas.exe libcmt.lib libvcruntime.lib libucrt.lib legacy_stdio_definitions.lib
.intel_syntax noprefix

.global main

# Declare needed C  functions
.extern printf

.section .data
msg:       .asciz "Hello world"
fmt:       .asciz "%s(%d; %f)\n"
myDouble:   .double 2.33, -1.0

.text
main:
    sub rsp, 8*5
    movabs rcx, offset flat: fmt
    movabs rdx, offset flat: msg
    mov r8, 0xFF
    movabs r9, offset flat: myDouble
    mov r9, [r9]
    movq xmm4, r9
    call printf
    add rsp, 8*5

    mov rax, 0
    ret

Somehow, this version runs correctly when linked with link.exe.

Upvotes: 3

Views: 386

Answers (1)

Ross Ridge

Reputation: 39591

The relocation that the GNU assembler uses for your references to myDouble, fmt and msg isn't supported by Microsoft's linker. This relocation, which the GNU utilities call R_X86_64_32S and which has a value of 0x11, isn't documented in Microsoft's PECOFF specification. Running Microsoft's DUMPBIN on your object file suggests that Microsoft's linker uses relocations with this value for some other, undocumented purpose:

RELOCATIONS #1
                                                Symbol    Symbol
 Offset    Type              Applied To         Index     Name
 --------  ----------------  -----------------  --------  ------
 00000007  EHANDLER                                    7  .data
 0000000E  EHANDLER                                    7  .data
 0000001C  EHANDLER                                    7  .data
 00000029  REL32                      00000000         C  printf

As a workaround, you can use any of the following:

  • a LEA instruction with RIP-relative addressing, which generates an R_X86_64_PC32/REL32 relocation
  • as you found out yourself, a MOVABS instruction, which generates an R_X86_64_64/ADDR64 relocation
  • a 32-bit MOV instruction, which generates an R_X86_64_32/ADDR32 relocation

In order, these would be written as:

lea r9, [rip + myDouble]
movabs r9, offset myDouble
mov r9d, offset myDouble

These, along with mov r9, offset myDouble, are four different instructions with different encodings and subtly different semantics, each requiring a different type of relocation.

The LEA instruction encodes myDouble as a 32-bit signed offset relative to RIP. This is the preferable instruction to use here, as it takes only 4 bytes to encode the address and it allows the executable to be loaded anywhere in the 64-bit address space. The only limitation is that the executable needs to be less than 2G in size, but this is a fundamental limitation of x64 PECOFF executables anyway.

The MOVABS instruction encodes myDouble as a 64-bit absolute address. While in theory this allows myDouble to be located anywhere in the 64-bit address space, even more than 2G away from the instruction, it takes 8 bytes of encoding space and doesn't actually gain you anything under Windows.

The 32-bit MOV instruction encodes myDouble as an unsigned 32-bit absolute address. It has the disadvantage of requiring the executable to be loaded somewhere in the first 4G of address space. Because of this, you need to use the /LARGEADDRESSAWARE:NO flag with the Microsoft linker; otherwise you'll get an error.

The 64-bit MOV instruction you're using encodes myDouble as a 32-bit signed absolute address. This also limits where the executable can be loaded, and it requires a relocation type that Microsoft's PECOFF specification doesn't document and that Microsoft's linker doesn't support.
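
Applied to the program from the question, the LEA form might look like the sketch below. It has not been run here and assumes the same assemble and link commands as in the question. Two details are my own changes rather than something the question requires: the double is loaded directly through a RIP-relative MOV instead of going through a pointer in r9, and the floating-point copy goes in xmm3 rather than xmm4, because the Microsoft x64 calling convention passes the fourth argument in r9/xmm3.

```asm
# hello-64-gas.asm    print a string using printf (RIP-relative variant)
# Assemble:   as hello-64-gas.asm -o hello-64-gas.obj --64
# Link:       link -subsystem:CONSOLE hello-64-gas.obj -out:hello-64-gas.exe libcmt.lib libvcruntime.lib libucrt.lib legacy_stdio_definitions.lib
.intel_syntax noprefix

.global main

# Declare needed C functions
.extern printf

.section .data
msg:       .asciz "Hello world"
fmt:       .asciz "%s(%d; %f)\n"
myDouble:  .double 2.33, -1.0

.text
main:
    sub rsp, 8*5                # 32 bytes of shadow space + 8 to keep rsp 16-byte aligned
    lea rcx, [rip + fmt]        # arg 1: format string (PC-relative relocation)
    lea rdx, [rip + msg]        # arg 2: %s
    mov r8d, 0xFF               # arg 3: %d (writing r8d zero-extends into r8)
    mov r9, [rip + myDouble]    # arg 4: %f, raw bits of the first double
    movq xmm3, r9               # varargs: the same value is also placed in xmm3
    call printf
    add rsp, 8*5

    xor eax, eax                # return 0
    ret
```

Checking the result with objdump -dr should show only PC-relative relocations against .data and printf, with no R_X86_64_32S entries left.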

Upvotes: 5
