user628544

Reputation: 229

ld linker error - Undefined symbols for architecture x86_64

I'm trying to link a single-module assembly language program assembled with yasm and I get the following error from ld:

Undefined symbols for architecture x86_64:
  "start", referenced from:
     implicit entry/start for main executable
     (maybe you meant: _start)
ld: symbol(s) not found for inferred architecture x86_64

I actually get this error on a semi-regular basis, so I imagine it's a fairly common problem, but somehow no one seems to have a satisfactory answer. Before anyone says this is a duplicate of a previous question, yeah, I know. Just as you can look at the huge text-wall of similarly-titled questions and see that this is a duplicate, so can I.

Compiler Error: Undefined symbols for architecture x86_64

Not applicable to my problem. I'm not coding in C++, and the solution given in that question is idiosyncratic to that language.

undefined symbol for architecture x86_64 in compiling C program

Also doesn't fix my problem, as I'm not trying to link multiple object files together.

Error Undefined symbols for architecture x86_64:

Solution has to do with a specific framework in a high-level language.

Compiler Error: Undefined symbols for architecture x86_64

Solution involves fixing a function prototype. Not applicable here for obvious reasons.

... You get the idea. Every past question I can find is solved by some idiosyncratic method that isn't applicable to my situation.

Please help me with this. I am so tired of getting this error time and time again and not being able to do anything about it because it's so poorly documented. IMHO the world desperately needs a GNU Dev Tools equivalent of the MS-DOS error code reference manual.

Additional information:

Operating system: Mac OS X El Capitan

Source listing:

segment .text
global _start

_start:
    mov     eax,1   ; 1 is the syscall number for exit
    mov     ebx,5   ; 5 is the value to return
    int     0x80    ; execute a system call

Hexdump of the object file, showing that the symbol is indeed _start and not start:

00000000  cf fa ed fe 07 00 00 01  03 00 00 00 01 00 00 00  |................|
00000010  02 00 00 00 b0 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  19 00 00 00 98 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  0c 00 00 00 00 00 00 00  d0 00 00 00 00 00 00 00  |................|
00000050  0c 00 00 00 00 00 00 00  07 00 00 00 07 00 00 00  |................|
00000060  01 00 00 00 00 00 00 00  5f 5f 74 65 78 74 00 00  |........__text..|
00000070  00 00 00 00 00 00 00 00  5f 5f 54 45 58 54 00 00  |........__TEXT..|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000090  0c 00 00 00 00 00 00 00  d0 00 00 00 00 00 00 00  |................|
000000a0  00 00 00 00 00 00 00 00  00 00 00 80 00 00 00 00  |................|
000000b0  00 00 00 00 00 00 00 00  02 00 00 00 18 00 00 00  |................|
000000c0  dc 00 00 00 01 00 00 00  ec 00 00 00 08 00 00 00  |................|
000000d0  b8 01 00 00 00 bb 05 00  00 00 cd 80 01 00 00 00  |................|
000000e0  0f 01 00 00 00 00 00 00  00 00 00 00 00 5f 73 74  |............._st|
000000f0  61 72 74 00                                       |art.|
000000f4

Upvotes: 2

Views: 5851

Answers (1)

Michael Petch

Reputation: 47553

32-bit OS/X Code Making System Calls via int 0x80

The code:

segment .text
global _start

_start:
    mov     eax,1   ; 1 is the syscall number for exit
    mov     ebx,5   ; 5 is the value to return
    int     0x80    ; execute a system call

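This code suggests you are following a 32-bit Linux tutorial: the 32-bit Linux ABI passes system call arguments to the kernel in registers before int 0x80. OS/X is different. Arguments are passed on the stack, pushed right to left. As for the linker error itself, the OS/X linker looks for an entry point named start by default (hence the hint "maybe you meant: _start"), so either name the label start or override the entry point with ld's -e option. In 32-bit OS/X it would look like: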

global start

section .text
start:
    ; sys_write syscall
    ; See: https://opensource.apple.com/source/xnu/xnu-1504.3.12/bsd/kern/syscalls.master
    ; 4 AUE_NULL ALL { user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte); }
    push    dword msg.len  ; Last argument is length
    push    dword msg      ; 2nd last is pointer to string
    push    dword 1        ; 1st argument is File descriptor (1=STDOUT)
    mov     eax, 4         ; eax = 4 is write system call
    sub     esp, 4         ; 32-bit OS/X code always needs an extra 4 bytes allocated on the stack
    int     0x80

    ; sys_exit
    ; 1 AUE_EXIT ALL { void exit(int rval); }
    push    dword 42       ; Return value
    mov     eax, 1         ; eax=1 is exit system call
    sub     esp, 4         ; allocate 4 bytes on stack
    int     0x80

section .rodata

msg:    db      "Hello, world!", 10
.len:   equ     $ - msg

Assemble and link with:

nasm -f macho testexit.asm
ld -macosx_version_min 10.7.0 -o testexit testexit.o
./testexit
echo $?

The same parameters should work with YASM. The program should output:

Hello, world!
42

Rules of thumb for system calls in 32-bit OS/X code:

  • Parameters are passed right to left on the stack
  • int 0x80 does not need a 16-byte aligned stack
  • An additional 4 bytes need to be allocated on the stack after the parameters are pushed and before the system call. Examples:
    1. sub esp, 4
    2. push eax
  • System call number in the EAX register
  • System call initiated via int 0x80
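
Applied to the program from the question, which only exits with status 5, these rules might give the following minimal sketch. It is untested here and assumes the same nasm -f macho and ld invocation shown above; the file name exit5.asm is hypothetical.

```nasm
; exit5.asm (hypothetical file name): 32-bit OS/X version of the question's program
global start                ; start is the entry symbol the OS/X linker looks for by default

section .text
start:
    ; 1 AUE_EXIT ALL { void exit(int rval); }
    push    dword 5         ; only argument of exit: the return value
    mov     eax, 1          ; eax = 1 is the exit system call
    sub     esp, 4          ; extra 4 bytes required before the system call
    int     0x80            ; exit does not return
```

Note that the argument goes on the stack rather than in EBX, which is the main difference from the Linux-style original.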

The OS/X system calls are documented by Apple on their website.


64-bit OS/X Code Making System Calls via SYSCALL instruction

64-bit OS/X uses pretty much the same kernel calling convention as 64-bit Linux, so the 64-bit Linux System V ABI applies to system calls, in particular section A.2, AMD64 Linux Kernel Conventions. That section has these rules:

  1. User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
  2. A system-call is done via the syscall instruction. The kernel destroys registers %rcx and %r11.
  3. The number of the syscall has to be passed in register %rax.
  4. System-calls are limited to six arguments, no argument is passed directly on the stack.
  5. Returning from the syscall, register %rax contains the result of the system-call. A value in the range between -4095 and -1 indicates an error, it is -errno.
  6. Only values of class INTEGER or class MEMORY are passed to the kernel.

64-bit OS/X uses the same system call numbers as 32-bit OS/X, but 0x02000000 must be added to each number. The 32-bit example above can be modified to work as a 64-bit OS/X program:

global start
section .text

start:
    mov     eax, 0x2000004 ; write system call
    mov     edi, 1         ; stdout = 1
    mov     rsi, msg       ; address of the message to print
    ;lea     rsi, [rel msg]; Alternative way using RIP relative addressing
    mov     edx, msg.len   ; length of message
    syscall                ; Use syscall, NOT int 0x80

    mov     eax, 0x2000001 ; exit system call
    mov     edi, 42        ; return 42 when exiting
    syscall

section .rodata

msg:    db      "Hello, world!", 10
.len:   equ     $ - msg

Note that writing to a 32-bit register automatically zero-extends the value into the full 64-bit register. The code above relies on this by writing to EAX and EDI instead of RAX and RDI. The 64-bit registers would also work, but using the 32-bit registers saves a byte in the code.

Assemble and link with:

nasm -f macho64 testexit64.asm
ld -macosx_version_min 10.7.0 -lSystem -o testexit64 testexit64.o
./testexit64 
echo $?

It should output:

Hello, world!
42
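
For comparison, the program from the question, which only exits with status 5, could be ported to 64-bit OS/X along the same lines. This is a minimal, untested sketch; the file name exit5_64.asm is hypothetical, and it assumes the same nasm -f macho64 and ld invocation shown above.

```nasm
; exit5_64.asm (hypothetical file name): 64-bit OS/X version of the question's program
global start                ; start, not _start, so the default entry symbol is found

section .text
start:
    mov     eax, 0x2000001  ; exit system call (1 + 0x2000000)
    mov     edi, 5          ; first argument: the return value from the question
    syscall                 ; exit does not return
```

If it behaves as intended, echo $? should report 5 after running it.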

Note: Some of this information is similar in nature to this OS/X tutorial with some corrections and coding bugs fixed.
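
The examples above never pass more than three arguments, so they do not show rule 1 of the kernel conventions: the fourth argument goes in R10 rather than RCX. The following untested sketch illustrates that with a six-argument call. The mmap system call number (197) and the flag values are assumptions taken from the Darwin headers and syscalls.master, not from the text above, so check them against your SDK before relying on them.

```nasm
; mmap_r10.asm (hypothetical file name): six-argument system call on 64-bit OS/X
global start

section .text
start:
    mov     eax, 0x20000C5  ; mmap = 197 + 0x2000000 (number is an assumption)
    xor     edi, edi        ; arg 1 (rdi): addr = 0, let the kernel pick an address
    mov     esi, 4096       ; arg 2 (rsi): length in bytes
    mov     edx, 3          ; arg 3 (rdx): PROT_READ | PROT_WRITE (assumed values)
    mov     r10d, 0x1002    ; arg 4 (r10, NOT rcx): MAP_ANON | MAP_PRIVATE (assumed values)
    mov     r8, -1          ; arg 5 (r8): fd = -1 for an anonymous mapping
    xor     r9d, r9d        ; arg 6 (r9): offset = 0
    syscall                 ; rcx and r11 are destroyed by the kernel here

    mov     eax, 0x2000001  ; exit system call
    xor     edi, edi        ; return 0
    syscall
```

The mapping is not used for anything; the point is only the register assignment, in particular that a value left in RCX would be ignored and overwritten by the syscall instruction.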

Upvotes: 8
