Reputation: 229
I'm trying to link a single-module assembly language program assembled with yasm, and I get the following error from ld:
Undefined symbols for architecture x86_64:
"start", referenced from:
implicit entry/start for main executable
(maybe you meant: _start)
ld: symbol(s) not found for inferred architecture x86_64
I actually get this error on a semi-regular basis, so I imagine it's a fairly common problem, but somehow no one seems to have a satisfactory answer. Before anyone says this is a duplicate of a previous question, yeah, I know. Just as you can look at the huge text-wall of similarly-titled questions and see that this is a duplicate, so can I.
Compiler Error: Undefined symbols for architecture x86_64
Not applicable to my problem. I'm not coding in C++, and the solution given in that question is idiosyncratic to that language.
undefined symbol for architecture x86_64 in compiling C program
Also doesn't fix my problem, as I'm not trying to link multiple object files together.
Error Undefined symbols for architecture x86_64:
Solution has to do with a specific framework in a high-level language.
Compiler Error: Undefined symbols for architecture x86_64
Solution involves fixing a function prototype. Not applicable here for obvious reasons.
... You get the idea. Every past question I can find is solved by some idiosyncratic method that isn't applicable to my situation.
Please help me with this. I am so tired of getting this error time and time again and not being able to do anything about it because it's so poorly documented. IMHO the world desperately needs a GNU Dev Tools equivalent of the MS-DOS error code reference manual.
Additional information:
Operating system: Mac OS X El Capitan
Source listing:
segment .text
global _start
_start:
mov eax,1 ; 1 is the syscall number for exit
mov ebx,5 ; 5 is the value to return
int 0x80 ; execute a system call
Hexdump of the object file, showing that the symbol is indeed _start and not start:
00000000 cf fa ed fe 07 00 00 01 03 00 00 00 01 00 00 00 |................|
00000010 02 00 00 00 b0 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020 19 00 00 00 98 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000040 0c 00 00 00 00 00 00 00 d0 00 00 00 00 00 00 00 |................|
00000050 0c 00 00 00 00 00 00 00 07 00 00 00 07 00 00 00 |................|
00000060 01 00 00 00 00 00 00 00 5f 5f 74 65 78 74 00 00 |........__text..|
00000070 00 00 00 00 00 00 00 00 5f 5f 54 45 58 54 00 00 |........__TEXT..|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000090 0c 00 00 00 00 00 00 00 d0 00 00 00 00 00 00 00 |................|
000000a0 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 02 00 00 00 18 00 00 00 |................|
000000c0 dc 00 00 00 01 00 00 00 ec 00 00 00 08 00 00 00 |................|
000000d0 b8 01 00 00 00 bb 05 00 00 00 cd 80 01 00 00 00 |................|
000000e0 0f 01 00 00 00 00 00 00 00 00 00 00 00 5f 73 74 |............._st|
000000f0 61 72 74 00 |art.|
000000f4
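A minimal Python sketch of how the dump above can be checked without reading it by eye. It is not part of the build. The byte strings are copied from the hexdump, while the field layouts (the 64-bit Mach-O header, the LC_SYMTAB load command and the nlist_64 symbol entry) are assumed from Apple's mach-o/loader.h and mach-o/nlist.h headers, and the function names are hypothetical.

```python
import struct

# First 32 bytes of the object file (offsets 0x00..0x1f in the hexdump).
HEADER = bytes.fromhex(
    "cffaedfe 07000001 03000000 01000000"
    "02000000 b0000000 00000000 00000000"
)
# LC_SYMTAB load command (offsets 0xb8..0xcf in the hexdump).
SYMTAB_CMD = bytes.fromhex(
    "02000000 18000000 dc000000 01000000 ec000000 08000000"
)
# Symbol entry plus string table (offsets 0xdc..0xf3 in the hexdump).
TAIL_OFFSET = 0xDC
TAIL = bytes.fromhex(
    "01000000 0f010000 00000000 00000000"
    "005f7374 61727400"
)


def parse_header(data):
    """Decode the eight little-endian 32-bit fields of a 64-bit Mach-O header."""
    names = ("magic", "cputype", "cpusubtype", "filetype",
             "ncmds", "sizeofcmds", "flags", "reserved")
    return dict(zip(names, struct.unpack("<8I", data[:32])))


def first_symbol(symtab_cmd, tail, tail_offset):
    """Return the name of the first symbol-table entry."""
    _cmd, _size, symoff, _nsyms, stroff, strsize = struct.unpack("<6I", symtab_cmd)
    entry = tail[symoff - tail_offset:symoff - tail_offset + 16]
    # nlist_64: string index, type, section, description, value
    n_strx, _n_type, _n_sect, _n_desc, _n_value = struct.unpack("<IBBHQ", entry)
    strtab = tail[stroff - tail_offset:stroff - tail_offset + strsize]
    end = strtab.index(b"\0", n_strx)
    return strtab[n_strx:end].decode("ascii")


header = parse_header(HEADER)
print(hex(header["magic"]), hex(header["cputype"]), header["filetype"])
print(first_symbol(SYMTAB_CMD, TAIL, TAIL_OFFSET))
```

Under those assumptions the header decodes as a 64-bit Mach-O relocatable object for x86_64, and the only symbol in the table is _start, which matches what ld reports.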
Upvotes: 2
Views: 5851
Reputation: 47553
The code:
segment .text
global _start
_start:
mov eax,1 ; 1 is the syscall number for exit
mov ebx,5 ; 5 is the value to return
int 0x80 ; execute a system call
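suggests you are using a 32-bit Linux tutorial. I conclude this because the 32-bit Linux ABI passes system call arguments to the kernel in registers before int 0x80. OS/X is different. The linker error itself comes from the entry point name: as the message shows, ld is looking for a symbol named start, while your object file exports _start. The examples below therefore name the entry point start (telling ld which symbol to use with its -e option is another route). Beyond the name, 32-bit OS/X expects the system call arguments on the stack, pushed right to left. In 32-bit OS/X it would look like: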
global start
section .text
start:
; sys_write syscall
; See: https://opensource.apple.com/source/xnu/xnu-1504.3.12/bsd/kern/syscalls.master
; 4 AUE_NULL ALL { user_ssize_t write(int fd, user_addr_t cbuf, user_size_t nbyte); }
push dword msg.len ; Last argument is length
push dword msg ; 2nd last is pointer to string
push dword 1 ; 1st argument is File descriptor (1=STDOUT)
mov eax, 4 ; eax = 4 is write system call
sub esp, 4 ; 32-bit OS/X code always needs to allocate an extra 4 bytes on the stack
int 0x80
; sys_exit
; 1 AUE_EXIT ALL { void exit(int rval); }
push dword 42 ; Return value
mov eax, 1 ; eax=1 is exit system call
sub esp, 4 ; allocate 4 bytes on stack
int 0x80
section .rodata
msg: db "Hello, world!", 10
.len: equ $ - msg
Assemble and link with:
nasm -f macho testexit.asm
ld -macosx_version_min 10.7.0 -o testexit testexit.o
./testexit
echo $?
YASM parameters should be the same as NASM. It should output:
Hello, world!
42
Rules of thumb for system calls in 32-bit OS/X code:
- int 0x80 does not need to have a 16-byte aligned stack.
- An additional 4 bytes need to be allocated on the stack after the parameters are pushed and before the system call. Examples: sub esp, 4 or push eax
- The system call number goes in the EAX register.
- The system call is made with int 0x80.
The OS/X system calls are documented by Apple on their website.
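To make the stack rule above concrete, here is a small Python model (not assembly, and only a sketch) of what sits on the stack at the moment int 0x80 executes in the 32-bit example. The function name int80_stack and the address 0x1000 standing in for msg are hypothetical. The extra 4-byte slot is modelled as None because sub esp, 4 leaves its contents unspecified.

```python
def int80_stack(args):
    """Return the dwords on the stack, starting at [esp], for a 32-bit OS/X syscall.

    args is the argument list in declaration order, e.g. (fd, buf, nbyte).
    """
    stack = []  # index 0 is the top of the stack, i.e. [esp]
    for arg in reversed(args):           # push right to left
        stack.insert(0, arg & 0xFFFFFFFF)
    stack.insert(0, None)                # sub esp, 4: the extra 4-byte slot
    return stack


# write(1, msg, 14) with msg assumed to live at 0x1000
layout = int80_stack([1, 0x1000, 14])
for i, value in enumerate(layout):
    print(f"[esp+{4 * i}] = {value}")
```

In this model the first argument ends up at [esp+4], the position it would have if the system call had been reached through an ordinary function call.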
64-bit OS/X uses pretty much the same kernel calling convention as 64-bit Linux, so the 64-bit Linux System V ABI applies to system calls, in particular section A.2, AMD64 Linux Kernel Conventions. That section has these rules:
- User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
- A system-call is done via the syscall instruction. The kernel destroys registers %rcx and %r11.
- The number of the syscall has to be passed in register %rax.
- System-calls are limited to six arguments, no argument is passed directly on the stack.
- Returning from the syscall, register %rax contains the result of the system-call. A value in the range between -4095 and -1 indicates an error, it is -errno.
- Only values of class INTEGER or class MEMORY are passed to the kernel.
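The rules quoted above can be sketched as a small Python model. It only illustrates the rules as quoted from the Linux ABI document and makes no claim about how OS/X reports errors. The function names and the example system call number 1 are hypothetical.

```python
# Register sequences from the quoted section: only the fourth slot differs.
USER_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")
KERNEL_ARG_REGS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")


def syscall_registers(number, *args):
    """Map a syscall number and its arguments onto the kernel-interface registers."""
    if len(args) > len(KERNEL_ARG_REGS):
        raise ValueError("system calls are limited to six arguments")
    regs = {"rax": number}
    regs.update(zip(KERNEL_ARG_REGS, args))
    return regs


def decode_result(rax):
    """Interpret a raw 64-bit rax value: -4095..-1 means -errno."""
    signed = rax - (1 << 64) if rax >= (1 << 63) else rax
    if -4095 <= signed <= -1:
        return ("error", -signed)
    return ("ok", signed)


print(syscall_registers(1, 1, 0x1000, 14))
print(decode_result(14))
print(decode_result(0xFFFFFFFFFFFFFFFE))
```

The fourth argument goes in r10 rather than rcx because, as the quoted rules say, the syscall instruction destroys rcx and r11.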
64-bit OS/X uses the same System Call numbers as 32-bit OS/X, however all the numbers have to have 0x02000000 added to them. The code above can be modified to work as a 64-bit OS/X program:
global start
section .text
start:
mov eax, 0x2000004 ; write system call
mov edi, 1 ; stdout = 1
mov rsi, msg ; address of the message to print
;lea rsi, [rel msg]; Alternative way using RIP relative addressing
mov edx, msg.len ; length of message
syscall ; Use syscall, NOT int 0x80
mov eax, 0x2000001 ; exit system call
mov edi, 42 ; return 42 when exiting
syscall
section .rodata
msg: db "Hello, world!", 10
.len: equ $ - msg
Please note that when writing to a 32-bit register, the CPU automatically zero extends to the 64-bit register. The code above uses this feature by writing to registers like EAX, EDI instead of RAX and RDI. You could have used the 64-bit registers but using the 32-bit registers saves a byte in the code.
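As a quick check of the arithmetic, the following Python sketch derives the two 64-bit system call numbers used above from their 32-bit values and models the zero extension. The names SYSCALL_OFFSET, BSD_SYSCALLS, syscall_number_64 and write_32bit are hypothetical and only serve the illustration.

```python
SYSCALL_OFFSET = 0x2000000  # value added to every system call number in 64-bit code

# 32-bit system call numbers used in the examples (from syscalls.master)
BSD_SYSCALLS = {"exit": 1, "write": 4}


def syscall_number_64(name):
    """Return the number to load into EAX/RAX for a 64-bit OS/X system call."""
    return BSD_SYSCALLS[name] + SYSCALL_OFFSET


def write_32bit(old_reg64, value):
    """Model a write to a 32-bit register: the upper 32 bits of the old value are cleared."""
    return value & 0xFFFFFFFF


print(hex(syscall_number_64("write")))
print(hex(syscall_number_64("exit")))
print(hex(write_32bit(0xDEADBEEF00000000, syscall_number_64("write"))))
```

Whatever the upper half of RAX held beforehand, the model leaves only the 32-bit value, which is why mov eax, 0x2000004 is enough to set the whole register.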
Assemble and link with:
nasm -f macho64 testexit64.asm
ld -macosx_version_min 10.7.0 -lSystem -o testexit64 testexit64.o
./testexit64
echo $?
It should output:
Hello, world!
42
Note: Some of this information is similar in nature to this OS/X tutorial with some corrections and coding bugs fixed.
Upvotes: 8