Reputation: 151
I've been working through Assembly Language Step by Step: Third Edition and am in the final chapter, "Heading out to C". I'm looking for a consistent way to convert the book's 32-bit code, which calls the C library (glibc) function puts, so that it runs on my 64-bit Ubuntu system. (I would like to follow the last 50 pages of the text, which presumably head deeper into the C [more geeky puns], but they build on an assembly base written in 32-bit code.) The code is:
SECTION .data ; Section containing initialised data
EatMsg: db "Eat at Joe's!",0
SECTION .text ; Section containing code
extern puts ; Simple "put string" routine from clib
global main ; Required so linker can find entry point
main:
push ebp ; Set up stack frame for debugger
mov ebp,esp
push ebx ; Must preserve ebp, ebx, esi, & edi
push esi
push edi
;;; Everything before this is boilerplate; use it for all ordinary apps!
push EatMsg ; Push address of message on the stack
call puts ; Call clib function for displaying strings
add esp,4 ; Clean stack by adjusting ESP back 4 bytes
;;; Everything after this is boilerplate; use it for all ordinary apps!
pop edi ; Restore saved registers
pop esi
pop ebx
mov esp,ebp ; Destroy stack frame before returning
pop ebp
ret ; Return control to Linux
The suggested nasm and linker commands are
nasm -f elf -g -F stabs eatclib.asm
gcc eatclib.o -o eatclib
The closest approximation to a solution that I've found is here: Call C functions from 64-bit assembly.
I have tried converting the extended registers to rbp, rsp, etc.; adjusting the stack pointer by 8 bytes instead of 4 after the call to puts; and adjusting the makefile to use:
nasm -f elf64 -g -F dwarf eatclib.asm
and
gcc eatclib.o -o eatclib -m64 -static
but got a segmentation fault.
My understanding of the C calling convention is still tenuous enough that I didn't get far in finding the fault when I stepped through the program in gdb (I am only somewhat familiar with the 32-bit conventions and not very familiar with C). The book is intended as an introduction for new assembly programmers with little to no C background.
Trying in the other direction, compiling a simple C program that calls puts with a string (using the gcc -S option) produces:
.file "SayHello.c"
.text
.section .rodata
.align 8
.LC0:
.string "This is based on an example from C Primer Plus"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
The compiled code ran, and I understand most of it except for the .cfi directives, what .rodata signifies, and why gcc stuck that @PLT on puts. This is of course gas syntax, and the text I'm using mostly features NASM.
I've also tried invoking the linker directly instead of going through gcc, using a line found on page 89 of Professional Assembly Language (by Richard Blum):
ld -dynamic-linker /lib/ld-linux.so.2 -o eatclib -lc eatclib.o
but I end up with linker errors much like ones I've encountered before:
ld: i386 architecture of input file `eatclib.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400250
makefile:2: recipe for target 'eatclib' failed
I've also tried passing the -m32 option to the linker, to no avail.
Anyhow, I'm looking for suggestions that will work. In my search I've seen examples where people suggest using apt-get to install new (actually old) libraries, but these seem to effectively gut the 64-bit stuff system-wide, which looks pretty drastic when I have been able to run previous 32-bit code with the -melf_i386 option passed to the linker.
Upvotes: 3
Views: 2235
Reputation: 151
Jester's suggestion to install gcc-multilib and then use the gcc -m32 argument worked with the 32-bit code. (This is definitely a duplicate of a question elsewhere on Stack Overflow; I saw the suggestion somewhere yesterday but didn't trust the overhaul of gcc that it seemed to require.)
Upvotes: 2
Reputation: 93034
To assemble and link 64 bit nasm code that uses the libc, type:
nasm -f elf64 program.asm
gcc -o program program.o
Depending on your system and programming style, you might need to pass -no-pie to gcc so it accepts position-dependent code.
It is not recommended to invoke the linker directly when linking in the libc because there is no stable way to pull in the C runtime initialisation code manually. Merely passing -lc to the linker is insufficient to get the libc to work correctly.
Note the elf64 format, which makes nasm emit a 64 bit object file. gcc works with 64 bit code on a 64 bit platform unless told otherwise, so no other options are needed. You may want to add debug symbols, but keep in mind that stabs is an obsolete format. You might want this:
nasm -f elf64 -gdwarf program.asm
Mechanically converting source code is more or less possible. Keep the following differences in mind:
- The general purpose registers have been extended to 64 bits. The 64 bit versions are named rax, rcx, rdx, rbx, rsp, rbp, rsi, and rdi.
- Eight new registers named r8 to r15 exist. Their 32 bit, 16 bit, and 8 bit versions are called r8d, r8w, r8b, etc.
- The calling convention is different. Instead of being pushed on the stack, the first six integer or pointer arguments are passed in rdi, rsi, rdx, rcx, r8, and r9. The registers rbx, rbp, rsp, r12, r13, r14, and r15 must be preserved by the callee; all other general purpose registers may be overwritten freely. Floating point arguments are passed and returned in SSE registers. If there are too many arguments, extra arguments are passed on the stack.
- The stack must be aligned to 16 bytes at every function call. Since the call instruction pushes 8 bytes and the push rbp instruction in the function prologue pushes another 8 bytes, this is the case by default unless you manually allocate space on the stack. Just remember to do so in increments of 16 bytes.
Here is the code from your question translated to 64 bit code. All changes have been marked:
SECTION .data
EatMsg: db "Eat at Joe's!",0
SECTION .text
extern puts
global main
main: ; function entry (stack alignment: 16 bytes + 8 bytes)
push rbp ; setup...
mov rbp, rsp ; the stack frame (stack now aligned to 16 bytes + 0 bytes)
; since we have so many registers, I only preserve those
; I want to use and that must be preserved, of which there
; are none in this program.
lea rdi, [rel EatMsg] ; load address of EatMsg into rdi
call puts ; call puts
; no cleanup needed as we have not pushed anything
pop rbp ; restore rbp
ret ; return
Note that I left out a bunch of boilerplate. lea is used to load the address of EatMsg instead of the simpler mov rdi, EatMsg so your program is position-independent. If you don't know what this means, you can safely ignore this tidbit until later.
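To show how the register-based calling convention extends beyond a single argument, here is a minimal sketch that calls printf with three arguments. It is not from the book: the labels Fmt and Name and the value 42 are made up for illustration. It also assumes a toolchain that builds position-independent executables by default, which is why it uses default rel and call ... wrt ..plt; with -no-pie a plain call printf should work as well.

```nasm
; Hypothetical example (not from the book): printf with three arguments.
        default rel                 ; make [Label] operands RIP-relative

SECTION .data
Fmt:    db "%s has %d seats", 10, 0 ; made-up format string, newline-terminated
Name:   db "Joe's", 0               ; made-up string argument

SECTION .text
extern printf
global main

main:
        push rbp                    ; entry: rsp is 16n+8, after the push: 16n
        mov  rbp, rsp
        lea  rdi, [Fmt]             ; 1st argument: format string
        lea  rsi, [Name]            ; 2nd argument: pointer for %s
        mov  edx, 42                ; 3rd argument: int for %d
        xor  eax, eax               ; variadic call: al = number of vector registers used (none)
        call printf wrt ..plt       ; call through the PLT so the code links as a PIE
        xor  eax, eax               ; return 0 from main
        pop  rbp
        ret
```

Assembled and linked with the nasm -f elf64 and gcc commands shown earlier, this should print "Joe's has 42 seats". The xor eax, eax before the call matters only for variadic functions such as printf; puts does not need it.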
Lastly, you can generally ignore cfi directives. They add metadata for exception handling which is only important when your code calls C++ functions that throw exceptions. They do not change the behaviour of the code itself.
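Coming back to the callee-saved registers and the 16 byte alignment rule from the list above, the following sketch shows what changes once you actually need one of the preserved registers. It is a hypothetical variation on the translated program that prints the message three times and keeps the loop counter in rbx; the loop and the label .again are my own additions, and the same position-independent assumptions (default rel, wrt ..plt) apply.

```nasm
; Hypothetical sketch: keep a loop counter in callee-saved rbx across calls.
        default rel                 ; make [Label] operands RIP-relative

SECTION .data
EatMsg: db "Eat at Joe's!",0

SECTION .text
extern puts
global main

main:
        push rbp                    ; rsp: 16n+8 -> 16n
        mov  rbp, rsp
        push rbx                    ; we use rbx, so we must save it; rsp: 16n -> 16n-8
        sub  rsp, 8                 ; padding so rsp is a multiple of 16 again
        mov  ebx, 3                 ; loop counter; puts must leave rbx intact
.again:
        lea  rdi, [EatMsg]          ; rdi may be clobbered by puts, so reload it each time
        call puts wrt ..plt
        dec  ebx
        jnz  .again                 ; repeat until the counter reaches zero
        xor  eax, eax               ; return 0 from main
        add  rsp, 8                 ; drop the padding
        pop  rbx                    ; restore the caller's rbx
        pop  rbp
        ret
```

The single push rbx would leave the stack misaligned by 8 bytes at the call, which is why it is paired with sub rsp, 8. Pushing a second callee-saved register instead of padding would restore alignment just as well.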
Upvotes: 6