muodostus

Reputation: 151

Trying to call C function from glibc from Assembly program (64 bit)

I've been working through Assembly Language Step by Step: Third Edition and am in the final chapter, "Heading out to C". I'm trying to find a consistent method of converting the 32-bit code, which calls the C library (glibc) function puts, so that it runs on my 64-bit Ubuntu system. (I would like to follow the last 50 pages of the text, which presumably head deeper into the C [more geeky puns], but they build on an assembly base written in 32-bit code.) The code is:

SECTION .data           ; Section containing initialised data
EatMsg: db "Eat at Joe's!",0

SECTION .text           ; Section containing code
extern puts             ; Simple "put string" routine from clib
global main             ; Required so linker can find entry point
main:
        push ebp        ; Set up stack frame for debugger
        mov ebp,esp
        push ebx        ; Must preserve ebp, ebx, esi, & edi
        push esi
        push edi

;;; Everything before this is boilerplate; use it for all ordinary apps!
        push EatMsg     ; Push address of message on the stack
        call puts       ; Call clib function for displaying strings
        add esp,4       ; Clean stack by adjusting ESP back 4 bytes

;;; Everything after this is boilerplate; use it for all ordinary apps!
        pop edi         ; Restore saved registers
        pop esi
        pop ebx
        mov esp,ebp     ; Destroy stack frame before returning
        pop ebp
        ret             ; Return control to Linux

The suggested nasm and linker commands are

nasm -f elf -g -F stabs eatclib.asm
gcc eatclib.o -o eatclib

The closest approximation to a solution that I've found is here: Call C functions from 64-bit assembly.

I have tried converting the registers to their 64-bit versions (rbp, rsp, etc.), adjusting the stack pointer by 8 bytes instead of 4 after the call to puts, and adjusting the makefile to use:

nasm -f elf64 -g -F dwarf eatclib.asm

and

gcc eatclib.o -o eatclib -m64 -static

but got a segmentation fault.

My understanding of the C calling convention is still tenuous enough that I didn't get far in finding the fault when I tried to follow along in gdb. (The problems are that I am only somewhat familiar with the 32-bit conventions and not very familiar with C.) This book is intended as an introduction for newbie assembly programmers with little to no C background.

Trying in the other direction, a simple C program that uses puts with a string produces the file (using the gcc -S option) of:

.file   "SayHello.c"
        .text
        .section        .rodata
        .align 8
.LC0:
        .string "This is based on an example from C Primer Plus"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        leaq    .LC0(%rip), %rdi
        call    puts@PLT
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret

The compiled code here ran, and I understand most of it except for the .cfi directives, what .rodata signifies, and why gcc stuck that @PLT on puts. This is of course gas syntax, and the text I'm using mostly features NASM.

I've also tried invoking the linker directly instead of going through gcc, with a line found on page 89 of Professional Assembly Language (by Richard Blum):

ld -dynamic-linker /lib/ld-linux.so.2 -o eatclib -lc eatclib.o

but I end up with linker errors much like ones I've encountered before:

ld: i386 architecture of input file `eatclib.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400250
makefile:2: recipe for target 'eatclib' failed

I've tried passing the -m32 option to the linker to no avail too.

Anyhow, I'm looking for suggestions that will work. In my search I've seen examples where people suggest using apt-get to install new (actually old) libraries, but these seem to effectively gut the 64-bit stuff system-wide, which looks pretty drastic when I have been able to run previous 32-bit code with the -melf_i386 option passed to the linker.

Upvotes: 3

Views: 2235

Answers (2)

muodostus

Reputation: 151

Jester's suggestion to install gcc-multilib and then use the gcc -m32 argument worked with the 32-bit code. (This is definitely a duplicate from elsewhere on stackoverflow... saw the suggestion somewhere yesterday but didn't trust the overhaul of gcc that it seemed to require.)

Upvotes: 2

fuz

Reputation: 93034

To assemble and link 64 bit nasm code that uses the libc, type:

nasm -f elf64 program.asm
gcc -o program program.o

Depending on your system and programming style, you might need to pass -no-pie to gcc so it accepts position-dependent code.

It is not recommended to invoke the linker directly when linking in the libc because there is no stable way to pull in the C runtime initialisation code manually. Merely passing -lc to the linker is insufficient to get the libc to work correctly.

Note the -f elf64, which makes nasm emit a 64 bit object file. gcc works with 64 bit code on a 64 bit platform unless told otherwise, so no other options are needed. You may want to add debug symbols, but keep in mind that stabs is an obsolete format. You might want this:

nasm -f elf64 -gdwarf program.asm

Mechanically converting source code is more or less possible. Keep the following differences in mind:

  • pointers and stack slots are 8 bytes long and all general purpose registers have been extended to 8 bytes; the 64 bit variants of the first 8 registers are called rax, rcx, rdx, rbx, rsp, rbp, rsi, and rdi.
  • 8 new general purpose registers r8 to r15 exist. Their 32 bit, 16 bit, and 8 bit versions are called r8d, r8w, r8b, etc.
  • SSE instructions are used for floating point instead of x87 instructions
  • 64 bit code generally obeys a different calling convention than 32 bit code. On UNIX-like systems such as Linux, the amd64 SysV ABI is generally used. In this ABI, integer and pointer arguments are passed from left to right in the registers rdi, rsi, rdx, rcx, r8, and r9, and integer or pointer results are returned in rax. The registers rbx, rbp, rsp, r12, r13, r14, and r15 must be preserved by the callee; all other general purpose registers may be overwritten freely. Floating point arguments are passed and returned in SSE registers. If there are too many arguments, the extra arguments are passed on the stack. When calling a variadic function such as printf, al must additionally hold an upper bound on the number of vector registers used for arguments.
  • The SysV ABI demands that the stack pointer is aligned to 16 bytes on function call. Since the call instruction pushes 8 bytes and the push rbp instruction in the function prologue pushes another 8 bytes, this is the case by default unless you manually allocate space on the stack. Just remember to do so in increments of 16 bytes.
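
As a rough illustration of the points above, here is a minimal sketch of a call that passes several arguments and keeps a value in a callee-saved register. The labels Fmt and Name, the strings, and the value 42 are made up for this example and do not come from the book. It assumes NASM targeting elf64 and linking through gcc as described earlier; the "wrt ..plt" suffix is NASM's way of asking for a call through the PLT.

```nasm
        default rel             ; assumption: use RIP-relative addressing for labels

        SECTION .rodata
Fmt:    db "%s sold %d meals", 10, 0    ; hypothetical format string
Name:   db "Joe", 0                     ; hypothetical data

        SECTION .text
        extern printf
        global main
main:                           ; on entry: rsp is 16-byte aligned + 8
        push rbx                ; rbx is callee-saved; this push also realigns rsp to 16
        mov ebx, 42             ; keep a value in a register that survives the call

        lea rdi, [Fmt]          ; 1st argument: format string
        lea rsi, [Name]         ; 2nd argument: pointer for %s
        mov edx, ebx            ; 3rd argument: int for %d
        xor eax, eax            ; variadic call: al = 0 vector registers used
        call printf wrt ..plt   ; call through the PLT

        xor eax, eax            ; return 0 from main
        pop rbx                 ; restore the callee-saved register
        ret
```

Here the single push rbx plays the role that push rbp plays elsewhere: it saves a register the caller expects to be preserved and, as a side effect, brings the stack back to a 16-byte boundary before the call.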

Here is the code from your question translated to 64 bit code. The changes are explained in the comments:

        SECTION .data
EatMsg: db "Eat at Joe's!",0

        SECTION .text
        extern puts
        global main
main:                           ; function entry (stack alignment: 16 bytes + 8 bytes)
        push rbp                ; setup...
        mov rbp, rsp            ; the stack frame (stack now aligned to 16 bytes + 0 bytes)

                                ; since we have so many registers, I only preserve those
                                ; I want to use and that must be preserved, of which there
                                ; are none in this program.

        lea rdi, [rel EatMsg]   ; load address of EatMsg into rdi
        call puts               ; call puts
                                ; no cleanup needed as we have not pushed anything

        pop rbp                 ; restore rbp
        ret                     ; return

Note that I left out a bunch of boilerplate. lea is used to load the address of EatMsg instead of the simpler mov rdi, EatMsg so your program is position-independent. If you don't know what this means, you can safely ignore this tidbit until later.
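
If your gcc builds position-independent executables by default, a variant along the following lines should, as far as I know, link without -no-pie. Treat it as a sketch rather than the one true form: moving the string to .rodata, using default rel, calling through the PLT with "wrt ..plt", and returning 0 are my own choices here, not something the book's listing does.

```nasm
        default rel             ; make [EatMsg] RIP-relative without writing "rel"

        SECTION .rodata         ; read-only data; the string is never modified
EatMsg: db "Eat at Joe's!",0

        SECTION .text
        extern puts
        global main
main:
        push rbp                ; set up the stack frame (also aligns rsp to 16)
        mov rbp, rsp

        lea rdi, [EatMsg]       ; 1st argument: address of the string
        call puts wrt ..plt     ; call puts through the PLT

        xor eax, eax            ; return 0 from main
        pop rbp
        ret
```

It is assembled and linked with the same two commands as before (nasm -f elf64, then gcc).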

Lastly, you can generally ignore cfi directives. They add stack unwinding metadata, which debuggers use to produce backtraces and which exception handling needs when your code calls C++ functions that throw exceptions. They do not change the behaviour of the code itself.

Upvotes: 6
