x86 Function Calling Convention general structure for the layman?

Question

I am just getting into assembly (x86-64 in NASM on OSX), and now want to see how to implement functions in it.

tl;dr How do you implement a semi-realistic / practical set of functions in assembly using the x86 calling conventions? And how would you break it down and explain the parts of that set of functions, and how it works in assembly, to a layman?

I have found a couple of decent resources on them, but none of them give a realistic example that you can really sink your teeth into.

Here is a quick analogy. It's like if you were learning the dynamic nature of ruby and the tutorial just says this to you:

Run this in your console:

> puts "Hello World"

Now you can see how powerful ruby's dynamic nature truly can be.

It's like WHAT. What are you talking about, I have no idea from that example how to do anything.

The same is happening with Assembly and calling conventions. There are like no examples on the internet that provide any meaningful introduction to writing your own function in assembly. The resources I've dug into so far include:

I have also tried seeing the output of C compilers like gcc and clang on some simple C code, but it looks like they optimize out functions most of the time (not sure yet).

So my question is, how do you implement a semi-realistic / practical set of functions in assembly using the x86 calling conventions? How would you break it down and explain the parts of that set of functions, and how it works in assembly, to a layman?

I don't have a good example problem to try to implement yet, but maybe it is just a simple write function that you can pass any string to and it prints it to stdout. An equivalent in JavaScript would look like:

function write(string) {
  console.log(string);
}

function main() {
  write('Hello world');
}

If that's too simple maybe there is a better example. Any ideas?

David C. Rankin · Accepted Answer

Well, first of all, you need a good example to sink your teeth into, and you need to separate when calling conventions apply from when they don't. First, let's talk about functions in NASM itself, which, if I've distilled your question correctly, is your goal: "a semi-realistic / practical set of functions".

(This presumes Linux is the OS; there are differences between Linux/FreeBSD/MIPS/etc.) In NASM, all you need to do to define a function is to provide a label, the function code, and a ret (return).

Calling an assembly function within an assembly program itself requires no special calling convention per se. You are responsible for providing the proper data/addresses in the proper location as required by your function. That location can be anywhere: a register, the stack, a memory address, etc. When you are done processing data in your function, you simply end with ret (a return). So the basic syntax for an assembly function definition in NASM is:

label:
    stuff...
    ret

You then set up your registers as required and call your function in your code with:

call  label

On return, the registers are in whatever state the function left them. So you must also take care to provide temporary storage (e.g. on the stack) for any data or registers that will be clobbered by your function call.
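To make that concrete, here is a minimal sketch (assuming Linux x86_64 and the raw exit syscall; the function name count_down is made up purely for illustration). The caller needs the value in rcx after the call, the function overwrites rcx, so the caller saves it on the stack around the call:

```nasm
; sketch only: caller-side save/restore around a function that clobbers rcx
; assumes Linux x86_64; 'count_down' is a hypothetical function
section .text
    global _start

count_down:                         ; hypothetical function, overwrites rcx
        mov     rcx, 5
.loop:
        dec     rcx
        jnz     .loop
        ret                         ; rcx is 0 here, the caller's value is gone

_start:
        mov     rcx, 42             ; value the caller still needs afterwards
        push    rcx                 ; save it on the stack
        call    count_down          ; rcx is clobbered inside the function
        pop     rcx                 ; restore it: rcx is 42 again

        mov     rdi, rcx            ; use it as the exit status
        mov     rax, 60             ; exit syscall number
        syscall                     ; call kernel
```

Assembled with nasm -f elf64 and linked with ld, the program should exit with status 42 (check with echo $?). Remove the push/pop pair and the exit status becomes 0, which is exactly the clobbering problem described above.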

Calling conventions apply where you interface with an external language (e.g. calling a libc function from within assembly, or calling assembly routines/functions from within C). Here you have separate calling conventions for x86 and x86_64, and that in itself is an entirely different discussion. The registers involved are as explained in the comment, with additional subtleties regarding which registers are preserved, which get clobbered, and what the caller and the callee are each responsible for. If that is your inquiry, drop a comment and I'll see if I can point you in the right direction (it is no short subject).
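As a small taste of what that looks like, here is a hedged sketch of calling libc's puts from NASM under the System V AMD64 convention used on x86_64 Linux. The assumptions are mine, not from the question: Linux, linking through gcc so that libc and its startup code are pulled in, and the label msg, which is just an illustrative name. Under that convention the first six integer/pointer arguments go in rdi, rsi, rdx, rcx, r8, r9, the return value comes back in rax, the callee must preserve rbx, rbp and r12-r15, and the stack must be 16-byte aligned at the point of the call.

```nasm
; sketch only: calling a libc function from assembly (System V AMD64, Linux)
; assumed build steps:
;   nasm -f elf64 -o callputs.o callputs.asm
;   gcc  -o callputs callputs.o
        default rel                 ; use RIP-relative addressing by default
        extern  puts                ; provided by libc
        global  main                ; libc's startup code calls main

section .rodata
    msg db  "Hello from puts", 0    ; NUL-terminated, as C strings must be

section .text
main:
        sub     rsp, 8              ; entry rsp is 8 off a 16-byte boundary; realign
        lea     rdi, [msg]          ; 1st argument (pointer) goes in rdi
        call    puts wrt ..plt      ; puts may clobber rax, rcx, rdx, rsi, rdi, r8-r11
        xor     eax, eax            ; return value of main (0) goes in rax
        add     rsp, 8              ; undo the alignment adjustment
        ret
```

Note that nothing here sets up the length of the string: puts expects a NUL-terminated string in rdi because that is what the C prototype says, and the convention only dictates where that pointer goes. On macOS the same convention applies, but the object format is macho64 and C symbols carry a leading underscore (_puts, _main), so the build details differ.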

Below is hopefully an example you can "sink your teeth into" concerning the basic building blocks needed to build a semi-realistic / practical set of functions for use in assembly with NASM. In addition to the basic function syntax outlined above, NASM also provides simple macro capabilities that help augment or automate many simple tasks that you would otherwise write a function for. (The same rules apply: you are responsible for setting up the data/registers before using the macro.)

Below is your basic x86_64 "hello world" implemented in straight assembly, then again through function calls, and supplemented by macros for formatting, etc. These are the basic tools you have to work with in building your set of functions. Let me know if you have questions:

; macro to print all or part of a string
; takes two arguments:
;  1. address of string
;  2. number of characters to write to stdout
; note: overwrites rax, rdi, rsi, rdx (and syscall clobbers rcx, r11)
%macro  strn    2
        mov     rax, 1
        mov     rdi, 1
        mov     rsi, %1
        mov     rdx, %2
        syscall
%endmacro

section .data
    onln times 8 db 0xa     ; 8 newlines
    tab times 8 db 0x20     ; 8 spaces
    string1 db  0xa, "  Hello StackOverflow!!!", 0xa, 0xa, 0
    string2 db  "Hello Plain Assembly", 0  ; no pad or newlines

section .text
    global _start

    _start:
        ; first print string1 with no functions and no macros
        ; calculate the length of string
        mov     rdi, string1        ; string1 to destination index
        xor     rcx, rcx            ; zero rcx
        not     rcx                 ; set rcx = -1
        xor     al,al               ; zero the al register (initialize to NUL)
        cld                         ; clear the direction flag
        repnz   scasb               ; scan for NUL (dec rcx once per byte, NUL included)
        not     rcx                 ; flip all bits: rcx = bytes scanned = length + 1
        dec     rcx                 ; -1 to skip the null-terminator, rcx contains length
        mov     rdx, rcx            ; put length in rdx
        ; write string to stdout
        mov     rsi, string1        ; string1 to source index
        mov     rax, 1              ; syscall number 1 = write
        mov     rdi, rax            ; file descriptor 1 = stdout (same value as rax)
        syscall                     ; call kernel

        ; now print string2 using 'strprn'
        mov     rdi, string2        ; put string2 in rdi (as needed by strprn)
        call    strprn              ; call function strprn

        ; now let's setup a bit of formatting for string 2 & print it again
        strn    onln, 2             ; macro to output 2 newlines from 'onln' (string2 has none)
        strn    tab, 2              ; macro to indent by 2 chars (1st 2 spaces in tab)
        mov     rdi, string2        ; put string2 in rdi (as needed by strprn)
        call    strprn              ; call function strprn
        strn    onln, 2             ; macro to output 2 newlines from 'onln' after string2

        ; exit 
        xor     rdi,rdi             ; zero rdi (rdi holds the exit status)
        mov     rax, 0x3c           ; set syscall number to 60 (0x3c hex)
        syscall                     ; call kernel

; Two functions below:
; 'strsz'  (basic strlen())
; 'strprn' (basic puts())

; strsz computes the length of a NUL-terminated string.
; rdi - string address (overwritten: scasb advances it past the NUL)
; rdx - contains string length (returned)
; also overwrites rax (al) and rcx
section .text
        strsz:
                xor     rcx, rcx                ; zero rcx
                not     rcx                     ; set rcx = -1 (uses bitwise id: ~x = -x-1)
                xor     al,al                   ; zero the al register (initialize to NUL)
                cld                             ; clear the direction flag
                repnz scasb                     ; scan for NUL (dec rcx once per byte, NUL included)
                not     rcx                     ; flip all bits: rcx = length + 1
                dec     rcx                     ; -1 to skip the null-term, rcx contains length
                mov     rdx, rcx                ; size returned in rdx, ready to call write
                ret

; strprn writes a NUL-terminated string to stdout.
; rdi - string address
; the length is computed by calling strsz (returned in rdx)
; overwrites rax, rcx, rdx, rsi, rdi, r11
section .text
        strprn:
                push    rdi                     ; push string address onto stack
                call    strsz                   ; call strsz to get length
                pop     rsi                     ; pop string to rsi (source index)
                mov     rax, 0x1                ; put write/stdout number in rax (both 1)
                mov     rdi, rax                ; set destination index to rax (stdout)
                syscall                         ; call kernel
                ret

; assemble & link
;
;  nasm -f elf64 -o hello-stack_64_wfns.o hello-stack_64_wfns.asm
;  ld  -o hello-stack_64_wfns hello-stack_64_wfns.o

output:

$ ./bin/hello-stack_64_wfns

  Hello StackOverflow!!!

Hello Plain Assembly

  Hello Plain Assembly
