C - external assembly function returning different results with the same input

Question

I have a program in C which uses a NASM function. Here is the code of the C program:

#include <stdio.h>
#include <string.h>

extern float hyp(float a); // supposed to calculate 1/(2 - a) + 6

void test(float (*f)(float)){
    printf("%f %f %f\n", f(2.1), f(2.1), f(2.1));
}

void main(int argc, char** argv){
    for(int i = 1; i < argc; i++){
        if(!strcmp(argv[i], "calculate")){
            test(hyp);
        }
    }
}

And here is the NASM function:

section .data
    a dd 1.0
    b dd 2.0
    c dd 6.0

section .text
global hyp
hyp:
    push ebp
    mov ebp, esp
    finit

    fld dword[b]
    fsub dword[ebp + 8]
    fstp dword[b]
    fld dword[a]
    fdiv dword[b]
    fadd dword[c]

    mov esp, ebp
    pop ebp
    ret

These programs were linked in Linux with gcc and nasm. Here is the Makefile:

all: project clean
main.o: main.c
    gcc -c main.c -o main.o -m32 -std=c99
hyp.o: hyp.asm
    nasm -f elf32 -o hyp.o hyp.asm -D UNIX
project: main.o hyp.o
    gcc -o project main.o hyp.o -m32 -lm
clean:
    rm -rf *.o

When the program is run, it outputs this:

5.767442 5.545455 -4.000010

The last number is correct. My question is: why do these results differ even though the input is the same?

zwol · Accepted Answer

The bug is that you do this:

fstp dword[b]

That overwrites the value of b, so the next time you call the function, the constant is wrong. In the overall program's output, this shows up as the rightmost evaluation being the only correct one, because the compiler evaluated the arguments to printf from right to left. (It is allowed to evaluate the arguments to a multi-argument function in any order it wants.)
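
The same drift can be reproduced in plain C. This is only a rough model of the assembly, not a translation of it: `b_cell`, `hyp_model` and `nth_call_result` are names made up for this sketch, and `b_cell` stands in for the memory at `b`. The calls are made one after another in a loop, so the order is fixed and the compiler's choice of argument evaluation order does not enter into it.

```c
/* Stands in for the memory cell `b` in the .data section (hypothetical name). */
static float b_cell = 2.0f;

/* Rough C model of the buggy routine: the intermediate 2 - x is written
   back over the "constant", so every later call starts from a stale value. */
float hyp_model(float x)
{
    b_cell = b_cell - x;          /* fld [b]; fsub x; fstp [b] */
    return 1.0f / b_cell + 6.0f;  /* fld [a]; fdiv [b]; fadd [c] */
}

/* Result of the n-th consecutive call hyp_model(x), starting from a
   freshly initialised b_cell. */
float nth_call_result(float x, int n)
{
    float result = 0.0f;
    b_cell = 2.0f;
    for (int i = 0; i < n; i++)
        result = hyp_model(x);
    return result;
}
```

With x = 2.1f, the first, second and third calls come out near -4.00001, 5.545455 and 5.767442, which is the question's output read from right to left.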

You should have used the .rodata section for your constants; then the program would crash rather than overwrite a constant.
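
The C-level analogue of that habit is to declare such constants `const` and keep intermediates in locals. A minimal sketch, with hypothetical names (`k_one`, `k_two`, `k_six`, `hyp_reference`); GCC on ELF targets usually places static const objects like these in a read-only section, but that is a toolchain detail, not something the C standard promises.

```c
/* Constants declared const: an accidental assignment such as
   `k_two = ...` is rejected by the compiler. */
static const float k_one = 1.0f;
static const float k_two = 2.0f;
static const float k_six = 6.0f;

/* Computes 1/(2 - x) + 6, keeping the intermediate in a local variable
   instead of writing it back over a constant. */
float hyp_reference(float x)
{
    float denom = k_two - x;
    return k_one / denom + k_six;
}
```

A function like this can also serve as a reference to compare the assembly routine against, since repeated calls with the same argument give the same result.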

You can avoid needing to store and reload an intermediate value by using fdivr instead of fdiv.

hyp:
    fld     dword [b]           ; st0 = 2.0
    fsub    dword [esp+4]       ; st0 = 2.0 - x (x is the argument)
    fdivr   dword [a]           ; st0 = 1.0 / st0 (reverse divide)
    fadd    dword [c]           ; st0 = 1/(2 - x) + 6
    ret

Alternatively, do what a Forth programmer would do, and load the constant 1 before everything else, so it's in ST(1) when it needs to be. This allows you to use fld1 instead of putting 1.0 in memory.

hyp:
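    fld1                        ; st0 = 1.0
    fld     dword [b]           ; st0 = 2.0, st1 = 1.0
    fsub    dword [esp+4]       ; st0 = 2.0 - x (x is the argument)
    fdivp                       ; st1 = st1 / st0, then pop: st0 = 1/(2 - x)
    fadd    dword [c]           ; st0 = 1/(2 - x) + 6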
    ret

You do not need to issue a finit, because the ABI guarantees that this was already done during process startup. You do not need to set up EBP for this function, as it does not make any function calls itself (the jargon term for this is "leaf procedure"), nor does it need any scratch space on the stack.

Another alternative, if you have a modern CPU, is to use the newer SSE instructions (the scalar single-precision ones used here; SSE2 adds the double-precision equivalents). That gives you normal registers instead of an operand stack, and also means the calculations are all actually done in float instead of 80-bit extended. That can be very important: some complex numerical algorithms will malfunction if they have more floating-point precision than their designers expected. Because you're using the 32-bit ELF ABI, though, the return value still needs to wind up in ST(0), and since there are no direct move instructions between SSE and x87 registers, you have to go through memory. I don't know how to write SSE instructions in Intel syntax, sorry, so this is in AT&T (GNU as) syntax.

hyp:
    subl    $4, %esp
    movss   b, %xmm1
    subss   8(%esp), %xmm1
    movss   a, %xmm0
    divss   %xmm1, %xmm0
    addss   c, %xmm0
    movss   %xmm0, (%esp)
    flds    (%esp)
    addl    $4, %esp
    ret

In the 64-bit ELF ABI, with floating-point return values in XMM0 (and argument passing in registers by default as well), that would just be

hyp:
    movss   b(%rip), %xmm1
    subss   %xmm0, %xmm1
    movss   a(%rip), %xmm0
    divss   %xmm1, %xmm0
    addss   c(%rip), %xmm0
    ret
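
The precision point above can be seen from C as well. This is a sketch under assumptions: `hyp_single` and `hyp_wide` are hypothetical names, and `double` is used as a stand-in for "wider than float" instead of the x87 80-bit format, to keep the example portable. The `volatile` locals force every intermediate to be rounded to float, even on targets that would otherwise keep it in a wider register.

```c
/* Every intermediate is stored to a volatile float, so each step is
   rounded to single precision, as the SSE scalar instructions do. */
float hyp_single(float x)
{
    volatile float denom = 2.0f - x;
    volatile float quot  = 1.0f / denom;
    volatile float sum   = quot + 6.0f;
    return sum;
}

/* Same formula with the intermediates carried in double, standing in
   for a wider-than-float evaluation such as the x87 stack. */
double hyp_wide(float x)
{
    return 1.0 / (2.0 - (double)x) + 6.0;
}
```

For x = 2.1f the two results agree to many digits but are not bit-for-bit equal, which is the kind of difference that matters to algorithms designed around a specific precision.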
