dlkulp

Reputation: 2264

NASM x87 FILD instruction

I was reading this and I noticed that their example shows fld loading values into different locations (st0, st1, then back to st0) without specifying where to load to. I'm assuming fild works in a similar fashion, as it's just the means of loading an integer (or that's my understanding anyway), but I could be wrong. So my question is this: where do fld, and more specifically fild, load values to? Is there a parameter to specify which FPU register to use, or does it just loop through the 8, or is there some totally different way that I'm missing?

The code I'm specifically working with is trying to multiply 3 numbers together. I'm assuming that one way would be to load into st0, then load into st1, then load into st2, then fmul st0 and st1 (result in st0), then fmul st0 and st2. The code is as follows:

mov dword [ebp-8], 4
mov dword [ebp-12], ecx
fild dword [ebp-4]
fild dword [ebp-8]
fild dword [ebp-12]
fmul st0, st1
fmul st0, st2
fistp dword [ebp-8]
mov eax, dword [ebp-8]

ecx = 5 and [ebp-4] = 5

This code crashes. Using OllyDbg, I see that there was an access violation at 00000069, but that address is not currently contained in any of the registers.

So yeah, is there a way to specify where fild loads values? Is there a nice way of figuring out where they should go? If I'm running this in a loop, does that change anything?

--EDIT 3-- Mostly fixed, sorta. One big thing is that fmul DOESN'T push the value onto st0, it just overwrites whatever is in st0. New code:

mov dword [ebp-8], ecx
fild dword [ebp-4]
fild dword [four]
fild dword [ebp-8]
fmul st1
fmul st2
fistp dword [ebp-12]
mov eax, dword [ebp-12]

This loops and decrements until ecx == 2, and then trying to fild 2 and 1 gives the same bad -NAN FFFF C0000000 00000000 as earlier. I'm not sure how 3 is different from 2 or 1 (other than being smaller), but that's when it starts to give the bad values. I should note that ERROR_MOD_NOT_FOUND was thrown, though I'm not really sure what that means because all of the CPU and FPU registers should be accessible.

--EDIT 2-- Fixed pop stuff as Parham Alvani showed with documentation stuff:

mov dword [ebp-8], ecx      ; moves ecx (starts as 5) into local var (fild can't take a cpu register)
fild dword [ebp-4]          ; starts as 5, moves down with outer loop
fild dword [four]           ; the integer 4
fild dword [ebp-8]          ; starts as 5, moves down with inner loop
fmul st0, st1               ; 0 := 0 * i
fmul st0, st3
fistp dword [ebp-12]        ; move st0 to local var
mov eax, dword [ebp-12]     ; move local var to eax

This pushes 5 and then pushes the bad value -NAN FFFF C0000000 00000000 twice. fmul doesn't seem to be doing anything (possibly because of the bad values). Is there a better way to load values? It seems like I'm doing something wrong with fild, but according to the examples provided in the first link, and as defined here, fild just pushes onto st0 whatever you give it.

--EDIT 1-- As Jester suggested, I'm now popping off the fpu stack every loop:

mov dword [ebp-8], 4
mov dword [ebp-12], ecx
fild dword [ebp-4]
fild dword [ebp-8]
fild dword [ebp-12]
fmul st0, st1
fmul st0, st2
fstp st2
fstp st1
fistp dword [ebp-8]
mov eax, dword [ebp-8]

This code still crashes. Access violation at 00000009, st0-4 are 0's, st5 = 100, st6 = 4, st7 = 4

Upvotes: 0

Views: 3650

Answers (1)

George Koehler

Reputation: 1703

Your original code is correct. fld and fild push the loaded value onto the x87 stack. This push always puts the value in st0, moves the old value of st0 to st1, the old st1 to st2, and so on.

fild dword [ebp-4]   ; st0 = x
fild dword [ebp-8]   ; st1 = x, st0 = y
fild dword [ebp-12]  ; st2 = x, st1 = y, st0 = z
fmul st0, st1        ; st2 = x, st1 = y, st0 = z * y
fmul st0, st2        ; st2 = x, st1 = y, st0 = z * y * x
fistp dword [ebp-8]  ; st1 = x, st0 = y
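
To double-check the register comments above, here is a rough Python model of the push/pop behaviour. It is only a sketch: the X87Stack class and its method names are made up for illustration, it ignores the 8-register limit, and it keeps values as Python floats rather than 80-bit extended precision. Python's round() rounds halves to even, which happens to match the x87 default rounding mode used by fistp.

```python
class X87Stack:
    """Toy model of the x87 register stack (hypothetical helper, not a real API)."""

    def __init__(self):
        # regs[0] plays the role of st0, regs[1] of st1, and so on.
        self.regs = []

    def fild(self, value):
        # Push: the new value becomes st0, the old st0 becomes st1, etc.
        self.regs.insert(0, float(value))

    def fmul(self, i):
        # fmul st0, st(i): overwrite st0 with st0 * st(i); no push, no pop.
        self.regs[0] *= self.regs[i]

    def fistp(self):
        # Store st0 as an integer and pop it; the old st1 becomes st0.
        return round(self.regs.pop(0))


x, y, z = 5, 4, 5          # [ebp-4], [ebp-8], [ebp-12] in the question
fpu = X87Stack()
fpu.fild(x)                # st0 = x
fpu.fild(y)                # st1 = x, st0 = y
fpu.fild(z)                # st2 = x, st1 = y, st0 = z
fpu.fmul(1)                # st0 = z * y
fpu.fmul(2)                # st0 = z * y * x
result = fpu.fistp()       # pops the product; st1 = x, st0 = y remain
print(result, fpu.regs)
```

Note that two values (y and x) are still on the stack after fistp, which matters if the sequence is repeated.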

Your code might crash because ebp points to a bad place, or because there is a mistake in another part of your code, not the part that you posted. You don't say which instruction crashed. At the moment of the crash, the program counter (pc) points to the crashing instruction.

I put your code in a short program, and ran it successfully in gdb on my OpenBSD/amd64 machine.

section .data
    dd 0
    dd 0
    dd 5
space:

section .text
global main
main:
    mov ebp, space
    mov ecx, 5
    mov dword [ebp-8], 4
    mov dword [ebp-12], ecx
    fild dword [ebp-4]
    fild dword [ebp-8]
    fild dword [ebp-12]
    fmul st0, st1
    fmul st0, st2
    fistp dword [ebp-8]
    mov eax, dword [ebp-8]
    int 3

To assemble and run:

$ nasm -felf64 fmul3.s && gcc -nopie -o fmul3 fmul3.o 
$ gdb fmul3
...
(gdb) run
...
Program received signal SIGTRAP, Trace/breakpoint trap.
...
(gdb) x/3wd (char *)&space - 12
0x601000 <__data_start>:        5       100     5
(gdb) print (int)$rax
$1 = 100

The program successfully multiplied 5 * 4 * 5 = 100.
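
One thing this single-pass test does not exercise is the loop mentioned in the question's edits. Each pass does three fild pushes but only one pop (fistp), so two values are left behind every time, and the x87 stack only has 8 registers. When a load hits a full stack with the invalid-operation exception masked, the FPU writes the "indefinite" quiet NaN instead of the value, which is what a debugger shows as -NAN FFFF C0000000 00000000. The sketch below just counts stack depth to show when that would first happen. The function name and parameters are hypothetical, and it assumes the stack starts empty and that nothing else (such as an outer loop) pushes or pops.

```python
def first_overflowing_iteration(pushes, pops, slots=8, max_iterations=1000):
    """Return the 1-based loop iteration in which a push first finds the
    stack full, or None if that never happens within max_iterations."""
    depth = 0
    for iteration in range(1, max_iterations + 1):
        for _ in range(pushes):
            if depth == slots:
                return iteration   # this load would produce the indefinite NaN
            depth += 1
        depth -= pops              # e.g. the single fistp at the end of the pass
    return None


leaky = first_overflowing_iteration(pushes=3, pops=1)     # code from EDIT 3
balanced = first_overflowing_iteration(pushes=3, pops=3)  # every push is popped
print(leaky, balanced)
```

Under these assumptions the unbalanced version overflows in the fourth pass, which with ecx counting down from 5 is the pass where ecx == 2. Popping the two leftover values each pass (for example with fstp st0 twice after the fistp) keeps the depth constant.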

Upvotes: 1
