svm

Reputation: 161

Assembly code x86

I'm a total beginner at writing and reading assembly code.

I have this simple C function:

void saxpy()
{
  for(int i = 0; i < ARRAY_SIZE; i++) {
    float product = a*x[i];
    z[i] = product + y[i];
  }
}

Compiling it with

gcc -std=c99 -O3 -fno-tree-vectorize -S code.c -o code-O3.s

gives me the following assembly code:

saxpy:
.LFB0:
    .cfi_startproc
    movss   a(%rip), %xmm1
    xorl    %eax, %eax
    .p2align 4,,10
    .p2align 3
.L3:
    movss   x(%rax), %xmm0
    addq    $4, %rax
    mulss   %xmm1, %xmm0
    addss   y-4(%rax), %xmm0
    movss   %xmm0, z-4(%rax)
    cmpq    $262144, %rax
    jne     .L3
    rep ret
    .cfi_endproc

I do understand that loop unrolling has taken place, but I can't work out the intention and idea behind these instructions:

addq    $4, %rax
mulss   %xmm1, %xmm0
addss   y-4(%rax), %xmm0
movss   %xmm0, z-4(%rax)

Can someone explain the use of 4, and what the operand y-4(%rax) means?

Upvotes: 0

Views: 388

Answers (1)

Peter Cordes

Reputation: 365517

x, y, and z are global arrays. You left out the end of the listing where the symbols are declared.

I put your code on godbolt for you, with the necessary globals defined (and fixed the indenting). Look at the bottom.

BTW, there's no unrolling going on here. The loop body contains exactly one scalar single-precision multiply (mulss) and one scalar single-precision add (addss), i.e. one array element per iteration. Try with -funroll-loops to see it unroll.
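For contrast, here is a rough C-level picture of what unrolling by two would mean: two elements handled per trip through the loop, so the loop overhead (increment, compare, branch) is paid half as often. This is a hand-written sketch, not what gcc -funroll-loops actually emits. The factor 2, the name saxpy_unrolled2, the value of a, and ARRAY_SIZE 65536 (inferred from the cmpq $262144 bound, 262144 / 4 bytes per float) are assumptions.

```c
#define ARRAY_SIZE 65536   /* assumed: 262144 bytes / 4 bytes per float; must be even here */

float a = 2.0f;            /* assumed value, the question does not show it */
float x[ARRAY_SIZE], y[ARRAY_SIZE], z[ARRAY_SIZE];

/* Hypothetical hand-unrolled version: two multiplies and two adds per iteration. */
void saxpy_unrolled2(void)
{
    for (int i = 0; i < ARRAY_SIZE; i += 2) {
        z[i]     = a * x[i]     + y[i];
        z[i + 1] = a * x[i + 1] + y[i + 1];
    }
}
```

In the asm from the question there is only one mulss and one addss between .L3 and jne, which is how you can tell the loop was not unrolled.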

With -march=haswell, gcc will use an FMA instruction. If you un-cripple the compiler by leaving out -fno-tree-vectorize, and ARRAY_SIZE is #defined to something small, like 100, it fully unrolls the loop with mostly 32-byte FMA instructions on ymm registers, ending with some 16-byte FMA instructions on xmm registers.

Re: "Also, what is the need to add an immediate value 4 to the rax register", i.e. the statement "addq $4, %rax":

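The loop advances a byte offset in rax by 4 (the size of one float) each iteration, instead of counting elements and using a scaled-index addressing mode. x(%rax) means "the address of x plus rax bytes". Because the addq runs right after the load from x, rax is already 4 bytes ahead by the time y and z are accessed, so y-4(%rax) and z-4(%rax) subtract those 4 bytes again and address the same element that was just loaded from x.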
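To make that concrete, here is a sketch in C that mirrors the .L3 loop instruction by instruction, using a byte offset instead of an element index. The helper names load_at and store_at, the function name saxpy_byte_offset, and the value of a are made up for illustration. ARRAY_SIZE 65536 is inferred from the loop bound (262144 / 4), not taken from your source.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE 65536   /* assumed: cmpq $262144 / 4 bytes per float */

float a = 2.0f;            /* assumed value, the question does not show it */
float x[ARRAY_SIZE], y[ARRAY_SIZE], z[ARRAY_SIZE];

/* Hypothetical helper: read the float stored `offset` bytes past `base`. */
static float load_at(const float *base, size_t offset)
{
    float v;
    memcpy(&v, (const char *)base + offset, sizeof v);
    return v;
}

/* Hypothetical helper: write a float `offset` bytes past `base`. */
static void store_at(float *base, size_t offset, float v)
{
    memcpy((char *)base + offset, &v, sizeof v);
}

void saxpy_byte_offset(void)
{
    float xmm1 = a;                               /* movss a(%rip), %xmm1 */
    size_t rax = 0;                               /* xorl  %eax, %eax     */
    do {
        float xmm0 = load_at(x, rax);             /* movss x(%rax), %xmm0   */
        rax += 4;                                 /* addq  $4, %rax         */
        xmm0 *= xmm1;                             /* mulss %xmm1, %xmm0     */
        xmm0 += load_at(y, rax - 4);              /* addss y-4(%rax), %xmm0 */
        store_at(z, rax - 4, xmm0);               /* movss %xmm0, z-4(%rax) */
    } while (rax != ARRAY_SIZE * sizeof(float));  /* cmpq $262144, %rax; jne .L3 */
}
```

With a scaled-index addressing mode the counter would instead go up by 1 per iteration and the hardware would multiply it by 4 inside the address calculation, which is what x[i] looks like at the C level.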


Look at the links on https://stackoverflow.com/questions/tagged/x86. Also, single-stepping through code with a debugger is often a good way to make sure you understand what it's doing.

Upvotes: 1
