Reputation: 6790
At my university we were just introduced to IA32 SSE. I am trying to add two vectors. (They call such a vector a "packed value": it holds four 32-bit single-precision floating-point numbers, so one vector is 128 bits wide.) Here's what I want to achieve:
%xmm0 | 5.5 | 1.2 | 2.4 | 7.0 |
%xmm1 | 3.0 | 1.5 | 3.5 | 2.2 |
         |     |     |     |
         +     +     +     +
         |     |     |     |
         V     V     V     V
%xmm0 | 8.5 | 2.7 | 5.9 | 9.2 |
However, the slides only show the following code snippet, which I simply can't get to work:
# %eax and %ebx contain the addresses of the two vectors that are to be added
movups (%eax), %xmm0
movups (%ebx), %xmm1
addps %xmm1, %xmm0
movups %xmm0, result
This raises two questions:
1. How do I even create these vectors in the first place and how do I make %eax and %ebx point to them?
2. How do I print the result in order to check whether the operation was successful or not?
Here's what I tried. The following code compiles and does not crash when I run it. However, there's no output at all... :/
.data
x0: .float 7.0
x1: .float 2.4
x2: .float 1.2
x3: .float 5.5
y0: .float 2.2
y1: .float 3.5
y2: .float 1.5
y3: .float 3.0
result: .float 0
intout: .string "Result: %f.\n"
.text
.global main
main:
pushl x3
pushl x2
pushl x1
pushl x0
movl %esp, %eax
pushl y3
pushl y2
pushl y1
pushl y0
movl %esp, %ebx
movups (%eax), %xmm0
movups (%ebx), %xmm1
addps %xmm1, %xmm0
movups %xmm0, result
pushl result
pushl $intout
call printf
addl $40, %esp
movl $1, %eax
int $0x80
Upvotes: 3
Views: 950
Reputation: 126203
You seem to be confused about how to declare a label on multiple data items, and how to load a label into a register. A label is just an address -- a point in memory -- without any size or anything else associated with it. Things after the label are in consecutive addresses in memory. So you declare a label referring to a vector as:
x:
.float 7.0
.float 2.4
.float 1.2
.float 5.5
Now you can load that address into a register with a simple move, then use the register to load the vector:
movl $x, %eax
movups (%eax), %xmm0
Alternatively, you can load directly from the label:
movups x, %xmm0
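Putting this together for both vectors, a minimal sketch of the snippet from the slides might look like the following. The label names x, y and result are only illustrative, and the build line in the comment assumes a GCC toolchain with 32-bit support. Note that result has to reserve 16 bytes, because movups stores all four floats, not just one.

```asm
# Sketch only: x, y and result are illustrative label names.
# Assumed build line: gcc -m32 -no-pie add.s -o add
.data
x:
.float 7.0, 2.4, 1.2, 5.5
y:
.float 2.2, 3.5, 1.5, 3.0
result:
.float 0.0, 0.0, 0.0, 0.0   # 16 bytes, one slot per lane

.text
.globl main
main:
pushl %ebx                  # %ebx is callee-saved, so preserve it
movl $x, %eax               # %eax = address of the first vector
movl $y, %ebx               # %ebx = address of the second vector
movups (%eax), %xmm0        # load four floats from x
movups (%ebx), %xmm1        # load four floats from y
addps %xmm1, %xmm0          # %xmm0 = x + y, lane by lane
movups %xmm0, result        # store all four sums (16 bytes)
popl %ebx
xorl %eax, %eax             # return 0 from main
ret
```

This sketch produces no output on its own; it only leaves the four sums in result.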
Upvotes: 2
Reputation: 9377
The %f specifier for printf indicates a double argument, not a float argument. As such, you need to convert the single-precision floats in your result vector to doubles and move them to the stack. This is how I would do that:
.section ".rodata"
fmt: .string "%f %f %f %f\n"
.align 16
vec1:
.float 7.0
.float 2.4
.float 1.2
.float 5.5
vec2:
.float 2.2
.float 3.5
.float 1.5
.float 3.0
.data
.align 16
result:
.float 0.0
.float 0.0
.float 0.0
.float 0.0
.text
.globl main
main:
pushl %ebp # save the caller's frame pointer (callee-saved)
movl %esp, %ebp
andl $-16, %esp # align stack
movaps vec1, %xmm0
movaps vec2, %xmm1
addps %xmm1, %xmm0 # packed add of all four lanes
movaps %xmm0, result
subl $48, %esp # 36 bytes of arguments, padded to keep 16-byte alignment
movl $fmt, (%esp)
movss result, %xmm0
cvtss2sd %xmm0, %xmm0 # float -> double, as %f expects
movsd %xmm0, 4(%esp)
movss result+4, %xmm0
cvtss2sd %xmm0, %xmm0
movsd %xmm0, 12(%esp)
movss result+8, %xmm0
cvtss2sd %xmm0, %xmm0
movsd %xmm0, 20(%esp)
movss result+12, %xmm0
cvtss2sd %xmm0, %xmm0
movsd %xmm0, 28(%esp)
call printf
addl $48, %esp
xorl %eax, %eax # return 0
movl %ebp, %esp
popl %ebp # restore the caller's frame pointer
ret
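To isolate the point about %f, here is a reduced sketch that prints a single float. The labels fmt and val are illustrative names, and the build line in the comment assumes a GCC toolchain with 32-bit support. The float is widened with cvtss2sd and stored as 8 bytes right after the format pointer, which is the layout printf expects for a double.

```asm
# Sketch only: fmt and val are illustrative label names.
# Assumed build line: gcc -m32 -no-pie print.s -o print
.section .rodata
fmt: .string "%f\n"
val: .float 8.5

.text
.globl main
main:
pushl %ebp                  # save the caller's frame pointer
movl %esp, %ebp
andl $-16, %esp             # align stack
subl $16, %esp              # room for one pointer and one double
movl $fmt, (%esp)           # first argument: format string
movss val, %xmm0            # load the 32-bit float
cvtss2sd %xmm0, %xmm0       # widen it to a 64-bit double
movsd %xmm0, 4(%esp)        # second argument: the double
call printf
xorl %eax, %eax             # return 0 from main
movl %ebp, %esp
popl %ebp
ret
```

Pushing the raw 4-byte float instead would leave printf reading 8 bytes, half of which belong to whatever happens to sit next on the stack.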
Upvotes: 2