Reputation: 1110
Given the following program:
#include "emmintrin.h"

int main(int argc, char *argv[])
{
    volatile __m128i x = _mm_set_epi64x(1, 0);
    return 0;
}
I can get the assembly using clang -O -S test.c (only listing the interesting part):
...
movl $1, %eax
movd %rax, %xmm0
pslldq $8, %xmm0 # xmm0 = zero,zero,zero,zero,zero,zero,zero,zero,xmm0[0,1,2,3,4,5,6,7]
...
According to the manual of _mm_set_epi64x, %xmm0 should be [0, 1, 0, 0], with each element being a 32-bit integer. However, according to the comment, %xmm0 holds [0, 0, 0, 1]. I don't think endianness is relevant here, since I am only looking at a register.
I suspect that it's something related to the notation used in clang's assembly comments, but I can't find any useful information about it on the internet.
== Edit:
Filed a bug against clang.
Upvotes: 1
Views: 172
Reputation: 365332
The comment appears to be describing the operation of pslldq in terms of the previous contents of xmm0 (even though those are known at compile time).
It seems to be in reverse order from the usual high-element-first notation ([ 3 2 1 0 ]) that _mm_set uses, which is the notation that makes "left" shifts make sense.
It's the byte-order you'd get in memory if you stored the vector.
I forget if that's typical for clang, and I don't have time right now to check another example.
Upvotes: 1
Reputation: 93117
The clang code loads the value in two steps. First, the value 1 is loaded into the low 64 bits of the register. Then the entire register is shifted left by 8 bytes (64 bits), because pslldq counts its shift in bytes, not bits, so the value 1 ends up in the high 64 bits just as your code specifies.
Upvotes: 1