Reputation: 101
A little curiosity I found; GCC seems to generate the following code when I have a lot of optimization flags on:
00000000004019ae: test %si,%si
00000000004019b1: movups %xmm0,%xmm0
00000000004019b4: je 0x401f40 <main(int, char**)+1904>
Question: what purpose does the second instruction serve? It doesn't look like it /does/ anything; so, is it some optimization to align the program in the instruction cache? Or is it something with out-of-order execution? (I'm compiling with -mtune=native on a Nehalem if that helps :D).
Nothing urgent, of course, just curious.
Upvotes: 4
Views: 491
Reputation: 4650
Adding to the hypothesis proposed by Evgeny Kluev, other possibilities (in no particular order) are that (a) it's a compiler optimiser bug, (b) movups is inserted to break a dependency or (c) it is inserted for the purpose of code alignment.
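For (c), the addresses in the question allow a rough sanity check. The sketch below only does the arithmetic; the 16-byte boundary is my assumption (a common alignment granularity), and the names are made up for illustration.

```cpp
#include <cstdint>

// Addresses copied from the listing in the question.
constexpr std::uint64_t kMovupsAddr = 0x4019b1;
constexpr std::uint64_t kJeAddr     = 0x4019b4;

// Assumption: 16 bytes is the boundary of interest.
constexpr std::uint64_t kBlock = 16;

// Offset of an address within its (assumed) 16-byte block.
constexpr std::uint64_t offset_in_block(std::uint64_t addr) {
    return addr % kBlock;
}

// Length of the movups encoding, taken from the address difference.
constexpr std::uint64_t kMovupsLen = kJeAddr - kMovupsAddr;

static_assert(kMovupsLen == 3, "movups occupies 3 bytes in the listing");
// With movups present, je starts 4 bytes into its block.
static_assert(offset_in_block(kJeAddr) == 4, "je offset with movups");
// Without movups, je would start 1 byte into the same block.
static_assert(offset_in_block(kJeAddr - kMovupsLen) == 1, "je offset without movups");
```

So the je does not land on a 16-byte boundary either way. That does not rule out alignment of something further down the code, but it suggests the je itself is not what is being aligned.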
Upvotes: 2
Reputation: 24647
Possibly xmm0 contains the result of some calculation done in the integer domain (with an integer SSE instruction), and the next instruction using xmm0 is expected to be in the floating-point domain (a floating-point SSE instruction).
Nehalem may perform this next instruction faster if xmm0 is migrated to the floating-point domain with an instruction like movaps or movups. And it may be beneficial to perform this migration before the conditional jump instruction: in that case the migration is done only once. If no movups instruction is used, the migration may be done twice (automatically, by the first FP instruction on this register): first speculatively, on the mispredicted branch, and a second time on the correct branch.
It seems the compiler decided that it is better to optimize calculation dependency chains than to optimize for code size and execution resources.
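To make the scenario concrete, here is a minimal sketch of the kind of source that produces a value with an integer SSE instruction and then consumes it with a floating-point one. This is not the asker's code: int_then_fp is a hypothetical helper, and I'm assuming an x86 target with SSE2 (there is a plain fallback so it still builds elsewhere). The compiler is free to pick different instructions than the comments suggest.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: add two integer vectors, reinterpret the bits as
// floats, then do a floating-point add on the result.
inline void int_then_fp(const std::int32_t a[4], const std::int32_t b[4],
                        const float c[4], float out[4]) {
    __m128i ai  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i bi  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    __m128i sum = _mm_add_epi32(ai, bi);   // integer SSE add (typically paddd)
    __m128  f   = _mm_castsi128_ps(sum);   // bit reinterpretation only, no instruction
    __m128  res = _mm_add_ps(f, _mm_loadu_ps(c));  // FP SSE add (typically addps)
    _mm_storeu_ps(out, res);
}
#else
// Portable fallback with the same observable behaviour, for non-x86 builds.
inline void int_then_fp(const std::int32_t a[4], const std::int32_t b[4],
                        const float c[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        std::uint32_t s = static_cast<std::uint32_t>(a[i]) +
                          static_cast<std::uint32_t>(b[i]);
        float f;
        std::memcpy(&f, &s, sizeof f);     // reinterpret the bits as a float
        out[i] = f + c[i];
    }
}
#endif
```

The register holding sum is written by an integer instruction and read by a floating-point one, which is the situation where a movaps/movups on that register could act as the explicit migration described above.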
Upvotes: 6