arsenm

Reputation: 2933

Ensure compiler always uses SSE sqrt instruction

I'm trying to get GCC (or clang) to consistently use the SSE instruction for sqrt instead of the math library function in a computationally intensive scientific application. I've tried a variety of GCC versions on various 32- and 64-bit OS X and Linux systems. I'm making sure to enable SSE with -mfpmath=sse (and -march=core2 to satisfy GCC's requirement for using -mfpmath=sse on 32-bit). I'm also using -O3. Depending on the GCC or clang version, the generated assembly doesn't consistently use SSE's sqrtss. In some versions of GCC, all the sqrts use the instruction. In others, there is a mix of sqrtss and calls to the math library function. Is there a way to give a hint or force the compiler to only use the SSE instruction?

Upvotes: 4

Views: 2031

Answers (2)

Jens Gustedt

Reputation: 78963

You should be careful in using that; you probably know that it has less precision. That will be the reason that gcc doesn't use it systematically.

There is a trick that is even mentioned in Intel's SSE manual (I hope that I remember correctly). The result of sqrtss is only one Heron iteration away from the target. Maybe gcc is able to inline that brief surrounding iteration in some versions and not in others.
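
For readers unfamiliar with the term, one Heron (Newton) step that refines an estimate y of sqrt(x) can be sketched as below. The function name heron_step is hypothetical, and this only illustrates the iteration itself, not what GCC actually emits around sqrtss.

```c
/* One Heron (Newton) refinement step for the square root:
 * given an estimate y of sqrt(x), return the improved estimate
 * (y + x / y) / 2. The name heron_step is made up for illustration. */
static inline float heron_step(float x, float y)
{
    return 0.5f * (y + x / y);
}
```

Starting from a rough estimate, each step roughly doubles the number of correct digits, which is why a single step can be enough after a good initial approximation.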

You could use the builtin as MSN says, but you should definitely look up the specs on Intel's web site to know what you are trading.

Upvotes: 0

MSN

Reputation: 54634

Use the sqrtss intrinsic __builtin_ia32_sqrtss?
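
A minimal sketch of that idea, assuming an x86 target with SSE enabled. Instead of calling the GCC builtin directly, it goes through _mm_sqrt_ss from <xmmintrin.h>, the portable intrinsic that GCC and clang both provide for the same instruction. The wrapper name sse_sqrtf is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_set_ss, _mm_sqrt_ss, _mm_cvtss_f32 */

/* Scalar single-precision square root through the sqrtss intrinsic.
 * sse_sqrtf is a made-up wrapper name, not a library function. */
static inline float sse_sqrtf(float x)
{
    __m128 v = _mm_set_ss(x);   /* place x in the low lane, zero the others */
    v = _mm_sqrt_ss(v);         /* square root of the low lane only */
    return _mm_cvtss_f32(v);    /* extract the low lane as a float */
}
```

Replacing calls to sqrtf with a wrapper like this should keep the compiler from falling back to the library call, at the cost of tying the code to x86.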

Upvotes: 4
