Reputation: 541
I have a C++ program which is compiled with gcc (version 4.5.1) using the -O3 flag. I'm wondering whether it would be worthwhile to write an SSE2 version of this program (or at least of its busiest parts). However, I'm worried that the compiler has already done this through automatic vectorization.
Question: How do I determine (a) whether or not my program is using SSE/SSE2 and (b) how much time is spent using SSE/SSE2 (i.e. profiling)?
Upvotes: 2
Views: 321
Reputation: 471519
The easiest way to tell if you are gaining any benefit from compiler vectorization is to compile the code with and without the -ftree-vectorize flag and compare the run times. -O3 enables that option automatically, so you might want to try it under -O2 instead.
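As a rough sketch of that comparison, the snippet below times a simple loop that the vectorizer is likely to handle. The names saxpy and time_saxpy_seconds and the file name bench.cpp are made up for illustration; substitute your own hot loop.

```cpp
#include <cstddef>
#include <ctime>
#include <vector>

// Hypothetical kernel: y[i] += a * x[i].
// Assumes y has at least as many elements as x.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}

// Runs the kernel `reps` times and returns the CPU time used, in seconds.
// Build the same source twice and compare the reported times, e.g.:
//   g++ -O2 bench.cpp -o bench_scalar
//   g++ -O2 -ftree-vectorize bench.cpp -o bench_vec
// (on 32-bit x86 you may also need -msse2; x86-64 has SSE2 by default)
double time_saxpy_seconds(float a, const std::vector<float>& x,
                          std::vector<float>& y, int reps) {
    const std::clock_t start = std::clock();
    for (int r = 0; r < reps; ++r) {
        saxpy(a, x, y);
    }
    const std::clock_t stop = std::clock();
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Call time_saxpy_seconds from main with arrays large enough and enough repetitions that the run takes a measurable amount of time, and print the result. If the two builds report roughly the same time, the vectorizer is probably not helping that loop much.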
To see which loops were vectorized, which were not, and why, you can add the -ftree-vectorizer-verbose=n option, where a higher n gives more detail.
The last option, of course, is to look at the assembly. It's very easy to identify vectorized code in assembly.
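For instance, here is a minimal sketch of a function you could inspect this way. The function name add_arrays and the file name add.cpp are only illustrative.

```cpp
#include <cstddef>

// Hypothetical example. Generate assembly instead of an object file with:
//   g++ -O3 -S add.cpp -o add.s
// (on 32-bit x86 you may also need -msse2; x86-64 has SSE2 by default)
//
// In add.s, packed instructions such as addps (or addpd for doubles)
// indicate a vectorized loop. Scalar instructions such as addss / addsd
// mean the loop processes one element at a time.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Note that on x86-64 even scalar floating-point code uses the xmm registers, so seeing xmm in the output is not enough by itself. Look for the packed (ps/pd) forms of the instructions.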
Upvotes: 1