Reputation: 5789
So, I have C++ code with this loop:
for(i=0;i<(m-1);i++) N4[i]=(i+m-1-Rigta[i]-1-N3[i])/N0;
All the quantities involved are ints. From GCC's vectorization report I get:
babar.cpp:233: note: ===== analyze_loop_nest =====
babar.cpp:233: note: === vect_analyze_loop_form ===
babar.cpp:233: note: === get_loop_niters ===
babar.cpp:233: note: not vectorized: number of iterations cannot be computed.
babar.cpp:233: note: bad loop form.
I'm wondering why 'the number of iterations cannot be computed'!? FWIW, m is declared as const int& m. What makes this even more puzzling is that just above in the same code I have:
for(i=1;i<(m-1);i++) a2[i]=(x[i]+x[i+m-1])*0.5f;
and the loop above gets vectorized just fine (here a2 and x are floats). I'm compiling with the
-Ofast -ftree-vectorizer-verbose=10 -mtune=native -march=native
flags on GCC 4.8.1 on an i7.
Thanks in advance,
Following @nodakai's idea, I tried this:
const int mm = m;
for(i=0;i<(mm-1);i++) N4[i]=(i+mm-1-Rigta[i]-1-N3[i])/N0;
This didn't quite get me there:
babar.cpp:234: note: not vectorized: relevant stmt not supported: D.55255_812 = D.55254_811 / N0_34;
babar.cpp:234: note: bad operation or unsupported loop bound.
So, of course, I tried:
const int mm=m;
const float G0=1.0f/(float)N0;
for(i=0;i<(mm-1);i++) N4[i]=(i+mm-1-Rigta[i]-1-N3[i])*G0;
which then produced:
babar.cpp:235: note: LOOP VECTORIZED.
(i.e. success). Oddly enough, the mm copy seems necessary(?!).
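For anyone comparing the two variants, here is a minimal sketch of both side by side. The function names fill_div and fill_recip and the pointer parameters are made up for illustration; only the loop bodies come from the code above. x86 SSE/AVX has no packed integer division instruction, which is consistent with the "relevant stmt not supported" note on the division. Be aware that multiplying by a float reciprocal is not guaranteed to match integer division for every input, so treat the second form as an approximation unless you have checked your value range.

```cpp
#include <cassert>

// Hypothetical wrapper: integer-division form with the bound copied to a local.
// The bound is loop-invariant here, but the per-element "/ N0" remains.
void fill_div(const int& m, const int* Rigta, const int* N3, int N0, int* N4) {
    assert(N0 != 0);
    const int mm = m;  // local copy of the referenced bound
    for (int i = 0; i < mm - 1; ++i)
        N4[i] = (i + mm - 1 - Rigta[i] - 1 - N3[i]) / N0;
}

// Hypothetical wrapper: reciprocal form. The division is replaced by a float
// multiply, and the result is truncated back to int on the store.
void fill_recip(const int& m, const int* Rigta, const int* N3, int N0, int* N4) {
    assert(N0 != 0);
    const int mm = m;
    const float G0 = 1.0f / static_cast<float>(N0);
    for (int i = 0; i < mm - 1; ++i)
        N4[i] = static_cast<int>((i + mm - 1 - Rigta[i] - 1 - N3[i]) * G0);
}
```

With N0 a power of two (so that 1.0f/N0 is exact) and small non-negative numerators, both forms agree; for other divisors it is worth comparing the two outputs on real data before switching.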
Upvotes: 6
Views: 2120
Reputation: 4490
Your loop's trip count is probably not divisible by the vectorization factor. Note that the loop that vectorizes runs one iteration fewer than the one that does not. As a simple test to see whether this is the case, you can change the starting point of your non-vectorized loop to 1 and handle the 0 case before the loop, like:
N4[0] = (m - 1 - Rigta[0] - 1 - N3[0]) / N0;
for(i=1; i<(m-1); i++) {
    N4[i] = (i + m - 1 - Rigta[i] - 1 - N3[i]) / N0;
}
Upvotes: 1
Reputation: 8001
Can you try these two steps and see if there's any difference?
1. Add const int mm = m; just before the loop.
2. Replace m with mm in the loop.
Upvotes: 3
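One plausible reason the local copy helps, offered as an assumption rather than a confirmed diagnosis of this GCC version: m is a reference, so the compiler may have to assume that a store through an int pointer could change the value it refers to, which would make the loop bound non-invariant. The sketch below shows that such aliasing is legal and really does change the trip count. The function names and the 16-element buffer are hypothetical and not from the question.

```cpp
#include <cassert>

// Re-reads the bound through the reference on every iteration.
int iterations_with_reference(const int& m, int* out) {
    assert(out != nullptr);
    int n = 0;
    for (int i = 0; i < m - 1; ++i) {
        out[i] = 0;  // may overwrite the int that m refers to
        ++n;
    }
    return n;
}

// Reads the bound once into a local, so later stores cannot affect it.
int iterations_with_copy(const int& m, int* out) {
    assert(out != nullptr);
    const int mm = m;
    int n = 0;
    for (int i = 0; i < mm - 1; ++i) {
        out[i] = 0;
        ++n;
    }
    return n;
}

// Passes buf[2] as the bound while writing into buf itself.
int demo_reference() {
    int buf[16];
    for (int& v : buf) v = 10;
    return iterations_with_reference(buf[2], buf);
}

int demo_copy() {
    int buf[16];
    for (int& v : buf) v = 10;
    return iterations_with_copy(buf[2], buf);
}
```

Here demo_reference() stops after 3 iterations because the store to out[2] zeroes the bound, while demo_copy() runs the full 9. With the copy, the trip count is fixed before the loop starts, which is the kind of property a vectorizer needs to know up front.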