Reputation: 467
I am writing a program in C (a 2D Poisson solver) and I am using OpenMP to speed up a big for loop. What I observed is that inside an OpenMP parallel block, the for loop is not vectorized, even when I include the #pragma always vector directive. For compilation I am using the PathScale compiler.
The code I want to vectorize looks like this:
#pragma omp parallel shared(in, out, lambda, dim, C) private(k)
{
#pragma omp for schedule(guided, dim/nthreads) nowait
    for (k = 0; k < dim; k++) {
        in[k] = C * out[k] * lambda[k];
    }
}
where out, lambda, and in are double-precision arrays.
But even when I include #pragma always vector, the compiler answers:
warning: ignoring #pragma always vector
Do you know if there is any workaround for this?
Thanks.
Upvotes: 0
Views: 1986
Reputation: 12784
I looked through the User Guide for the PathScale compiler and found neither #pragma always
nor #pragma vector
. So I think the compiler is simply telling you that it does not recognize this pragma, and ignores it.
However, in section 7.4.5 I found the following options that should help you with vectorization:
Vectorization of user code ... is controlled by the flag
-LNO:simd[=(0|1|2)]
, which enables or disables inner loop vectorization. 0 turns off the vectorizer, 1 (the default) causes the compiler to vectorize only if it can determine that there is no undesirable performance impact due to sub-optimal alignment, and 2 will vectorize without any constraints (this is the most aggressive).
-LNO:simd_verbose=ON
prints vectorizer information (from vectorizing user code) to stdout.
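Assuming the PathScale C driver is invoked as pathcc (the driver name and OpenMP flag may differ on your installation), the quoted options would be passed on the compile line roughly like this:

```shell
# Hypothetical compile line: pathcc is assumed to be the PathScale C
# driver and -mp its OpenMP switch; the -LNO flags are the ones from
# the User Guide section quoted above (simd=2 is the most aggressive
# setting, simd_verbose=ON reports what the vectorizer did).
pathcc -O3 -mp -LNO:simd=2 -LNO:simd_verbose=ON -o solver solver.c
```

The -LNO:simd_verbose=ON output should tell you whether the loop was vectorized and, if not, why.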
As a side note (guessing where you may have taken that #pragma always vector
from), Intel's compiler has #pragma vector
, with always
being one possible parameter to the pragma. But pragmas are generally compiler-specific, except for a few extensions (OpenMP being one) that are supported by multiple vendors.
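For comparison, here is a minimal sketch of how that Intel-specific pragma is written (icc honors it; other compilers, like yours, will just warn about the unknown pragma and ignore it). The function name scale and the restrict qualifiers are my additions for illustration:

```c
#include <stddef.h>

/* Intel-specific: "#pragma vector always" asks icc to vectorize the
   loop that follows even when its heuristics judge vectorization
   unprofitable. The restrict qualifiers assert that the arrays do not
   overlap, which compilers generally need to prove before they can
   vectorize a loop like this at all. */
void scale(double *restrict in, const double *restrict out,
           const double *restrict lambda, double C, size_t dim)
{
#pragma vector always
    for (size_t k = 0; k < dim; k++)
        in[k] = C * out[k] * lambda[k];
}
```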
Upvotes: 3