Reputation: 1959
I want to vectorize the following loop in C:
for(k = 0; k < SysData->numOfClaGen; k++)
    A[k] = B[k] * cos(x1[2 * k] - x1[ind0 + k]);
where there is no aliasing between the variables and ind0 is a constant. None of the other pointers (A or B) point to ind0, and therefore ind0 remains constant throughout the loop.
When I compile the code with icc, it says that this loop cannot be vectorized due to possible vector dependence. Here is the message:
loop was not vectorized: existence of vector dependence.
I narrowed the problem down and found that replacing ind0 with a constant number solves the problem. So I assume that icc thinks A may point to ind0 and therefore ind0 may change.
I would like to know how I can tell the compiler that it is safe to vectorize such a loop.
Thanks in advance for your help.
Upvotes: 0
Views: 1781
Reputation: 620
icc was changed a year ago to set -ansi-alias as a default on Linux and Mac. On Windows, this default can't be counted on, as it conflicts with Microsoft usage. This option is equivalent to gcc's -fstrict-aliasing, which has been a default since gcc 3.0. I think it's much better to set this option than to use ivdep, restrict, or simd for such a limited issue.
Although it's not well documented, icc treats __restrict the same way gcc does and doesn't require the -restrict or C99 option to accept it. In principle, it should be needed only for the objects being modified (A[] in the example above). Strangely, __restrict has a slightly different meaning for MSVC++: it permits non-vector optimizations that might otherwise be prevented by possible dependencies, but doesn't enable vectorization (though it might apply to the present case).
Upvotes: 0
Reputation: 26205
Using the restrict modifier on pointers asserts to the compiler that there is no aliasing. This keyword was introduced in C99. C++ does not support it, but many C++ compilers support __restrict as an equivalent proprietary extension. With the Intel compiler, one has to enable use of restrict by adding the command-line flag -restrict (Linux) or /Qrestrict (Windows). In the following version of your code, the loop is vectorized as desired when using Intel compiler version 13.1.3.198:
#include <math.h>

struct bar {
    int numOfClaGen;
};

void foo (double * restrict A,
          const double * restrict B,
          const double * restrict x1,
          const struct bar * restrict SysData,
          const int ind0)
{
    int k;
    for (k = 0; k < SysData->numOfClaGen; k++) {
        A[k] = B[k] * cos(x1[2 * k] - x1[ind0 + k]);
    }
}
Invoking the compiler as follows (on a 64-bit Windows system)
icl /c /Ox /QxHost /Qrestrict /Qvec-report2 vectorize.c
the compiler reported
vectorize.c(14): (col. 5) remark: LOOP WAS VECTORIZED.
Upvotes: 1
Reputation: 9779
Add #pragma ivdep in front of the for loop; it instructs the compiler to ignore assumed vector dependencies.
#pragma ivdep
for(k = 0; k < SysData->numOfClaGen; k++)
    A[k] = B[k] * cos(x1[2 * k] - x1[ind0 + k]);
For more information about ivdep, see the icc documentation.
Upvotes: 2