Mr.Bloom

Reputation: 365

How do you vectorise a loop?

I'm having trouble vectorising a loop. I'd like to rewrite the code below so that it can be vectorised. I have run the complete Banerjee test and found that all dependencies are broken, but I don't know where to go from here. The compiler is gcc, the architecture is x86, and the arrays are integer arrays.

for (int i = 0; i < 100; i++) { 
     x[20 + i] = y[i] * z[i];
     p[i] = x[21 + i] + q[i];
}

Upvotes: 6

Views: 584

Answers (1)

Nate Eldredge

Reputation: 58052

Two general tips:

  • Pass the arrays as parameters to your function, using the restrict keyword to inform the compiler that they cannot alias one another (possible aliasing would otherwise prevent any vectorization).

  • Although the read from x on the second line of your loop does not depend on the write on the first line (x[21 + i] is not overwritten until the next iteration), the compiler may not be smart enough to detect that. Help it out by interchanging those two lines, or by moving the read to its own loop before the write.

The following version is successfully vectorized by gcc 10.2 with -O3 -march=skylake (try on Godbolt), using ymm registers to process 8 ints per iteration. It also unrolls the loop completely since the iteration count is constant and not too large.

void foo(
            int *restrict x,
            const int *restrict y,
            const int *restrict z,
            int *restrict p,
            const int *restrict q
        ) {
    for (int i = 0; i < 100; i++) { 
        p[i] = x[21 + i] + q[i];
        x[20 + i] = y[i] * z[i];
    }
}
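
The other option mentioned in the tips, moving the read into its own loop before the write, might look like the sketch below. The name foo_split is only illustrative, and I have not checked what gcc generates for this variant, so treat it as a starting point rather than a verified result. It assumes, like the version above, that the arrays do not overlap and that x has at least 121 elements.

```c
/* Sketch of the "separate loops" variant; foo_split is a hypothetical name. */
void foo_split(
            int *restrict x,
            const int *restrict y,
            const int *restrict z,
            int *restrict p,
            const int *restrict q
        ) {
    /* First loop: do every read of x while it still holds its old values. */
    for (int i = 0; i < 100; i++) {
        p[i] = x[21 + i] + q[i];
    }
    /* Second loop: only now overwrite x[20..119]. */
    for (int i = 0; i < 100; i++) {
        x[20 + i] = y[i] * z[i];
    }
}
```

This gives the same results as the original loop because, in the original, x[21 + i] is read in iteration i and only overwritten in iteration i + 1, so every read sees the old contents of x either way.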

Upvotes: 3
