Reputation: 21
I'm trying to parallelise the FEM1D code that can be found here. The part that's relevant is:
for ( i = 1; i < nu - 1; i++ )
{
adiag[i] = adiag[i] - aleft[i] * arite[i-1];
arite[i] = arite[i] / adiag[i];
}
Simply adding
#pragma omp parallel for
before the loop does not work, and I'm not sure why. I assume it's because the other threads need to update the arrays, but since i is private, the threads shouldn't need to update anything required by another thread?
I've tried introducing new variables and making them private, but I'm pretty sure the problem is to do with updating the adiag and arite arrays, so I tried the flush directive, which specifies that all threads have the same view of memory for all shared objects, but again no dice.
#pragma omp parallel for private(i,ad,al,ar)
for ( i = 1; i < nu - 1; i++ )
{
#pragma omp flush(adiag, arite, aleft)
ad = adiag[i];
al = aleft[i];
ar = arite[i-1];
adiag[i] = ad - al * ar;
ar = arite[i];
ad = adiag[i];
arite[i] = ar / ad;
}
So I'm pretty stuck here; any advice to help me along would be much appreciated.
EDIT: by "does not work" I mean that the arrays adiag and arite are incorrectly filled in after the loop has completed.
EDIT2: I've gotten the loop to work with
#pragma omp parallel for ordered
for ( i = 1; i < nu - 1; i++ )
{
/* ordered applies to one structured block, so both statements need braces */
#pragma omp ordered
{
adiag[i] = adiag[i] - aleft[i] * arite[i-1];
arite[i] = arite[i] / adiag[i];
}
}
but I believe it kind of defeats the purpose of parallelising the loop in the first place, since the ordered region makes the iterations run one after another.
Upvotes: 2
Views: 84
Reputation: 305
I don't think you can convert this loop. You have a loop-carried (cyclical) data dependency: iteration i reads arite[i-1], which is only written by iteration i-1. Usually, when you have a loop that can be converted, you swap the order of the statements and that solves the problem, like so:
for ( i = 1; i < nu - 1; i++ )
{
arite[i-1] = arite[i-1] / adiag[i-1];
adiag[i] = adiag[i] - aleft[i] * arite[i-1];
}
Although, when you do that, you still need to use a value computed by another thread. I could be wrong though...
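To make the dependency concrete, here is a minimal sketch. It is my own illustration and not part of FEM1D: the function names sweep_forward and sweep_reversed are made up for this example. The second function runs the same loop body with the iterations visited last to first, which stands in for a thread executing a later iteration before an earlier one has finished.

```c
#include <assert.h>

/* The loop from the question, with iterations in their original order. */
void sweep_forward(int nu, double *adiag, const double *aleft, double *arite)
{
    assert(nu >= 2);
    for (int i = 1; i < nu - 1; i++)
    {
        /* reads arite[i-1], which iteration i-1 has just divided */
        adiag[i] = adiag[i] - aleft[i] * arite[i - 1];
        arite[i] = arite[i] / adiag[i];
    }
}

/* Same loop body, but iterations run from last to first. This is a
   hypothetical stand-in for "a later iteration ran too early". */
void sweep_reversed(int nu, double *adiag, const double *aleft, double *arite)
{
    assert(nu >= 2);
    for (int i = nu - 2; i >= 1; i--)
    {
        /* arite[i-1] has NOT been divided yet when it is read here */
        adiag[i] = adiag[i] - aleft[i] * arite[i - 1];
        arite[i] = arite[i] / adiag[i];
    }
}
```

With nu = 4, adiag = {2, 2, 2, 2}, aleft = {0, 1, 1, 1} and arite = {0.5, 1, 1, 0} (made-up test data), the forward sweep gives adiag[2] = 4/3, while the reversed sweep gives adiag[2] = 1 because it used arite[1] before it had been divided. Any schedule that lets iteration i start before iteration i-1 has finished can produce this kind of wrong result, and neither private variables nor flush changes that.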
Upvotes: 1