StephanieLoves

Reputation: 37

Different results after parallelizing a loop with OpenMP

#pragma omp parallel for
for (int i = 0; i < 500; i++)
{
    for (int j = i; j < 102342; j++)
    {
        Output[j] += staticConstant[i] * data[j-i];
    }
}

Compared with the serial version, some elements of the Output vector are the same and some are different. What could be the reason? At first I thought it was due to float precision, so I converted everything to double, but the differences remain. Typically 5-6 consecutive values are identical, larger blocks are very close but not equal, and a few values are quite far off.

Upvotes: 0

Views: 55

Answers (2)

Ken Y-N

Reputation: 15009

The problem is a write race condition on Output[j]: different values of i run on different threads, and they all update the same elements. For instance, the following two statements could happen in parallel:

Output[42] = Output[42] + staticConstant[9] * data[42-9];
Output[42] = Output[42] + staticConstant[19] * data[42-19];

Each of those statements boils down to the following, where O is Output and C[i] is shorthand for the product staticConstant[i] * data[42-i]:

Load O[42] to R1
Load C[i] to R2
Add R2 to R1
Store R1 to O[42]

However, once the loop is parallelised, the instructions from two threads can interleave like this:

Load O[42] to R1
Load O[42] to R3
Load C[9] to R2
Load C[19] to R4
Add R2 to R1
Add R4 to R3
Store R1 to O[42]
Store R3 to O[42]

As you can perhaps see, both Load O[42] lines read the old value before either C[9] or C[19] has been added. The second store then overwrites the first, so the C[9] contribution is lost.

The easiest fix is:

for (int i = 0; i < 500; i++)
{
#pragma omp parallel for
    for (int j = i; j < 102342; j++)
    {
        Output[j] += staticConstant[i] * data[j-i];
    }
}

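Now only the inner loop is parallelised, so for a given i each thread works on its own set of j values and no two threads write to the same Output[j]. There is no race condition.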
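If starting a parallel region 500 times turns out to be costly, another option is to swap the loop order so that each thread owns whole elements of Output. This is only a sketch, not something from the question: the function name accumulate_by_output and its parameters (coeff standing in for staticConstant, n_coeff for 500, n_out for 102342) are made up for illustration.

```c
/* Hypothetical helper: names and signature are illustrative only.
 * Computes the same sum as the original loops, but iterates over j on the
 * outside so that each output element is written by exactly one thread. */
void accumulate_by_output(double *output, const double *coeff,
                          const double *data, int n_coeff, int n_out)
{
    #pragma omp parallel for
    for (int j = 0; j < n_out; j++) {
        /* The original visits every pair with 0 <= i < n_coeff and i <= j. */
        int i_max = (j < n_coeff - 1) ? j : n_coeff - 1;
        double sum = 0.0;
        for (int i = 0; i <= i_max; i++)
            sum += coeff[i] * data[j - i];
        output[j] += sum;  /* only this iteration touches output[j] */
    }
}
```

Calling accumulate_by_output(Output, staticConstant, data, 500, 102342) should then give the same result on every run, since each Output[j] is accumulated in a fixed order. Compile with your compiler's OpenMP flag (for example -fopenmp with GCC); without it the pragma is ignored and the code simply runs serially.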

Upvotes: 1

1201ProgramAlarm

Reputation: 32732

You have multiple threads writing to Output[j]. This causes a race condition: a value written by one thread can be overwritten by another thread that read the old value, so one of the updates is lost.
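One way to keep the original loop order and still avoid the lost update is to make the update itself atomic. The following is a rough sketch under assumptions: the function name accumulate_atomic and its parameters are hypothetical (coeff stands in for staticConstant), and it has not been benchmarked against the other fixes.

```c
/* Hypothetical helper: names and signature are illustrative only.
 * Keeps the outer loop parallel, but protects each update of output[j]. */
void accumulate_atomic(double *output, const double *coeff,
                       const double *data, int n_coeff, int n_out)
{
    #pragma omp parallel for
    for (int i = 0; i < n_coeff; i++) {
        for (int j = i; j < n_out; j++) {
            /* The read-modify-write of output[j] is done atomically,
             * so no thread's contribution can be overwritten. */
            #pragma omp atomic
            output[j] += coeff[i] * data[j - i];
        }
    }
}
```

Two caveats: the atomic update adds overhead on every iteration, and the order in which threads add their terms can change from run to run. Since floating-point addition is not associative, the last few bits of the result may still differ between runs, even though no update is lost.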

Upvotes: 1
