Walter Fabio Simoni

Reputation: 5729

Synchronization with OpenMP, For directive

I would like to write a small sample program to test the OpenMP API. I have written a three-level nested for loop with a calculation inside it.

The problem is that my result is wrong.

Here is my code :

long value = 0;
#pragma omp parallel
{
#pragma omp for
for (int i=0;i<=9999;i++)
{
    value += (M_PI * i * i -12,33 * M_PI)- M_PI;

    for (int j=0;j<=888;j++)
    {
        value += (M_PI * j * i -12,33 * M_PI)- M_PI;

        for (int k=0;k<=777;k++)
        {
            value += (M_PI * k * j -12,33 * M_PI)- M_PI;    
        }
    }
}
}    

My problem:

Without OpenMP, the value of the value variable is: 191773766

With OpenMP, the value of the value variable is: 1092397966

I think this is a synchronization problem, but how can I solve it? I have read a lot about OpenMP, but I can't find how to fix it.

Thanks a lot,

Best regards,

Upvotes: 2

Views: 558

Answers (1)

Mysticial

Reputation: 471199

You're missing the reduction(+:value) clause.

#pragma omp parallel reduction(+:value)  //  add reduction here
{
#pragma omp for

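You need it because the value variable is shared across all threads. The threads update it concurrently without any synchronization, which is a race condition: updates from one thread can overwrite updates from another. (You also take a performance hit from cache-coherency traffic, since every thread keeps writing to the same memory location.)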

The reduction(+:value) clause tells the compiler to create a separate, private copy of value for each thread and then sum the copies into the original variable at the end of the parallel region.
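
To make the effect of the clause concrete, here is a minimal sketch that computes the same sum two ways: once with reduction(+:total), and once with a hand-written equivalent in which each thread accumulates into its own local variable and the partial sums are combined under #pragma omp atomic. The function names sum_with_reduction and sum_by_hand and the loop body are assumptions made up for illustration, not part of the code above, and a real OpenMP runtime may combine the partial results differently.

```cpp
#include <cassert>

// Hypothetical helper: sums i*i for i in [0, n) using the reduction clause.
long long sum_with_reduction(int n) {
    assert(n >= 0);
    long long total = 0;
    // Each thread gets a private copy of total initialized to 0;
    // the copies are added into the original when the loop ends.
#pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += static_cast<long long>(i) * i;
    }
    return total;
}

// Hypothetical helper: roughly what the reduction does, written by hand.
long long sum_by_hand(int n) {
    assert(n >= 0);
    long long total = 0;
#pragma omp parallel
    {
        long long partial = 0;  // private: declared inside the parallel region
#pragma omp for
        for (int i = 0; i < n; i++) {
            partial += static_cast<long long>(i) * i;
        }
        // One synchronized update per thread instead of one per iteration.
#pragma omp atomic
        total += partial;
    }
    return total;
}
```

Both functions return 285 for n = 10. Compile with your compiler's OpenMP flag (for example -fopenmp on GCC or /openmp on MSVC); without it the pragmas are ignored and the code simply runs serially.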


EDIT : Full code at OP's request.

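#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <omp.h>

using namespace std;

int main() {

    double start = omp_get_wtime();

    // Integer stand-in for pi; rename it if your headers already define M_PI as a macro.
    long M_PI = 12;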

    long value = 0;
#pragma omp parallel reduction(+:value)
{
#pragma omp for
for (int i=0;i<=9999;i++)
{
    value += (M_PI * i * i -12,33 * M_PI)- M_PI;

    for (int j=0;j<=888;j++)
    {
        value += (M_PI * j * i -12,33 * M_PI)- M_PI;

        for (int k=0;k<=777;k++)
        {
            value += (M_PI * k * j -12,33 * M_PI)- M_PI;    
        }
    }
}
}    
    double end = omp_get_wtime();
    printf("\n\nseconds = %f\n",end - start);

    cout << value << endl;

    system("pause");
    return 0;
}

Output: (without OpenMP)

seconds = 0.007816
738123776

Output: (with OpenMP - 8 threads)

seconds = 0.012784
738123776

If you want any speedup, you need to make the task much larger.
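
One way to check this on your own machine is to time the same reduction at several problem sizes. The sketch below uses std::chrono instead of omp_get_wtime so that it also builds without OpenMP; the helper name time_sum and the per-iteration work are assumptions chosen for illustration, and the measured times will vary with the machine, the compiler and the optimization level.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs a parallel reduction over n iterations,
// stores the sum in *out, and returns the elapsed wall time in seconds.
double time_sum(long long n, long long* out) {
    assert(n >= 0 && out != nullptr);
    auto start = std::chrono::steady_clock::now();
    long long total = 0;
#pragma omp parallel for reduction(+:total)
    for (long long i = 0; i < n; i++) {
        total += i % 7;  // a small amount of work per iteration
    }
    auto end = std::chrono::steady_clock::now();
    *out = total;
    return std::chrono::duration<double>(end - start).count();
}
```

Calling time_sum with a small n (say 1000) and then a much larger one (say 100000000), with and without the OpenMP flag, lets you compare how much of the run time is thread start-up overhead rather than useful work.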

Upvotes: 7
