dimpep
dimpep

Reputation: 73

Parallel update of array contents in OpenMP - Concurrent add element

I have the following code which I would like to make it parallel (pseudocode)

int na = 10000000;
int nb = na;
double A[na];
double B[2*na];
double a;

for(int j=0;j<nb;j++)
{
  i = rand() % na;
  A[i]+=5.0*i;
  B[i+10]+=6.0*i*i;
}

Of course, I cannot use #pragma omp parallel for because sometimes (which cannot be predicted) the same element will be accessed by two threads at the same time. How can this block of code be parallelized? Thank you

Upvotes: 2

Views: 3600

Answers (1)

Zulan
Zulan

Reputation: 22660

There are two ways to do that:

  • Use an atomic update on the values

    #pragma omp parallel for
    for(int j=0;j<nb;j++)
    {
        // make sure to declare i locally!
        int i = fun();
        #pragma omp atomic
        A[i]+=5.0*i;
    }
    

    This is the simplest way. Each write is executed atomically and therefore more expensive. You also need to consider that accessing adjacent elements from multiple threads becomes expensive (false sharing). Use this if A is large and you do a lot of computations per uptate.

  • Use an array-reduction

    #pragma omp parallel for reduction(+:A)
    for(int j=0;j<nb;j++)
    {
        // make sure to declare i locally!
        int i = fun();
        A[i]+=5.0*i;
    }
    

    This creates a local copy of A for each thread which is added together to the outside A after the parallel region. This requires more memory and some computation after, but parallel code itself can work most efficiently. Use this if A is small and the are little computations for each update.

BTW: Never use rand() in parallel applications, it is not defined as thread safe and sometimes it is implemented with a lock and becomes horribly inefficient.

EDIT: In your example with B you can safely apply either omp atomic or reduction separately to the statement since each operation only needs to be performance atomically independently.

Upvotes: 1

Related Questions