GreenEye

Reputation: 163

Multi-threaded speed-up only after making array private

I am trying to learn multi-threaded programming using openmp.

To begin with, I was testing a nested loop with a large number of array accesses and then parallelizing it; the code is attached below. Basically, I have this fairly large array tmp in the interior loop. If I make it shared so that every thread can access and change it, my code actually slows down as the number of threads increases, even though I have written it so that every thread writes exactly the same values to tmp. When I make tmp private instead, I get a speed-up proportional to the number of threads. The number of operations looks exactly the same to me in both cases. Why does it slow down when tmp is shared? Is it because different threads try to access the same address at the same time?

#include <stdio.h>
#include <time.h>
#include <omp.h>

int main(){
    int k,m,n,dummy_cntr=5000,nthread=10,id;
    long num=10000000;
    double x[num],tmp[dummy_cntr];
    double tm,fact;
    clock_t st,fn;

    st=clock();
    omp_set_num_threads(nthread);
#pragma omp parallel private(tmp)
    {
        id = omp_get_thread_num();
        printf("Thread no. %d \n",id);
#pragma omp for
        for (k=0; k<num; k++){
            x[k]=k+1;
            for (m=0; m<dummy_cntr; m++){
                tmp[m] = m;
            }
        }
    }
    fn=clock();
    tm=(double)(fn-st)/CLOCKS_PER_SEC;
}

P.S.: I am aware that using clock() here doesn't really give the wall-clock time; since it sums CPU time across all threads, I have to divide it by the number of threads to get output similar to "time ./a.out".

Upvotes: 1

Views: 339

Answers (2)

user2088790

Reputation:

Your code has race conditions on tmp and m. I don't know what you are really trying to do, but this link might be helpful: Fill histograms (array reduction) in parallel with OpenMP without using a critical section

I tried cleaning up your code. This code allocates memory for tmp separately in each thread, which solves your problem with false sharing on tmp.

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main() {
    int k,m,dummy_cntr=5000;
    long num=10000000;
    double *x, *tmp;
    double dtime;

    x = (double*)malloc(sizeof(double)*num);

    dtime = omp_get_wtime();
    #pragma omp parallel private(tmp, k, m)
    {
        tmp = (double*)malloc(sizeof(double)*dummy_cntr);
        #pragma omp for
        for (k=0; k<num; k++){
            x[k]=k+1;
            for (m=0; m<dummy_cntr; m++){
                tmp[m] = m;
            }
        }
        free(tmp);
    }
    dtime = omp_get_wtime() - dtime;
    printf("%f\n", dtime);
    free(x);
    return 0;
}

Compiled with

gcc -fopenmp -O3 -std=c89 -Wall -pedantic foo.c

Upvotes: 1

Pragmateek

Reputation: 13396

This may be due to cache contention: if part of the array is accessed by two or more threads, it gets cached multiple times, one copy per core. Whenever a core needs data that another core has just changed, it must fetch the latest version from that core's cache, and that transfer takes time. With tmp shared, every thread keeps rewriting the same cache lines, so the cores spend their time shuttling those lines back and forth instead of doing useful work.

Upvotes: 5
