Reputation: 20119
Suppose an array arr of SIZE=128Mb with values from 0 to 128Mb-1. Now suppose the following code:
    #pragma omp parallel num_threads(NUM_THREADS)
    {
        int me = omp_get_thread_num();
        odds_local[me] = 0;
        int count = 0;
        #pragma omp for
        for (int i = 0; i < SIZE; i++)
            if (arr[i] % 2 != 0)
                count++;
        odds_local[me] = count;
    }
and finally a loop that sums the values of odds_local to get the final result. If I time this and report the user time on Linux, I get 0.97s for both 1 thread and 2 threads. That is to say, no speedup whatsoever.
Is there anything I should change in this program to improve the speedup?
Upvotes: 0
Views: 116
Reputation: 2318
I ran your exact code and with 1 thread I get 390ms, with 2 I get 190ms. Your problem is not in the code. It has to be something basic. These are the things I can think of:
g++ filename -fopenmp
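Another basic thing worth ruling out, assuming the 0.97s figure is the user line from time: user time is CPU time summed over all threads, so it can stay near 0.97s even when the loop finishes in about half the wall-clock time. Below is a minimal sketch for checking both the compile flag and the wall-clock time. It assumes a std::vector<int> input, and the helper names openmp_enabled, count_odds and time_count_odds are made up for illustration, they are not from your code. It also swaps the odds_local array for an OpenMP reduction and uses std::chrono for timing.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// True only when the compiler was run with OpenMP enabled (e.g. g++ -fopenmp).
// Without the flag, the pragma below is ignored and the loop runs serially.
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Count the odd values in arr (hypothetical helper).
// Each thread accumulates a private count; the reduction clause sums them.
long long count_odds(const std::vector<int>& arr, int threads) {
    assert(threads >= 1);
    (void)threads;  // only referenced by the pragma when OpenMP is enabled
    long long count = 0;
    const long long n = static_cast<long long>(arr.size());
    #pragma omp parallel for num_threads(threads) reduction(+:count)
    for (long long i = 0; i < n; i++)
        if (arr[i] % 2 != 0)
            count++;
    return count;
}

// Run count_odds once and return the elapsed wall-clock time in seconds.
// The count itself is written to *result.
double time_count_odds(const std::vector<int>& arr, int threads, long long* result) {
    const auto start = std::chrono::steady_clock::now();
    *result = count_odds(arr, threads);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

If openmp_enabled() returns false, the binary was built without -fopenmp and every thread count runs the same serial loop. Otherwise, compare the seconds returned by time_count_odds for 1 and 2 threads (or the real line of time) rather than the user time.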
Upvotes: 1