Reputation: 45
I've been assigned to implement the idea of a reduction variable without using the reduction clause. I set up this basic code to test it.
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
for (int i = 0; i < n; ++i)
{
val += 1;
}
sum += val;
so at the end sum == n.
Each thread should set val as a private variable, and then the addition to sum should be a critical section where the threads converge, e.g.
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel for private(i, val) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
I can't figure out how to maintain the private instance of val for the critical section. I have tried surrounding the whole thing in a larger pragma, e.g.
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel private(val) shared(sum)
{
#pragma omp parallel for private(i) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
but I don't get the correct answer. How should I set up the pragmas and clauses to do this?
Upvotes: 3
Views: 3952
Reputation: 74385
You do not need to explicitly specify shared variables in OpenMP, as variables from outer scopes are shared by default (unless a default(none) clause is specified). As private variables have undefined initial values, you should zero the private copy before the accumulation loop. Loop counters are automatically recognised and made private, so there is no need to explicitly declare them as such. Also, since you are simply updating a value, you should use an atomic construct, as it is more lightweight than a full critical section.
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel private(val) num_threads(nthreads)
{
val = 0.0;
#pragma omp for
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp atomic update
sum += val;
}
The update clause was added to the atomic construct in OpenMP 3.1, so if your compiler conforms to an earlier OpenMP version (e.g. if you use MSVC++, which only supports OpenMP 2.0 even in VS2012) you would have to remove the update clause. As val is not used outside the parallel region, it could be declared in the inner scope, as in the answer of veda, and then it automatically becomes a private variable.
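To make the last point concrete, here is a minimal sketch of the same pattern with val declared inside the parallel region. The function name sum_with_atomic and its parameters are hypothetical, introduced only for illustration. It uses the plain atomic form (no update clause), so it should also be accepted by compilers that predate OpenMP 3.1.

```c
#include <assert.h>

/* Hypothetical helper (not from the original post): adds 1.0 to a
   per-thread partial sum n times, then merges the partial sums. */
double sum_with_atomic(int n, int nthreads)
{
    assert(nthreads > 0);      /* num_threads expects a positive value */
    double sum = 0.0;          /* declared outside the region: shared */

    #pragma omp parallel num_threads(nthreads)
    {
        double val = 0.0;      /* declared inside the region: one initialised copy per thread */

        #pragma omp for
        for (int i = 0; i < n; ++i)
        {
            val += 1;
        }

        /* Each thread adds its partial sum exactly once. */
        #pragma omp atomic
        sum += val;
    }
    return sum;
}
```

Built with OpenMP enabled (for example with GCC's -fopenmp option), a call such as sum_with_atomic(100000000, 4) should return a value equal to n. Without OpenMP the pragmas are ignored and the function simply runs serially.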
Note that parallel for is a shortcut for nesting two OpenMP constructs, parallel and for:
#pragma omp parallel for sharing_clauses scheduling_clauses
for (...) {
}
is equivalent to:
#pragma omp parallel sharing_clauses
#pragma omp for scheduling_clauses
for (...) {
}
This is also true for the other two combined constructs: parallel sections and parallel workshare (Fortran only).
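As a small illustration of that equivalence, the following sketch fills an array once with the combined construct and once with the split form. The function names fill_combined and fill_split are made up for this example. Both should produce the same contents, because each iteration writes only its own element.

```c
#include <assert.h>

/* Combined form: one directive creates the team and shares out the loop. */
void fill_combined(int *out, int n, int nthreads)
{
    assert(nthreads > 0);
    #pragma omp parallel for num_threads(nthreads)
    for (int i = 0; i < n; ++i)
    {
        out[i] = 2 * i;
    }
}

/* Split form: "parallel" creates the team, "for" shares out the loop. */
void fill_split(int *out, int n, int nthreads)
{
    assert(nthreads > 0);
    #pragma omp parallel num_threads(nthreads)
    {
        #pragma omp for
        for (int i = 0; i < n; ++i)
        {
            out[i] = 2 * i;
        }
    }
}
```

The split form is the one to reach for when each thread has to do extra work before or after the loop, such as zeroing a private accumulator or merging it into a shared total.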
Upvotes: 2
Reputation: 6594
There are quite a few flaws in your programs. Let's look at each one (the flaws are noted in the comments).
Program one
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel for private(i, val) shared(n) num_threads(nthreads)
for (int i = 0; i < n; ++i)
{
val += 1;
}
// The parallel region ends here, together with the for loop.
// "pragma omp parallel for" creates the threads, and their scope extends
// only to the end of that loop, so every private copy of val is gone by now.
// Only one thread (i.e. the main thread) enters the critical section,
// and it adds the original val, not the per-thread results.
#pragma omp critical
{
sum += val;
}
Program two
int i = 0;
int n = 100000000;
double sum = 0.0;
double val = 0.0;
#pragma omp parallel private(val) shared(sum)
// pragma omp parallel creates the threads
{
#pragma omp parallel for private(i) shared(n) num_threads(nthreads)
// There is no need to create another set of threads here.
// Note that "pragma omp parallel" always starts a new parallel region,
// so this nests a second region inside the first, which is wrong:
// each outer thread now runs the entire loop instead of a share of it.
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
The best solution would be
int n = 100000000;
double sum = 0.0;
int nThreads = 5;
#pragma omp parallel shared(sum, n) num_threads(nThreads) // Create the OpenMP threads, and declare the shared and private variables here.
// Also declare the maximum number of threads.
// Note that num_threads(nThreads) does not guarantee that nThreads threads are actually created.
// It only limits the number of threads that can be created, so the team may be smaller.
{
double val = 0.0; // val can be declared as local variable (for each thread)
#pragma omp for nowait // a plain "omp for": the threads already exist, so no "omp parallel" is needed here
// nowait removes the barrier at the end of the loop, so a thread that has finished its iterations can go straight to the critical section without waiting for the others
for (int i = 0; i < n; ++i)
{
val += 1;
}
#pragma omp critical
{
sum += val;
}
}
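One way to check this pattern is to compare it with the built-in reduction clause that it is meant to imitate. The sketch below wraps both versions in functions. The names sum_manual and sum_reduction are hypothetical and were chosen only for this example.

```c
#include <assert.h>

/* The hand-written reduction from above, wrapped in a function. */
double sum_manual(int n, int nthreads)
{
    assert(nthreads > 0);
    double sum = 0.0;

    #pragma omp parallel shared(sum, n) num_threads(nthreads)
    {
        double val = 0.0;            /* per-thread partial sum */

        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
        {
            val += 1;
        }

        #pragma omp critical
        {
            sum += val;              /* one thread at a time merges its result */
        }
    }
    return sum;
}

/* Reference version that uses the reduction clause, for comparison only. */
double sum_reduction(int n, int nthreads)
{
    assert(nthreads > 0);
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum) num_threads(nthreads)
    for (int i = 0; i < n; ++i)
    {
        sum += 1;
    }
    return sum;
}
```

For this test both functions should return n regardless of how many threads are actually used, since every partial sum is a whole number that a double represents exactly.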
Upvotes: 6