Reputation: 33
I have basically three questions on OpenMP.
Q1. Does OpenMP provide mutual exclusion for shared variables? Consider the following simple matrix multiplication code with three nested loops, parallelised using OpenMP in C++. Here A, B, and C are dynamically allocated variables of type double**, and threadCount is assigned an appropriate value.
#pragma omp parallel
{
    int tid = omp_get_thread_num();
    int fraction = (N / threadCount);   // rows handled by each thread
    int start = tid * fraction;         // first row for this thread
    int end = (tid + 1) * fraction;     // one past the last row for this thread
    for (int i = start; i < end; i++)
    {
        for (int j = 0; j < N; j++)
        {
            C[i][j] = 0;
            for (int k = 0; k < N; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
    }
}
The thing here is that mutual exclusion for reading from A and B and writing to C is unnecessary: A and B are only read, and each thread writes to a disjoint range of rows of C. But if extra overhead is incurred because of a mutex on A, B, and C, it would be favourable to relieve A, B, and C of that mutex. How can this be achieved?
Q2. Consider introducing two private variables tempA and tempB into the above code as follows.
double **tempA, **tempB;
#pragma omp parallel private(tempA, tempB)
{
    int tid = omp_get_thread_num();
    int fraction = (N / threadCount);
    int start = tid * fraction;
    int end = (tid + 1) * fraction;
    tempA = A;   // each thread holds its own private copy of the pointer
    tempB = B;
    for (int i = start; i < end; i++)
    {
        for (int j = 0; j < N; j++)
        {
            C[i][j] = 0;
            for (int k = 0; k < N; k++)
                C[i][j] += tempA[i][k] * tempB[k][j];
        }
    }
}
Would this strategy relieve A and B of any mutex in the calculations? I mean, although the same memory locations (referred to by A and tempA, and by B and tempB) are accessed by all threads, each thread refers to them through its own local variables.
Q3. Also, I would like to know about the difference between declaring the variables tempA and tempB inside the parallel code segment and declaring them outside. Of course, then we would not need the private clause in the directive. Is there any other significant difference?
Upvotes: 3
Views: 3593
Reputation: 1223
By default no synchronization mechanism is provided; OpenMP only gives you the possibility to request one explicitly. Use #pragma omp atomic, #pragma omp atomic read, or #pragma omp atomic write for such purposes. Another option is a critical section, #pragma omp critical, which is more generic and powerful, but not always required.
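As a minimal sketch of both constructs (the shared sum and the loop bounds are hypothetical, not taken from the question's code):

#include <omp.h>
#include <cstdio>

int main()
{
    double sum = 0.0;
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
    {
        #pragma omp atomic          // protects a single memory update, cheap
        sum += i * 0.5;
    }

    #pragma omp parallel
    {
        #pragma omp critical        // protects an arbitrary block, one thread at a time
        {
            printf("thread %d done, sum so far %f\n", omp_get_thread_num(), sum);
        }
    }
    return 0;
}

Use atomic when the protected operation is a single read, write, or update; reserve critical for longer blocks, since it serialises everything inside it.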
Accessing the same memory location through different variables does not change anything with respect to concurrent access. If you need a guarantee, use atomics.
Variables declared inside the #pragma omp parallel block are private to each thread. See this and this post for more information.
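As a sketch of the difference (the function and variable names are illustrative):

#include <omp.h>

void example(double **A, double **B)
{
    double **tempA;                        // shared unless listed in private(...)
    #pragma omp parallel private(tempA)    // tempA: one uninitialised copy per thread
    {
        double **tempB = B;                // declared inside: automatically private
        tempA = A;
        // ... each thread uses its own tempA and tempB here ...
    }
}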
Also, if you are using C++11, you can use std::atomic variables.
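A minimal C++11 sketch (the counter is just an illustration):

#include <atomic>
#include <omp.h>

int main()
{
    std::atomic<long> counter{0};
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
        counter.fetch_add(1, std::memory_order_relaxed);   // lock-free increment
    return counter == 1000 ? 0 : 1;
}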
Upvotes: 2