I have a few global variables that are used by various functions in a C program. I am running OpenMP threads in parallel, and each thread calls these functions after assigning different values to the global variables. Is there any alternative to threadprivate? It's not clear to me how to use the copyin clause.
The sample code looks like this:
int main (void) {
    int low=5,high=0;
    ----
    func1(int *k) { do something with low,high and set value for k}
    func2(int *k) { do something with low,high and set value for k}
    func3(int *k) { do something with low,high and set value for k}
    ----
    int i;
    int *arr= malloc(CONSTANT1*sizeof(int));
    #pragma omp parallel num_threads(numberOfThreads) threadprivate(low,high) private(i) shared(arr)
    {
        #pragma omp for
        for(i=0;i<CONSTANT1;i++) {
            low=low+CONSTANT2*i;
            high=low+CONSTANT2;
            func1(&arr[i]);
            func2(&arr[i]);
            func3(&arr[i]);
            ----
        }
    }
}
Or should I use private(low,high) and pass them to each function on every call?
Please advise.
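Your code snippet is rather unclear and appears buggy: threadprivate is a standalone directive for file-scope or static variables, not a clause of the parallel directive, and the functions can only see low and high if these are declared outside main. Let's assume you had the following in mind when you asked your question: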
int low=5, high=10;
#pragma omp threadprivate(low, high)

void func1(int *k) { /* do something with low, high and set value for k */ }
void func2(int *k) { /* do something with low, high and set value for k */ }
void func3(int *k) { /* do something with low, high and set value for k */ }
[...]
int main (void) {
    [...]
    int i;
    int *arr = malloc(CONSTANT1*sizeof(int));
    #pragma omp parallel num_threads(numberOfThreads) private(i)
    {
        #pragma omp for
        for (i=0; i<CONSTANT1; i++) {
            low = low + CONSTANT2 * i;
            high = low + CONSTANT2;
            func1(&arr[i]);
            func2(&arr[i]);
            func3(&arr[i]);
            [...]
        }
    }
}
Then, although the use of threadprivate makes the code valid, you still have a problem because of the line low = low + CONSTANT2 * i;. This line depends on the previous value of low, so the order of the iterations matters and the loop is not suited for parallelisation as written. However, if you change your code like this:
int lowinit = low;
#pragma omp for
for (i=0; i<CONSTANT1; i++) {
    /* closed form: no dependence on the previous iteration */
    low = lowinit + CONSTANT2 * i*(i+1)/2;
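Then your code becomes correct (provided your functions do not change low internally). In the serial loop, the value of low at iteration i is the initial value plus CONSTANT2 * (0 + 1 + ... + i), which equals lowinit + CONSTANT2 * i*(i+1)/2, so each iteration can now compute it independently of the others.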
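Since you asked about copyin: here is a minimal sketch of how it could fit in, not a definitive implementation. It assumes fill_parallel is called from serial code, and the names fill_serial, fill_parallel, count_mismatches and the body of func1 are hypothetical stand-ins I made up for illustration. The copyin clause copies the master thread's values of the threadprivate variables into every thread's copy on entry to the parallel region, so lowinit is the same on all threads even if low was changed before the region.

```c
#include <assert.h>
#include <stddef.h>

static int low = 5, high = 10;
#pragma omp threadprivate(low, high)

/* Hypothetical stand-in for func1..func3: reads the globals, writes *k. */
static void func1(int *k) { *k = low + high; }

/* Serial reference using the original recurrence on low. */
void fill_serial(int *arr, int n, int step, int low0) {
    assert(arr != NULL);
    int l = low0;
    for (int i = 0; i < n; i++) {
        l = l + step * i;
        arr[i] = l + (l + step);   /* low + high, as in func1 */
    }
}

/* Parallel version using the closed form; assumed to be called from serial code. */
void fill_parallel(int *arr, int n, int step, int low0) {
    assert(arr != NULL);
    low = low0;   /* changes only the calling (master) thread's copy */
    #pragma omp parallel copyin(low, high)
    {
        int lowinit = low;   /* identical on every thread thanks to copyin */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            low = lowinit + step * i * (i + 1) / 2;
            high = low + step;
            func1(&arr[i]);
        }
    }
}

/* Compares both versions element by element; n is limited to 64 here. */
int count_mismatches(int n, int step, int low0) {
    int a[64], b[64];
    assert(n >= 0 && n <= 64);
    fill_serial(a, n, step, low0);
    fill_parallel(b, n, step, low0);
    int bad = 0;
    for (int i = 0; i < n; i++) {
        if (a[i] != b[i]) bad++;
    }
    return bad;
}
```

Without copyin, the threads other than the master would start from the initializer value 5 rather than from low0, so the results would only match when low0 happens to be 5.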
In terms of performance, I'm not sure that whether high and low are globals or parameters will make much of a difference. However, it is clear to me that passing them as parameters rather than using global variables makes the code much cleaner and less error-prone.
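To illustrate the parameter-passing alternative, here is a rough sketch under my own assumptions: the Range struct, range_for, fill and the bodies of func1 and func2 are all hypothetical, chosen only to show the shape of the code. Because r is declared inside the loop body, each iteration (and therefore each thread) gets its own copy, and neither threadprivate nor private is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bundle replacing the two globals. */
typedef struct {
    int low;
    int high;
} Range;

/* Closed-form bounds for iteration i, independent of other iterations. */
static Range range_for(int low0, int step, int i) {
    Range r;
    r.low = low0 + step * i * (i + 1) / 2;
    r.high = r.low + step;
    return r;
}

/* Hypothetical stand-ins for the real functions: they receive the
   bounds explicitly instead of reading global state. */
static void func1(const Range *r, int *k) { *k = r->high - r->low; }
static void func2(const Range *r, int *k) { *k += r->low; }

void fill(int *arr, int n, int low0, int step) {
    assert(arr != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        Range r = range_for(low0, step, i);   /* local, hence private */
        func1(&r, &arr[i]);
        func2(&r, &arr[i]);
    }
}
```

The cost is one extra argument per call, and in exchange the functions no longer depend on hidden state.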
Finally, if the values of high and low matter after the parallel loop or region, be aware that the values kept are those of the master thread, which are likely to differ from the ones they would have had without OpenMP. In that case, you can add these lines to your code where necessary to ensure correctness:
low = lowinit + CONSTANT2 * (CONSTANT1-1)*CONSTANT1/2;
high = low + CONSTANT2;
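Put together, the fix-up could look roughly like the sketch below. The function names serial_final_low, parallel_final_low and current_high are hypothetical, and I assume parallel_final_low is called from serial code with n > 0. Note that lowinit is saved before the parallel loop, since the master thread's low is overwritten during it.

```c
#include <assert.h>

static int low = 5, high = 10;
#pragma omp threadprivate(low, high)

/* Reference: final value of low after the original serial recurrence. */
int serial_final_low(int low0, int n, int step) {
    int l = low0;
    for (int i = 0; i < n; i++) {
        l = l + step * i;
    }
    return l;
}

/* Runs the parallel loop, then restores the values a serial run would leave. */
int parallel_final_low(int low0, int n, int step) {
    assert(n > 0);
    low = low0;
    const int lowinit = low;   /* saved before the loop overwrites low */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        low = lowinit + step * i * (i + 1) / 2;
        high = low + step;
        /* func1, func2 and func3 would be called here */
    }
    /* Closed form for the last iteration, i = n - 1. */
    low = lowinit + step * (n - 1) * n / 2;
    high = low + step;
    return low;
}

/* Accessor for the calling thread's copy of high. */
int current_high(void) { return high; }
```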