Circus
Circus

Reputation: 31

OpenMP: Shared variables in single, nowait construct

so I have this little snippet of code:

int a = 10;
bool finished = false;

#pragma omp parallel num_threads(3) shared(a, finished)
{
    while(!finished) {

        #pragma omp single nowait
        {
            printf("[%d] a is: %d\n", omp_get_thread_num(), a);
            a--;
            finished = true;
        }

    }
}

The output is

[0] a is: 10
[2] a is: 10
[1] a is: 10

Which is not what I expect at all. I realise that all the threads may make it into the single construct before exiting the while loop, but why do they all say the same a? The second thread that enters should have a = 9, and the third one a = 8. I've tried #pragma omp flush and #pragma omp atomic on a but to no use. I'd like to use a for a comparison in that block (i.e. if(a == 10)) so it's crucial that the value is updated once another thread enters the single block. What am I doing wrong?

Upvotes: 1

Views: 3699

Answers (3)

user0815
user0815

Reputation: 1406

The problem with your code is that the single directive has basically no effect. By specifying the nowait clause other threads will not wait at the end of the block and instead enter the next loop iteration immediately.

That means, the first thread arriving at the single construct will execute the block in the first iteration. The remaining threads will skip the block and immediately arrive again at the single construct in their second iteration. One of the two remaining threads then enters the block as well (since it's a new iteration and thus another occurence of the block). The other again skips and enters immediately in its third iteration. At the end, all threads execute nearly concurrently and print the initial value of a since none of the threads has succeeded to decrement a until this point in time.

You will see a changing if you swap the statements inside the single block as follows. However, the values and their order will be undetermined.

#pragma omp single nowait
{
    a--;
    printf("[%d] a is: %d\n", omp_get_thread_num(), a);
    finished = true;
}

Upvotes: 2

Gilles
Gilles

Reputation: 9489

I really am not sure what you try to achieve here... Actually, without any sort of work done by other threads outside of the single block, I don't see the point of the structure.

Anyway, I tried to extrapolate a bit and extended your example by adding a printf() statement outside of the block which also prints the value of a to see how this is transmitted to the other threads. Moreover, since you used a single directive, I assumed you wanted only one thread executing the block, even if it is on a while() loop. So it looked very well suited for the use of OpenMP locks...

Here is what I came up with:

#include <stdio.h>
#include <omp.h>
#include <unistd.h>

int main() {
    int a = 10;
    bool finished = false;
    omp_lock_t lock;
    omp_init_lock( &lock );
    #pragma omp parallel num_threads( 3 ) shared( a, finished )
    {
        while( !finished ) {
            if ( omp_test_lock( &lock ) ) {
                printf( "[%d] a is: %d\n", omp_get_thread_num(), a );
                #pragma omp atomic update
                a--;
                usleep( 10 );
                finished = true;
                #pragma omp flush( finished )
                omp_unset_lock( &lock );
            }
            #pragma omp flush( finished, a )
            printf( "[%d] outside of if block, a is: %d\n", omp_get_thread_num(), a );
        }
    }
    return 0;
}

I've added a call to usleep() to delay a bit the execution of the instructions inside the if block and give the other threads the opportunity to print something. I've tested it with gcc version 4.9 and 5.3 and Intel compiler 16.0, and all 3 give me the same sort of output, (with obviously some variations in the order and number of printings between runs).

The results looks like this:

~/tmp$ icpc -fopenmp omplock.cc
~/tmp$ ./a.out 
[0] a is: 10
[1] outside of if block, a is: 10
[1] outside of if block, a is: 9
[1] outside of if block, a is: 9
[1] outside of if block, a is: 9
[1] outside of if block, a is: 9
[1] outside of if block, a is: 9
[2] outside of if block, a is: 10
[1] outside of if block, a is: 9
[0] outside of if block, a is: 9

Would this approach address your needs?

Upvotes: 0

NoseKnowsAll
NoseKnowsAll

Reputation: 4623

I don't believe you want a single thread performing your code block, but rather you want to ensure that each thread is not stomping over the other threads' data in a and finished. This is accomplished with #pragma omp critical. Furthermore, you cannot have a nowait clause because by definition, the other threads will not enter the code block until the other thread has finished the critical region.

while(!finished) {

    // only 1 thread will enter this region at a time
    #pragma omp critical
    {
        printf("[%d] a is: %d\n", omp_get_thread_num(), a);
        a--;
        finished = true;
    }

}

Also, note that having this sort of thread inter-dependence is likely to kill performance. I would suggest you avoid this and change your algorithm unless you see no other way to do it.

Upvotes: 0

Related Questions