nullgraph

Reputation: 307

OpenMP nested parallelism with sections

I have the following situation: a big outer for loop essentially calls a function foo(). Within foo(), bar1() and bar2() can be carried out concurrently, and bar3() needs to run after both bar1() and bar2() are done. I have parallelized the big outer loop and put bar1() and bar2() in sections. I assume that each outer-loop thread will generate its own section threads; is this correct?

If the assumption above is correct, how do I get bar3() to run only after the threads carrying out bar1() and bar2() have finished? If I use critical, it will halt all threads, including those of the outer for loop. If I use single, there's no guarantee that bar1() and bar2() will have finished.

If the assumption above is not correct, how do I force the outer loop threads to reuse threads for bar1() and bar2() instead of generating new threads every time?

Note that temp is a variable whose init and clear are expensive, so I pulled init and clear outside the for loop. This further complicates matters because both bar1() and bar2() need some kind of temp variable. Ideally, temp should be initialized and cleared once for each thread that is created, but I'm not sure how to force that for the threads generated for sections. (Without the sections pragma, it works fine in the parallel block.)

main(){
    #pragma omp parallel private(temp)
    init(temp);
    #pragma omp for schedule(static)
    for (i=0;i<100000;i++) {
        foo(temp);
    }
    clear(temp);
}

foo() {
    init(x); init(y);
    #pragma omp sections
    {
        { bar1(x,temp); }
        #pragma omp section
        { bar2(y,temp); }
    }
    bar3(x,y,temp);
}

Upvotes: 0

Views: 1483

Answers (1)

warunapww

Reputation: 1006

I believe that simply parallelizing the for loop should give you enough parallelism to saturate the CPU's resources. But if you really want to run the two functions in parallel, the following code should work.

main(){
    #pragma omp parallel private(temp) 
    {
        init(temp);
        #pragma omp for schedule(static)
        for (i=0;i<100000;i++) {
            foo(temp);
        }
        clear(temp);
    }
}

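foo() {
    init(x); init(y);

    // x is local to foo(), so it would default to firstprivate in the task;
    // shared(x) lets bar1() write to the x that bar3() reads.
    #pragma omp task shared(x)
    bar1(x,temp);

    // The encountering thread runs bar2() while the task may run elsewhere.
    bar2(y,temp);

    // Wait for the bar1() task before calling bar3().
    #pragma omp taskwait

    bar3(x,y,temp);
}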
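
For reference, here is a minimal compilable sketch of the same task/taskwait pattern. A few things in it are my assumptions rather than anything from the question: the arithmetic inside bar1(), bar2() and bar3() is made up, run_demo() is a hypothetical driver standing in for main(), and each thread gets two scratch variables (temp1, temp2), because bar1() and bar2() may run at the same time and probably should not write to the same temp. As far as I know, an orphaned sections construct called from inside the omp for loop does not create extra threads (nesting one worksharing construct directly inside another is not allowed), which is one reason a task is used here.

```c
#include <assert.h>

/* Hypothetical stand-ins for the question's bar1/bar2/bar3. */
static void bar1(long *x, long *scratch, long i) { *scratch = 2 * i; *x = *scratch + 1; }
static void bar2(long *y, long *scratch, long i) { *scratch = 3 * i; *y = *scratch + 2; }
static long bar3(long x, long y) { return x + y; }

/* Called by every thread of the enclosing parallel region. */
static long foo(long i, long *temp1, long *temp2)
{
    long x = 0, y = 0;

    /* x is local, so it would default to firstprivate in the task;
       shared(x) makes the task write to this x. */
    #pragma omp task shared(x) firstprivate(i, temp1)
    bar1(&x, temp1, i);

    /* The encountering thread does bar2() in the meantime. */
    bar2(&y, temp2, i);

    /* bar3() must not start before the bar1() task has finished. */
    #pragma omp taskwait
    return bar3(x, y);
}

/* Hypothetical driver: sums foo(i) over i = 0 .. n-1. */
long run_demo(long n)
{
    assert(n >= 0);
    long total = 0;

    #pragma omp parallel reduction(+:total)
    {
        /* Per-thread scratch: the expensive init would happen once here. */
        long temp1 = 0, temp2 = 0;

        #pragma omp for schedule(static)
        for (long i = 0; i < n; i++)
            total += foo(i, &temp1, &temp2);

        /* The matching clear would go here, once per thread. */
    }
    return total;
}
```

Each iteration contributes (2i + 1) + (3i + 2) = 5i + 3, so run_demo(n) should return 5n(n-1)/2 + 3n regardless of the thread count. The code also gives the same result when compiled without OpenMP enabled, since the pragmas are then ignored and everything runs serially.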

Upvotes: 1
