John

Reputation: 125

How does OpenMP behave when using fewer threads than available cores?

My computer has 16 cores. My program is like the following:

omp_set_num_threads(16);
....
#pragma omp parallel for num_threads(2)
for (int i = 1; i <= 2; ++i)
{
    // time-consuming operations
}

Which is more efficient: #pragma omp parallel for num_threads(2) or #pragma omp parallel for num_threads(16)? Or are they the same, since it is shared memory? Note that my loop has fewer than 16 iterations.

Upvotes: 1

Views: 1273

Answers (3)

Zulan

Reputation: 22660

Omit any manual specification such as omp_set_num_threads or num_threads and let the implementation figure it out.

Practically, it should make no noticeable difference either way.

In your code, omp_set_num_threads is completely redundant, as it only applies to subsequent parallel regions that do not specify a num_threads clause. So if you feel you must set the thread count, use either omp_set_num_threads or a num_threads clause, but not both, as combining them just confuses the reader.
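As a rough illustration of that precedence (a sketch, not part of the original answer), the helpers below report the team size seen inside a parallel region, with and without a num_threads clause. The names set_default_threads and team_size are hypothetical, and the _OPENMP guards are only there so the file still compiles when OpenMP is disabled. The runtime may also hand out fewer threads than requested, so the result is an upper bound rather than a guarantee.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: sets the default team size for later parallel
   regions that carry no num_threads clause. No-op without OpenMP. */
static void set_default_threads(int n)
{
#ifdef _OPENMP
    omp_set_num_threads(n);
#else
    (void)n;
#endif
}

/* Hypothetical helper: returns the team size observed inside a parallel
   region. requested > 0 uses a num_threads clause; otherwise the region
   falls back to the default set via omp_set_num_threads. */
static int team_size(int requested)
{
    int size = 1; /* value reported when compiled without OpenMP */
#ifdef _OPENMP
    if (requested > 0) {
        #pragma omp parallel num_threads(requested)
        {
            #pragma omp single
            size = omp_get_num_threads();
        }
    } else {
        #pragma omp parallel
        {
            #pragma omp single
            size = omp_get_num_threads();
        }
    }
#else
    (void)requested;
#endif
    return size;
}
```

With OpenMP enabled (for example gcc -fopenmp), calling set_default_threads(16) and then team_size(2) should report at most 2 threads, while team_size(0) should report at most 16.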

It is conceivable that num_threads(2), however specified, is better. It has a smaller initialization overhead because fewer threads are created. That probably doesn't matter. There is a theoretical argument that the excess threads, which have no work to do, could drain shared resources while waiting (cores shared through hyperthreading, power cap). Still, it should not matter, because OpenMP implementations don't busy-wait indefinitely.

On the other hand, manually specifying num_threads(2) creates a redundancy. What if your loop changes to three iterations, but you forget to update the clause? You waste performance. The same goes for "I put num_threads(X) because I have X cores" kind of code.

Again, just omit it. However, measure your application regularly. If you find a specific indication of a related performance issue, reevaluate the choice based on specific, actionable measurements.
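One possible way to take such a measurement is sketched below, assuming a two-iteration loop like the one in the question. The function busy_work is a hypothetical stand-in for the time-consuming operations, and wall_seconds and time_loop are likewise illustrative names. omp_get_wtime is the standard OpenMP wall-clock timer; the timespec_get fallback is only used when the code is built without OpenMP.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time in seconds. */
static double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Hypothetical stand-in for the "time-consuming operations". */
static double busy_work(int seed)
{
    double acc = 0.0;
    for (long k = 1; k <= 2000000L; ++k)
        acc += (double)((seed * k) % 7) / (double)k;
    return acc;
}

/* Runs the two-iteration loop and returns the elapsed wall time.
   threads > 0 requests that team size; threads <= 0 uses the default.
   The checksum lets the caller confirm that both runs did the same work. */
static double time_loop(int threads, double *checksum)
{
    double results[3] = {0.0, 0.0, 0.0};
#ifdef _OPENMP
    if (threads <= 0)
        threads = omp_get_max_threads();
#else
    (void)threads;
#endif
    double start = wall_seconds();
    #pragma omp parallel for num_threads(threads)
    for (int i = 1; i <= 2; ++i)
        results[i] = busy_work(i);
    double elapsed = wall_seconds() - start;
    *checksum = results[1] + results[2];
    return elapsed;
}
```

Comparing time_loop(2, &c), time_loop(16, &c), and time_loop(0, &c) over several repetitions gives a first impression of whether the thread count matters for a given workload. A single run is too noisy to draw conclusions from.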

Upvotes: 1

Emmet

Reputation: 6401

It looks like your for-loop isn't really a for-loop at all: you only have two iterations. A better solution might be to use OpenMP sections:

#pragma omp parallel sections
{
    #pragma omp section
    {
        // Time-consuming operations
    }
    #pragma omp section
    {
        // Other independent time-consuming operations
    }
}

Upvotes: 0

HEKTO

Reputation: 4191

Your OpenMP pragma is not correct. It should be:

#pragma omp parallel for num_threads(2)

Please look here for explanations - and this whole article is very good for learning OpenMP.

Upvotes: 0
