Denys P.

Reputation: 266

OpenMP, use all cores with parallel for

I have a computer with 4 cores and an OpenMP application with 2 weighty tasks.

int main()
{
    #pragma omp parallel sections
    {
        #pragma omp section
        WeightyTask1();

        #pragma omp section
        WeightyTask2();
    }

    return 0;
}

Each task has a weighty part like this:

#pragma omp parallel for
for (int i = 0; i < N; i++)
{
    ...
}

I compiled the program with the -fopenmp flag and ran export OMP_NUM_THREADS=4. The problem is that only two cores are loaded. How can I use all cores in my tasks?

Upvotes: 6

Views: 6012

Answers (1)

sehe

Reputation: 392893

My initial reaction was: You have to declare more parallelism.

You have defined two tasks that can run in parallel. Any attempt by OpenMP to run them on more than two cores will slow you down (because of cache locality and possible false sharing).

Edit: If the parallel for loops do any significant amount of work (say, not under 8 iterations) and you still see no more than 2 cores in use, look at:

  • omp_set_nested()
  • the OMP_NESTED=TRUE|FALSE environment variable

    This environment variable enables or disables nested parallelism. The setting of this environment variable can be overridden by calling the omp_set_nested() runtime library function.

    If nested parallelism is disabled, nested parallel regions are serialized and run in the current thread.

    In the current implementation, nested parallel regions are always serialized. As a result, OMP_SET_NESTED does not have any effect, and omp_get_nested() always returns 0. If -qsmp=nested_par option is on (only in non-strict OMP mode), nested parallel regions may employ additional threads as available. However, no new team will be created to run nested parallel regions. The default value for OMP_NESTED is FALSE.
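As a rough illustration of the nested setup described above, here is a minimal sketch, assuming GCC with -fopenmp and a runtime that really does create nested teams (the quoted documentation describes an implementation that does not). The names weighty_task and run_both are hypothetical stand-ins for the question's WeightyTask1/WeightyTask2 and main; the 2 x 2 thread split is an assumption chosen to match the 4 cores in the question. omp_set_nested() is the call named above; omp_set_max_active_levels() is its newer counterpart, and both are called here because older runtimes may need the first one.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for one weighty task: sums 0..n-1 in a parallel loop.
   When nesting is enabled, this inner region gets its own team of 2 threads. */
static long long weighty_task(int n)
{
    long long sum = 0;

    assert(n >= 0);

    #pragma omp parallel for num_threads(2) reduction(+:sum)
    for (int i = 0; i < n; i++)
    {
        sum += i;
    }

    return sum;
}

/* Hypothetical driver: runs two tasks as sections (outer team of 2 threads),
   each of which opens a nested parallel for. Returns the sum of both results. */
long long run_both(int n)
{
    long long a = 0;
    long long b = 0;

#ifdef _OPENMP
    /* Allow nested parallel regions. omp_set_nested() is deprecated since
       OpenMP 5.0, where omp_set_max_active_levels() alone is sufficient. */
    omp_set_nested(1);
    omp_set_max_active_levels(2);
#endif

    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        a = weighty_task(n);

        #pragma omp section
        b = weighty_task(n);
    }

    return a + b;
}
```

Calling run_both(N) from main() would correspond to the structure in the question. The #ifdef _OPENMP guards let the same file build without -fopenmp, in which case the pragmas are ignored and everything runs serially. Whether all four cores are actually used still depends on the runtime; setting OMP_NESTED=TRUE in the environment is the alternative to the library calls.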

Upvotes: 5
