Reputation: 6255
I would like to "nest" parallel for using OpenMP. Here is a toy code:
#include <iostream>
#include <cmath>

void subproblem(int m) {
#pragma omp parallel for
    for (int j{0}; j < m; ++j) {
        double sum{0.0};
        for (int k{0}; k < 10000000; ++k) {
            sum += std::cos(static_cast<double>(k));
        }
#pragma omp critical
        { std::cout << "Sum: " << sum << std::endl; }
    }
}

int main(int argc, const char *argv[]) {
    int n{2};
    int m{8};
#pragma omp parallel for
    for (int i{0}; i < n; ++i) {
        subproblem(m);
    }
    return 0;
}
Here is what I want:
So far, I have only found solutions that either disable nested parallelism or always allow it, but I am looking for a way to enable it only if the number of threads launched stays below the number of cores.
Is there an OpenMP solution for that using tasks?
Upvotes: 3
Views: 1337
Reputation: 2039
Does taskloop address your unsimplified issue? The third code block on this page shows it in use, and below is your code updated with it:
#include <iostream>
#include <cmath>

void subproblem(int m) {
#pragma omp taskloop
    for (int j{0}; j < m; ++j) {
        double sum{0.0};
        for (int k{0}; k < 10000000; ++k) {
            sum += std::cos(static_cast<double>(k));
        }
#pragma omp critical
        { std::cout << "Sum: " << sum << std::endl; }
    }
}

int main(int argc, const char *argv[]) {
    int n{2};
    int m{8};
#pragma omp parallel for
    for (int i{0}; i < n; ++i) {
        subproblem(m);
    }
    return 0;
}
Upvotes: 0
Reputation: 9519
Doesn't the if clause of the parallel construct just do it all for you?
Here is what the 4.0 OpenMP standard says on page 44:
The syntax of the parallel construct is as follows:
#pragma omp parallel [clause[ [, ]clause] ...] new-line structured-block
where clause is one of the following:
if(scalar-expression)
num_threads(integer-expression)
default(shared | none)
private(list)
firstprivate(list)
shared(list)
copyin(list)
reduction(reduction-identifier:list)
proc_bind(master | close | spread)
I didn't try it, but I suspect that using the if clause exactly the way you described in your two bullet points, testing whether n exceeds the number of cores on your machine, might just work... Would you care to give it a try and let us know?
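A sketch of that idea (untested; the trip counts are shortened, and the subsums/run names and result-collecting layout are my own): decide once, outside the loops, whether spare cores remain, and feed that flag into the inner region's if clause. Note that omp_set_nested was later deprecated in favor of omp_set_max_active_levels, but it matches the 4.0 standard quoted above:

```cpp
#include <cmath>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Inner loop: runs in parallel only when the caller says cores are spare.
std::vector<double> subsums(int m, bool go_parallel) {
    std::vector<double> out(m, 0.0);
#pragma omp parallel for if(go_parallel)
    for (int j = 0; j < m; ++j) {
        double sum = 0.0;
        for (int k = 0; k < 1000; ++k)
            sum += std::cos(static_cast<double>(k));
        out[j] = sum;
    }
    return out;
}

std::vector<std::vector<double>> run(int n, int m) {
#ifdef _OPENMP
    omp_set_nested(1);                     // permit a second parallel level at all
    bool inner = n < omp_get_num_procs();  // cores left over for level two?
#else
    bool inner = false;                    // serial build: pragmas are no-ops
#endif
    std::vector<std::vector<double>> results(n);
#pragma omp parallel for
    for (int i = 0; i < n; ++i)
        results[i] = subsums(m, inner);    // each slot written by one iteration
    return results;
}
```

When the if expression is false, the inner region still executes, just on a single thread, so the code path stays the same either way.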
Upvotes: 1
Reputation: 34591
Rather than using a pair of nested parallel sections, you can tell OpenMP to "collapse" the nested loops into a single parallel section over the n*m iteration space:
#pragma omp parallel for collapse(2)
for (int i{0}; i < n; ++i) {
    for (int j{0}; j < m; ++j) {
        // ...
    }
}
This will allow it to divide the work appropriately regardless of the relative values of n and m.
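Applied to the question's code, it could look like the following sketch (the inner trip count is shortened, and the sums are collected into a vector instead of printed, so no critical section is needed):

```cpp
#include <cmath>
#include <vector>

// One flat parallel loop over all n*m (i, j) pairs; OpenMP splits the
// combined iteration space across the team, so load balance no longer
// depends on n alone.
std::vector<double> all_sums(int n, int m) {
    std::vector<double> out(n * m, 0.0);
#pragma omp parallel for collapse(2)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 1000; ++k)
                sum += std::cos(static_cast<double>(k));
            out[i * m + j] = sum;   // each (i, j) pair owns one slot
        }
    }
    return out;
}
```

Note that collapse(2) requires the two loops to be perfectly nested, which they are once the work moves inline from subproblem.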
Upvotes: 5
Reputation: 1458
OMP_NUM_THREADS
- Specifies the default number of threads to use in parallel regions. The value of this variable shall be a comma-separated list of positive integers; each value specifies the number of threads to use at the corresponding nesting level. If undefined, one thread per CPU is used. (from here)
omp_get_max_threads
- maximum number of threads that are available to do work (from here)
omp_get_num_threads
- number of threads in the current team (from here)
But AFAIK there is no function that returns the total number of threads currently running across all nesting levels, which is what you are asking for:
I don't want the total number of threads to exceed the number of cores on my machine
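One workaround (my sketch, not part of the original answer): keep your own atomic gauge that every worker bumps while it is busy; a current or peak reading can then drive the decision yourself. The names and the toy workload are illustrative, and the code also compiles and runs (serially) without OpenMP:

```cpp
#include <atomic>

// Global gauge of how many workers are busy right now, plus the peak
// observed so far; OpenMP itself offers no cross-level total.
std::atomic<int> g_active{0};
std::atomic<int> g_peak{0};

void enter() {
    int now = ++g_active;
    int prev = g_peak.load();
    while (now > prev && !g_peak.compare_exchange_weak(prev, now)) {
        // retry until g_peak holds the highest concurrency observed
    }
}

void leave() { --g_active; }

int peak_workers(int n, int m) {
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {
#pragma omp parallel for
        for (int j = 0; j < m; ++j) {
            enter();
            volatile double x = 0.0;   // token work to keep the thread busy
            for (int k = 0; k < 100000; ++k)
                x = x + k;
            leave();
        }
    }
    return g_peak.load();
}
```

A region could then consult g_active before requesting more threads, e.g. as the scalar expression in the if clause discussed in the other answer.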
Also look at this question
Upvotes: 1