Stephen

Reputation: 3621

OpenMP: Share single-threaded and multi-threaded implementations of the same algorithm

I'm working in a code base where several algorithms are implemented twice: once with a #pragma omp parallel in just the right place, and once without. The functions are named things like AlgorithmMT() and AlgorithmST().

Simplified example:

/// Multi-threaded algorithm
std::vector<double>
AlgorithmMT(int n)
{
    std::vector<double> result(n);
    std::iota(result.begin(), result.end(), 1.0);
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        result[i] = i / result[i];
    }
    return result;
}

/// Single-threaded algorithm
std::vector<double>
AlgorithmST(int n)
{
    std::vector<double> result(n);
    std::iota(result.begin(), result.end(), 1.0);
// NOTE: there is no #pragma here
    for (int i = 0; i < n; ++i) {
        result[i] = i / result[i];
    }
    return result;
}

Assuming that I need to preserve two separate functions (higher level code can't be changed), and that users should be allowed to select between them at run time, how can I get the two functions to share a common implementation?

I realize that the algorithm is a bit nonsensical and could be implemented without a read dependency on result inside the loop. Please just assume this is the required structure of the algorithm. :)

Upvotes: 1

Views: 61

Answers (3)

Zulan

Reputation: 22670

A clean way is to use the if clause of parallel constructs, e.g.:

bool is_parallel = ...;
#pragma omp parallel for if (is_parallel)

The condition is evaluated at run time, and when it is false the construct is, by definition, executed by a team of just one thread.

This runtime distinction isn't exactly the same as omitting the pragma: the compiler may optimize the code differently. While I wouldn't worry too much, you should keep an eye on performance. Just compile the application without -fopenmp and compare it against the build with dynamically disabled parallelism. If there is a discrepancy, you may have to resort to redundant code or help the compiler in some other way. Note that the performance may vary among compilers.
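A minimal sketch of how the two functions from the question could share one body via the if clause (the helper name AlgorithmImpl is hypothetical, not from the question):

```cpp
#include <numeric>
#include <vector>

/// Shared implementation; `parallel` selects multi- or single-threaded execution.
/// (AlgorithmImpl is a hypothetical helper name, not part of the original code.)
static std::vector<double>
AlgorithmImpl(int n, bool parallel)
{
    std::vector<double> result(n);
    std::iota(result.begin(), result.end(), 1.0);  // result = {1.0, 2.0, ..., n}
#pragma omp parallel for if (parallel)
    for (int i = 0; i < n; ++i) {
        result[i] = i / result[i];                 // i / (i + 1.0)
    }
    return result;
}

/// Multi-threaded algorithm
std::vector<double> AlgorithmMT(int n) { return AlgorithmImpl(n, true); }

/// Single-threaded algorithm
std::vector<double> AlgorithmST(int n) { return AlgorithmImpl(n, false); }
```

Both entry points keep their original signatures, so the higher-level code is untouched; compile with -fopenmp (GCC/Clang) to enable the parallel path, and the pragma is simply ignored otherwise.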

Upvotes: 4

shinjin

Reputation: 3027

You can use omp_set_num_threads from the OpenMP runtime API to limit the thread count to one before entering your parallel region, then restore the previous value afterwards.

Warning: if another thread is already executing a parallel region, omp_set_num_threads will affect the parallel regions started there as well.
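A sketch of that save-and-restore pattern applied to the question's algorithm, guarded with #ifdef _OPENMP so it also builds without OpenMP (the wrapper name AlgorithmWithLimit is hypothetical):

```cpp
#include <numeric>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

/// Hypothetical wrapper: forces one thread for the region when `parallel`
/// is false, then restores the previous thread count.
std::vector<double>
AlgorithmWithLimit(int n, bool parallel)
{
#ifdef _OPENMP
    const int saved = omp_get_max_threads();  // remember current setting
    if (!parallel)
        omp_set_num_threads(1);               // limit upcoming regions to one thread
#endif
    std::vector<double> result(n);
    std::iota(result.begin(), result.end(), 1.0);
#pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        result[i] = i / result[i];
    }
#ifdef _OPENMP
    omp_set_num_threads(saved);               // restore for subsequent regions
#endif
    return result;
}
```

The restore step matters because the thread-count setting persists beyond this function and would otherwise leak into later parallel regions.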

Upvotes: 1

Arthur Woimbée

Reputation: 114

Since #pragma is handled by the preprocessor, you sadly cannot do a whole lot. You could create an algorithm.cpp.part file containing your function, with %parallel% in place of #pragma omp parallel for, and then, at compile time, substitute the text with something like this in a makefile:

sed '/%parallel%/c\#pragma omp parallel for'  algorithm.cpp.part > algorithm_mt.cpp
sed '/%parallel%/c\ '  algorithm.cpp.part > algorithm_st.cpp

If you have lots of functions like this it could scale relatively well with a good makefile rule.

Or, if you are compiling for Windows, you could use the Concurrency Runtime; it avoids #pragma entirely, which could be useful in your situation.

(don't be too harsh with this answer, I'm writing this on my phone)

Upvotes: 0
