sap

Reputation: 332

Use OpenMP in C++11 to find the maximum of the calculated values

I want to find the maximum of the values calculated inside a for loop and also store the corresponding index (max_calc_value and i_max in the pseudocode below). Is it possible to do some kind of reduction here?

double max_calc_value = -DBL_MAX; // minimum double value
#pragma omp parallel for
for (int i = 20; i < 1000; i++) {
    this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
    if (this_value > max_calc_value){
        max_calc_value = this_value;
        i_max = i;
    }
}

Upvotes: 4

Views: 4361

Answers (3)

Hristo Iliev

Reputation: 74365

The best way to handle it is to define a custom reduction operation as shown in Gilles' answer. If your compiler only supports OpenMP 3.1 or earlier (custom reduction operations were introduced in OpenMP 4.0), then the proper solution is to perform local reduction in each thread and then sequentially combine the local reductions:

double max_calc_value = -DBL_MAX; // minimum double value (DBL_MAX comes from <cfloat>)
int i_max = -1;
#pragma omp parallel
{
    // Per-thread running maximum and its index
    int my_i_max = -1;
    double my_value = -DBL_MAX;

    #pragma omp for
    for (int i = 20; i < 1000; i++) {
        // Declared inside the loop so that each thread has its own copy
        double this_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
        if (this_value > my_value){
            my_value = this_value;
            my_i_max = i;
        }
    }

    // Merge the per-thread results, one thread at a time
    #pragma omp critical
    {
        if (my_value > max_calc_value) {
            max_calc_value = my_value;
            i_max = my_i_max;
        }
    }
}

This minimises the synchronisation overhead of the critical construct, which each thread now enters only once instead of once per iteration, and shows in a simplified way how the reduction clause is actually implemented.
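
For comparison, when only the maximum value is needed and not its index, the built-in max reduction does the same per-thread-then-combine work for you. As far as I know it has been available for C and C++ since OpenMP 3.1. A minimal sketch, where slow_stub is a hypothetical stand-in for my_slow_function and max_value_builtin is just an illustrative name:

```cpp
#include <cfloat>

// Hypothetical stand-in for my_slow_function; peaks at i == 417 with value 1000.
double slow_stub(int i) {
    double x = i - 417.0;
    return 1000.0 - x * x;
}

// Value-only maximum over [begin, end). Each thread works on a private copy
// of best, and OpenMP combines the private copies when the loop ends.
double max_value_builtin(int begin, int end) {
    double best = -DBL_MAX;
    #pragma omp parallel for reduction(max : best)
    for (int i = begin; i < end; i++) {
        double this_value = slow_stub(i);
        if (this_value > best) {
            best = this_value;
        }
    }
    return best;
}
```

The index is lost here, which is why the question needs either the manual pattern above or a custom reduction. With GCC, compile with -fopenmp; without that flag the pragma is ignored and the loop simply runs serially.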

Upvotes: 4

Gilles

Reputation: 9489

If you feel like it, you can define a custom reduction operation and use it in a reduction clause. In your specific example, that might make the code a bit more cumbersome than simply using a critical section. However, it can pay off if the rest of your code also benefits from the same function, not only for the final parallel reduction but also for the local comparisons. In case that applies to you, here is an example of how it works:

#include <iostream>
#include <omp.h>

struct dbl_int {
    double val;
    int idx;
};

const dbl_int& max( const dbl_int& a, const dbl_int& b) {
    return a.val > b.val ? a : b;
}

#pragma omp declare reduction( maxVal: dbl_int: omp_out=max( omp_out, omp_in ) )

int main() {
    dbl_int di = { -100., -1 };
    #pragma omp parallel num_threads( 10 ) reduction( maxVal: di )
    {
        di.val = omp_get_thread_num() % 7;
        di.idx = omp_get_thread_num();
    }
    std::cout << "Upon exit, value=" << di.val << " and index=" << di.idx << std::endl;
    return 0;
}

On my machine this gives:

~/tmp $ g++ -fopenmp myred.cc -o myred
~/tmp $ ./myred
Upon exit, value=6 and index=6
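
Applied to a loop like the one in the question, the same reduction could look like the sketch below. slow_stub is a hypothetical stand-in for my_slow_function, and max_by_val and find_max are illustrative names. The initializer clause is an addition that the example above does not need, since there every thread overwrites its copy: as far as I understand, without it the private copies would not start from { -DBL_MAX, -1 }, so the result could be wrong, for example when every calculated value is negative.

```cpp
#include <cfloat>

struct dbl_int {
    double val;
    int idx;
};

// Hypothetical stand-in for my_slow_function; peaks at i == 417 with value 1000.
double slow_stub(int i) {
    double x = i - 417.0;
    return 1000.0 - x * x;
}

// Combiner: keep whichever pair holds the larger value.
inline dbl_int max_by_val(const dbl_int& a, const dbl_int& b) {
    return a.val > b.val ? a : b;
}

// Each private copy starts as a copy of the original variable (omp_orig),
// which find_max sets to { -DBL_MAX, -1 } before the loop.
#pragma omp declare reduction(maxVal : dbl_int : omp_out = max_by_val(omp_out, omp_in)) \
    initializer(omp_priv = omp_orig)

// Returns the maximum value over [begin, end) together with its index.
dbl_int find_max(int begin, int end) {
    dbl_int best = { -DBL_MAX, -1 };
    #pragma omp parallel for reduction(maxVal : best)
    for (int i = begin; i < end; i++) {
        dbl_int candidate = { slow_stub(i), i };
        best = max_by_val(best, candidate);
    }
    return best;
}
```

Calling find_max(20, 1000) covers the same index range as the question. If several iterations produce exactly the same maximum, the reported index may differ between runs, because the order in which the per-thread results are combined is unspecified.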

Upvotes: 7

trongnp

Reputation: 9

You can try "omp critical":

double max_calc_value = -DBL_MAX; // minimum double value
#pragma omp parallel for
for (int i = 20; i < 1000; i++) 
{
    // Declared inside the loop so that each thread has its own copy
    double a_value = my_slow_function(large_double_vector_array, param1*i, .., param5+i);
#pragma omp critical
    {
        if (a_value > max_calc_value){
            max_calc_value = a_value;
            i_max = i;
        }
    }
}

Upvotes: 0
