Duncan McFarlane

Reputation: 85

reduction with string type in OpenMP

I am using OpenMP to parallelize a for loop like so:

std::string stringType = "somevalue";
#pragma omp parallel for reduction(+ : stringType)
// a for loop here in which every iteration appends a string to stringType

The only way I can think of to do this is to convert to an int representation in some way first and then convert back at the end, but this has obvious overhead. Is there a better way to perform this kind of operation?

Upvotes: 2

Views: 845

Answers (1)

Brice

Reputation: 1580

As mentioned in the comments, a reduction assumes that the operation is associative and commutative: the values may be combined in any order and accumulated through any kind of partial results, and the final result will be the same. String concatenation is associative but not commutative, so the order in which the pieces are combined matters.
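To make the point about ordering concrete, here is a minimal sketch. The helper names `concat_is_associative` and `concat_commutes` are made up for this illustration and are not part of OpenMP or the standard library.

```cpp
#include <string>

// Hypothetical helper: true when (a + b) + c equals a + (b + c).
// For std::string this always holds, so grouping partial results is safe.
bool concat_is_associative(const std::string& a,
                           const std::string& b,
                           const std::string& c) {
    return (a + b) + c == a + (b + c);
}

// Hypothetical helper: true when a + b equals b + a.
// For most inputs this is false, so swapping partial results changes the output.
bool concat_commutes(const std::string& a, const std::string& b) {
    return a + b == b + a;
}
```

With inputs such as "ab" and "cd", the first check holds while the second does not, which is why the partial strings have to be merged in iteration order.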

There is no guarantee that an OpenMP for loop will distribute contiguous iterations to each thread unless the loop schedule explicitly requests that. There is no guarantee either that contiguous blocks will be distributed by increasing thread number (i.e. thread #0 might go through iterations 1000-1999 while thread #1 goes through 0-999). If you need that behavior, then you should define your own schedule.

Something like:

#include <omp.h>
#include <string>

int N=1000;
std::string globalString("initial value");

#pragma omp parallel shared(N,globalString)
{
    std::string localString; //Empty string

    // Set schedule
    int iterTo, iterFrom;
    iterFrom = omp_get_thread_num() * (N / omp_get_num_threads());
    if (omp_get_num_threads() == omp_get_thread_num()+1)
        iterTo =  N;
    else
        iterTo = (1+omp_get_thread_num()) * (N / omp_get_num_threads());

    // Loop - concatenate a number of neighboring values in the right order
    // No #pragma omp for: each thread goes through the loop, but loop
    // boundaries change according to the thread ID
    for (int ii=iterFrom; ii<iterTo ; ii++){
        localString += get_some_string(ii);
    }

    // Dirty trick to concatenate strings from all threads in the right order:
    // one barrier per thread, and only the thread whose turn it is appends
    for (int ii=0;ii<omp_get_num_threads();ii++){
        #pragma omp barrier
        if (ii==omp_get_thread_num())
            globalString += localString;
    }

}

A better way would be to have a shared array of std::string, each thread using one as a local accumulator. At the end, a single thread can run the concatenation part (and avoid the dirty trick and all its overhead-heavy barrier calls).
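A rough sketch of that shared-array variant could look like the following. It is only one possible way to write it: the function name `concat_in_order` and the body of `get_some_string` are placeholders invented for the example, and the `_OPENMP` guards are there only so the code still builds serially when OpenMP is not enabled.

```cpp
#include <string>
#include <utility>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Placeholder for the per-iteration string producer (assumed, not from the question).
std::string get_some_string(int i) { return std::to_string(i) + ","; }

// Hypothetical wrapper: concatenates get_some_string(0..n-1) in iteration order.
std::string concat_in_order(int n) {
    std::vector<std::string> partial;  // shared: one accumulator slot per thread

    #pragma omp parallel
    {
#ifdef _OPENMP
        const int tid = omp_get_thread_num();
        const int nthreads = omp_get_num_threads();
#else
        const int tid = 0;       // serial fallback when OpenMP is disabled
        const int nthreads = 1;
#endif
        // One thread sizes the array; the implicit barrier at the end of
        // "single" makes sure every thread sees it before writing.
        #pragma omp single
        partial.resize(nthreads);

        // Manual schedule: thread tid gets the contiguous block [from, to),
        // and the last thread also takes the remainder.
        const int chunk = n / nthreads;
        const int from = tid * chunk;
        const int to = (tid == nthreads - 1) ? n : from + chunk;

        // Accumulate in a thread-private string, then store it in this thread's slot.
        std::string local;
        for (int i = from; i < to; ++i) {
            local += get_some_string(i);
        }
        partial[tid] = std::move(local);
    }  // implicit barrier: all slots are filled past this point

    // Sequential merge in thread order, which matches iteration order here.
    std::string result;
    for (const std::string& s : partial) {
        result += s;
    }
    return result;
}
```

Calling `concat_in_order(N)` should give the same string as the serial loop regardless of the number of threads, with a single synchronization point at the end of the parallel region instead of one barrier per thread.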

Upvotes: 3
