Petras

Reputation: 183

C++17 for_each parallel

How can I parallelize the following loop? It currently crashes. From searching, most answers suggest the problem is that I am using std::vector, so I tried a fixed-size std::vector, but the application still crashes. What is wrong with the following loop?

    std::vector<int> a(pairsListFlags.size());
    std::generate(a.begin(), a.end(), [n = 0]() mutable { return n++; });

    std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int i) {
        int a = pairsList[i * 2];
        int b = pairsList[i * 2 + 1];
        if (getCollisionOpenNurbs(OBB[a], OBB[b])) {   // Check OBB collision +4-6 ms
            if (FaceFace(P[a], P[b], Pl[a], Pl[b])) {  // Check polygon intersection +20 ms
                pairsListFlags[i] = 1;
            }
        }
    });

Upvotes: 0

Views: 1157

Answers (1)

Jérôme Richard

Reputation: 50338

Your problem is not embarrassingly parallel, so you should not use std::for_each here (at least not without synchronisation mechanisms or atomics, which would be inefficient). Instead, you can perform a reduction. Each index first has to be mapped to 0 or 1 and the results then summed, so the fitting algorithm is std::transform_reduce (std::reduce on its own expects a binary operation, not a unary one). Here is an example:

    // Needs <execution>, <functional>, <numeric> and <vector>.
    std::vector<int> indices(pairsListFlags.size());
    std::iota(indices.begin(), indices.end(), 0);  // 0, 1, ..., n-1
    // Map every pair index to 0 or 1 in parallel, then sum the results.
    int counter = std::transform_reduce(std::execution::par_unseq, indices.begin(), indices.end(), 0, std::plus<>{}, [&](int i) {
        int a = pairsList[i * 2];
        int b = pairsList[i * 2 + 1];
        if (getCollisionOpenNurbs(OBB[a], OBB[b])) {   // Check OBB collision +4-6 ms
            if (FaceFace(P[a], P[b], Pl[a], Pl[b])) {  // Check polygon intersection +20 ms
                pairsListFlags[i] = 1;
                return 1;
            }
        }
        return 0;
    });
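
For anyone who wants to try the reduction in isolation, here is a minimal self-contained sketch using std::transform_reduce, which takes the combining step as a binary operation and the per-index mapping as a unary one. The names `collides` and `flagCollisions` are hypothetical, and `collides` is only a toy stand-in for the OBB and polygon tests in the question. The flags are assumed to be a std::vector<char>. A std::vector<bool> packs its elements into bits, so writing different elements from several threads would be a data race.

```cpp
#include <cassert>
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for the collision tests in the question:
// two ids "collide" when they have the same parity.
inline bool collides(int a, int b) { return (a % 2) == (b % 2); }

// Flags every colliding pair and returns how many pairs were flagged.
// pairsList holds the pairs flattened as [a0, b0, a1, b1, ...].
template <class Policy>
int flagCollisions(Policy&& policy,
                   const std::vector<int>& pairsList,
                   std::vector<char>& pairsListFlags) {
    assert(pairsList.size() == 2 * pairsListFlags.size());

    // One index per pair: 0, 1, ..., n-1.
    std::vector<int> indices(pairsListFlags.size());
    std::iota(indices.begin(), indices.end(), 0);

    // Map each index to 0 or 1, then sum the mapped values.
    return std::transform_reduce(
        std::forward<Policy>(policy), indices.begin(), indices.end(),
        0, std::plus<>{},
        [&](int i) {
            const int a = pairsList[i * 2];
            const int b = pairsList[i * 2 + 1];
            if (collides(a, b)) {
                pairsListFlags[i] = 1;  // each index writes only its own slot
                return 1;
            }
            return 0;
        });
}
```

Calling `flagCollisions(std::execution::par_unseq, pairsList, pairsListFlags)` requests the parallel version, and passing `std::execution::seq` runs the same code sequentially, which is handy for checking the result. With GCC the parallel policies typically also require linking against TBB.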

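Note that you should be careful about false sharing on pairsListFlags: neighbouring flags sit in the same cache line, so threads writing adjacent elements keep invalidating each other's copy of that line. This can slightly reduce the performance of the resulting code, but it has no impact on the result.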
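
If false sharing ever shows up in a profile, one possible mitigation is to pad each flag to its own cache line. The sketch below is an illustration only: `PaddedFlag`, `kCacheLine` and `strideBytes` are hypothetical names, and the 64-byte line size is an assumption that holds on many x86-64 CPUs but is not guaranteed. It also needs C++17 or later so that std::vector allocates the over-aligned type correctly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed cache-line size; 64 bytes is common but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// Hypothetical padded flag: alignas rounds sizeof up to a full cache line,
// so two neighbouring flags never share a line.
struct alignas(kCacheLine) PaddedFlag {
    char value = 0;
};

static_assert(sizeof(PaddedFlag) == kCacheLine, "one flag per cache line");
static_assert(alignof(PaddedFlag) == kCacheLine, "flags start on a line boundary");

// Distance in bytes between the first two flags of a vector.
inline std::size_t strideBytes(const std::vector<PaddedFlag>& flags) {
    assert(flags.size() >= 2);
    const auto first = reinterpret_cast<std::uintptr_t>(&flags[0]);
    const auto second = reinterpret_cast<std::uintptr_t>(&flags[1]);
    return static_cast<std::size_t>(second - first);
}
```

The trade-off is memory: every flag now occupies 64 bytes instead of one. Since each collision test in the question takes milliseconds, the cost of the occasional shared cache line is probably negligible there, so this is likely only worth it when the per-element work is very cheap.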

Upvotes: 2
