DOSMarter

Reputation: 1523

OpenMP parallel performance

I'm trying to compute a distance matrix in parallel using OpenMP: for each point I calculate the distance to every other point, so the best algorithm I've come up with so far costs O(n^2). However, the running time of my OpenMP version, using 10 threads on an 8-processor machine, is no better than the serial approach. This is my first time using OpenMP, so I wonder whether there is a mistake in my implementation. If there is a mistake in my approach, or a better (faster) approach, please let me know. The following is my code, where "dat" is a vector that contains the data points.

map<int, map<int, double> > dist;   // construct the distance matrix

int c = count(dat.at(0).begin(), dat.at(0).end(), delm) + 1;

#pragma omp parallel for shared(c, dist)
for (int p = 0; p < dat.size(); p++)
{
    for (int j = p + 1; j < dat.size(); j++)
    {
        double ecl = 0;

        string line1 = dat.at(p);
        string line2 = dat.at(j);

        for (int i = 0; i < c; i++)
        {
            double num1 = atof(line1.substr(0, line1.find_first_of(delm)).c_str());
            line1 = line1.substr(line1.find_first_of(delm) + 1);

            double num2 = atof(line2.substr(0, line2.find_first_of(delm)).c_str());
            line2 = line2.substr(line2.find_first_of(delm) + 1);

            ecl += (num1 - num2) * (num1 - num2);
        }

        ecl = sqrt(ecl);

#pragma omp critical
        {
            dist[p][j] = ecl;
            dist[j][p] = ecl;
        }
    }
}

Upvotes: 1

Views: 804

Answers (2)

Jason

Reputation: 304

As already pointed out, a critical section slows things down because only one thread at a time is allowed to execute it. There is no need for a critical section here: provided the matrix is allocated to its full size up front (so no thread has to grow the container while others are writing), each thread writes to mutually exclusive elements, and reading data that is never modified obviously doesn't need protection.

My suspicion as to the slowness of the code comes down to uneven work distribution over the threads. By default I believe OpenMP divides the iterations equally among threads (static scheduling). As an example, consider 8 threads and 8 points:

- thread 0 will get 7 distance calculations
- thread 1 will get 6 distance calculations
- ...
- thread 7 will get 0 distance calculations

Even with more iterations, a similar imbalance still exists. If you need to convince yourself, make a thread-private counter to track how many distance calculations are actually done by each thread.
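For illustration, here is a minimal sketch of such a counter (the work vector is a hypothetical addition, not part of the original code); each thread only increments its own slot, so the counts need no synchronization:

#include <omp.h>     // omp_get_max_threads, omp_get_thread_num
#include <vector>
#include <cstdio>

// ... inside the function that builds the matrix ...
std::vector<long> work(omp_get_max_threads(), 0);   // one counter per thread

#pragma omp parallel for
for (int p = 0; p < (int)dat.size(); p++)
{
    for (int j = p + 1; j < (int)dat.size(); j++)
    {
        // ... existing distance computation ...
        work[omp_get_thread_num()]++;   // each thread touches only its own slot
    }
}

for (size_t t = 0; t < work.size(); ++t)
    printf("thread %zu did %ld distance calculations\n", t, work[t]);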

With work-sharing constructs like parallel for, you can specify various work distribution strategies. In your case it is probably best to go with

#pragma omp parallel for schedule(guided)

With guided scheduling, when a thread requests a chunk of iterations of the for loop, it gets (roughly) the number of iterations not yet handed out to any thread divided by the number of threads. So initially the chunks are big, and later they get smaller. It's a form of automatic load balancing, though mind you there is some (probably small) overhead in handing out iterations to the threads dynamically.

To avoid the first thread getting an unfairly large amount of work, your loop structure should be changed so that lower iterations have fewer calculations, e.g. change the inner for loop to

for (j = 0; j < p; j++)
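Putting the two suggestions together, a minimal sketch of the restructured loop might look like the following (keeping the original string parsing, and assuming dist has been pre-sized, e.g. as a vector of vectors, so no critical section is needed):

#pragma omp parallel for schedule(guided) shared(c, dist)
for (int p = 0; p < (int)dat.size(); p++)
{
    // Lower values of p now do fewer inner iterations, which pairs well with
    // guided scheduling: the large early chunks contain the cheap iterations.
    for (int j = 0; j < p; j++)
    {
        double ecl = 0;
        string line1 = dat.at(p);
        string line2 = dat.at(j);
        for (int i = 0; i < c; i++)
        {
            double num1 = atof(line1.substr(0, line1.find_first_of(delm)).c_str());
            line1 = line1.substr(line1.find_first_of(delm) + 1);
            double num2 = atof(line2.substr(0, line2.find_first_of(delm)).c_str());
            line2 = line2.substr(line2.find_first_of(delm) + 1);
            ecl += (num1 - num2) * (num1 - num2);
        }
        dist[p][j] = dist[j][p] = sqrt(ecl);   // distinct elements per thread, no lock needed
    }
}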

Another thing to consider is that when working with a lot of cores, memory can become the bottleneck. You have 8 processors fighting over probably 2 or maybe 3 channels of DRAM (separate memory sticks on the same channel still compete for bandwidth). The on-chip CPU cache is at best shared between all the processors, so you still have no more cache than the serial version of this program.

Upvotes: 2

ildjarn

Reputation: 62995

#pragma omp critical has the effect of serializing your loop, so getting rid of it should be your first goal. This should be a step in the right direction:

ptrdiff_t const c = count(dat[0].begin(), dat[0].end(), delm) + 1;
vector<vector<double> > dist(dat.size(), vector<double>(dat.size()));

#pragma omp parallel for
for (size_t p = 0; p < dat.size(); ++p)   // a work-shared loop needs a <, <=, >, or >= condition
{
  for (size_t j = p + 1; j != dat.size(); ++j)
  {
    double ecl = 0.0;
    string line1 = dat[p];
    string line2 = dat[j];
    for (ptrdiff_t i = 0; i != c; ++i)
    {
      double const num1 = atof(line1.substr(0, line1.find_first_of(delm)).c_str());
      double const num2 = atof(line2.substr(0, line2.find_first_of(delm)).c_str());

      line1 = line1.substr(line1.find_first_of(delm) + 1);
      line2 = line2.substr(line2.find_first_of(delm) + 1);
      ecl += (num1 - num2) * (num1 - num2);
    }

    ecl = sqrt(ecl);
    dist[p][j] = ecl;
    dist[j][p] = ecl;
  }
}

There are a few other obvious things that could be done to make this faster overall, but fixing your parallelization is the most important thing.
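As one example of such an improvement (a sketch only, not part of the answer above): the strings are currently re-split for every pair, which is O(n²·c) string work; parsing each line into numbers once, before the pairwise loop, leaves only arithmetic inside it. This assumes the same delimiter handling as the original code:

// Parse each point once into a vector of doubles.
vector<vector<double> > pts(dat.size());
for (size_t p = 0; p < dat.size(); ++p)
{
  string line = dat[p];
  for (ptrdiff_t i = 0; i < c; ++i)
  {
    pts[p].push_back(atof(line.substr(0, line.find_first_of(delm)).c_str()));
    line = line.substr(line.find_first_of(delm) + 1);
  }
}

// The pairwise loop is now pure arithmetic, with no string manipulation.
#pragma omp parallel for
for (size_t p = 0; p < dat.size(); ++p)
  for (size_t j = p + 1; j < dat.size(); ++j)
  {
    double ecl = 0.0;
    for (ptrdiff_t i = 0; i < c; ++i)
      ecl += (pts[p][i] - pts[j][i]) * (pts[p][i] - pts[j][i]);
    dist[p][j] = dist[j][p] = sqrt(ecl);
  }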

Upvotes: 2
