Reputation: 31
I am working on parallelizing the Conjugate gradient method to solve sparse matrices. The CG method calls a subroutine "matrixVectorProduct()" in the algorithm. I am trying to parallelise this subroutine specifically. The following is the code I am working with for SYMMETRIC matrices stored in CSR format.
void matrixVectorProduct(MTX *MAT, double* inVec, double* outVec){
int i,j, ckey;
if((matcode[1] == 'X')&&(matcode[3] == 'S'))
{
//Initialize outVec to zeros
for(int j=0;j<MAT->nrows;j++)
outVec[j] = 0.0;
for(i=0;i<MAT->nrows;i++)
for(ckey=MAT->row_ptr[i];ckey<MAT->row_ptr[i+1];ckey++) {
j = MAT->JA[ckey];
outVec[i] = outVec[i] + MAT->val[ckey] * inVec[j];
}
for(int i=0;i<MAT->nrows;i++)
for(int ckey=MAT->row_ptr[i];ckey<MAT->row_ptr[i+1];ckey++) {
j = MAT->JA[ckey];
if(j!=i)
outVec[j] += MAT->val[ckey] * inVec[i];;
}
}
else
{
fprintf(stderr,"\n Not a symmetric Matrix. CG method not applicable\n");
exit(1);
}
return;}
My code after parallelization is:
void matrixVectorProduct(MTX *MAT, double* inVec, double* outVec){
int i,j, ckey;
if((matcode[1] == 'X')&&(matcode[3] == 'S'))
{
//Initialize outVec to zeros
for(int j=0;j<MAT->nrows;j++)
outVec[j] = 0.0;
#pragma omp parallel
{
#pragma omp for private(ckey,j) schedule(static)
for(i=0;i<MAT->nrows;i++) {
double zi = 0.0;
for(ckey=MAT->row_ptr[i];ckey<MAT->row_ptr[i+1];ckey++) {
j = MAT->JA[ckey];
zi = zi + MAT->val[ckey] * inVec[j];
}
outVec[i] += zi;
}
}
#pragma omp parallel
{
#pragma omp for private(ckey,j) schedule(static)
for(int i=0;i<MAT->nrows;i++)
for(int ckey=MAT->row_ptr[i];ckey<MAT->row_ptr[i+1];ckey++) {
j = MAT->JA[ckey];
if(j!=i)
outVec[j] += MAT->val[ckey] * inVec[i];;
}
}
}
else
{
fprintf(stderr,"\n Not a symmetric Matrix. CG method not applicable\n");
exit(1);
}
return;
}
The first omp pragma loop is working as desired. But there seems to be a problem when i parallelize the second loop similarly. It is not giving the correct output.
Can someone please direct me as to what I am doing wrong in the second pragma loop. I am a newbie to multithreading and open MP.
Thanks.
Upvotes: 3
Views: 508
Reputation: 23537
There is a data race in your second loop since multiple threads can update the same vector element. Use:
if (j != i) {
#pragma omp atomic
outVec[j] += MAT->val[ckey] * inVec[i];
}
which ensures atomicity of updates.
Upvotes: 1