Juan Fernandez Sosa

Reputation: 570

OpenMP poor performance

I wrote an OpenMP program that computes the product of m matrices. I want to give each thread a set of rows to process.

This is my code:

double val;
    omp_set_num_threads(4);
    for(i=0;i<m;i++){
        #pragma omp parallel for private(f,c,k)
        for(f=0;f<N;f++){ //each thread works on its 2 assigned rows
            //printf("Thread %d, row %d matrix %d \n",omp_get_thread_num(),f,i);
            for(c=0;c<N;c++){ //each row works with all the columns of the main matrix
                val=0;
                for(k=0;k<N;k++){
                    /*if(k==0){
                        AUX[f*N+c]=RES[f*N+k]*A[i][k*N+c];
                    }*/
                    //else{
                        AUX[f*N+c]=val+RES[f*N+k]*A[i][k*N+c];
                    val=AUX[f*N+c];

                    //}
                }
            }
           for(c=0;c<N;c++){
                RES[f*N+c]=AUX[f*N+c];
            }
        }
    } 

The result is correct, but the sequential version runs faster.

I also wrote a Pthreads version and it performs fine, so I think I made a mistake when parallelizing this one with OpenMP.

Upvotes: 1

Views: 371

Answers (1)

Juan Fernandez Sosa

Reputation: 570

I found a solution! First, I hadn't paid attention to how the data was laid out in the matrices, so I was getting a lot of cache misses. Now the RES matrix is stored by rows and the others by columns.

Also, I made the "val" variable private, and performance improved.
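
For anyone hitting the same problem, here is a minimal sketch of what the two changes might look like together. It is not my exact code: the function name chain_product and its signature are made up for illustration, and it assumes RES is row-major while each A[i] is column-major, so element (k, c) of A[i] sits at A[i][c*N + k]. With that layout both operands are read with stride 1 in the inner loop, and declaring val inside the loop makes it private to each thread without needing a private clause.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiplies res (row-major, n x n) in place by each
 * of the m matrices in a[]. Assumption: every a[i] is stored column-major,
 * i.e. element (k, c) is at a[i][c*n + k]. aux is n*n doubles of scratch. */
void chain_product(double *res, double *aux, double *const *a, int m, int n)
{
    assert(res && aux && a && m >= 0 && n > 0);

    for (int i = 0; i < m; i++) {
        const double *ai = a[i];

        /* Variables declared inside the loop body are private per thread. */
        #pragma omp parallel for schedule(static)
        for (int f = 0; f < n; f++) {
            for (int c = 0; c < n; c++) {
                double val = 0.0; /* per-thread accumulator, no data race */
                for (int k = 0; k < n; k++) {
                    /* Both reads walk memory contiguously (stride 1). */
                    val += res[(size_t)f * n + k] * ai[(size_t)c * n + k];
                }
                aux[(size_t)f * n + c] = val;
            }
            /* Row f of res is only touched by the iteration that owns f,
             * so copying it back here is safe. */
            for (int c = 0; c < n; c++) {
                res[(size_t)f * n + c] = aux[(size_t)f * n + c];
            }
        }
        /* The implicit barrier at the end of the parallel for guarantees
         * res is complete before the next matrix is applied. */
    }
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC). Accumulating in a local val and writing AUX once per element also avoids the repeated stores to AUX inside the k loop that the original code did.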

Upvotes: 1
