PooriR

Reputation: 43

OpenMP parallel for schedule construct giving different answers every few program runs

I am trying to use OpenMP work-sharing constructs. The code below is a simplified example of what is going wrong in my bigger OpenMP code. I assign values to an integer matrix, check the element values, reset them to 0, and repeat this in a 't' loop. The integer 'p' counts the number of times the value assignments (done by the parallel for) fail. If the code were correct, p would always be 0, but it gives different answers on different runs, so the work-sharing construct is failing somewhere. I had to run it around 12 times before I got the first wrong value of p as output (1, 2, 3, etc.).

The barrier directives in the code aren't really necessary; I was getting different values of p without them and thought an explicit barrier would help, but I was wrong. This is the code:

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define NRA 10                 /* number of rows in matrix A */
    #define NCA 10                 /* number of columns in matrix A */

    int main()
    {
        int i, j, ir, p = 0, t; 
        int *a; 
        a = (int*) malloc(sizeof(int)*NRA*NCA);

        omp_set_num_threads(5);

        for(t=0;t<100000;t++)
        {
            #pragma omp barrier
            #pragma omp parallel for schedule (static,2) collapse(2)
            for(i=0;i<NRA;i++)
            {
                for(j=0;j<NCA;j++)
                { 
                    ir=j*NRA+i; 
                    a[ir] = 1; 
                }
            }

            #pragma omp single
            {
                for(i=0;i<NRA;i++)
                {
                    for(j=0;j<NCA;j++)
                    { 
                        ir=j*NRA+i; 
                        if(a[ir] != 1)
                        { 
                            p += 1;
                        } 
                    }
                }
            }

            #pragma omp parallel for schedule (static,2) collapse(2)
            for(i=0;i<NRA;i++)
            {
                for(j=0;j<NCA;j++)
                { 
                    ir=j*NRA+i; 
                    a[ir] = 0; 
                }
            }

        #pragma omp barrier
        }//end t 

        printf("p is %d\n",p);
    }

This is the bigger code. I don't think a race condition is the issue, because I declared all variables used outside the parallel loop as shared and all other variables locally inside the parallel loop. Any suggestions would be helpful!

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define NRA 10                 /* number of rows in matrix A */
    #define NCA 10                 /* number of columns in matrix A */
    #define NCB 10                 /* number of columns in matrix B */

    void matrixcalc (double *ad, double *bd, double *cd, int chunkd);
    void printresults (double *cd, int chunkd);
    void printrep (double *cd, int chunkd);

    int main () 
    {
        int nthreads, chunk, p = 0;
        double *a,*b,*c;   
        a = (double*)malloc(NRA*NCA*sizeof(double)); 
        if(a==NULL) 
            printf("ho\n"); 
        b = (double*)malloc(NCA*NCB*sizeof(double));
        c = (double*)malloc(NRA*NCB*sizeof(double));

        omp_set_num_threads(5);

        chunk = 2;                    /* set loop iteration chunk size */
        int ir3, i1, j1;

        /*** Spawn a parallel region explicitly scoping all variables ***/
        int t, tmax = 100000;
        for(t=0;t<tmax;t++)
        {
            #pragma omp parallel shared(a,b,c,nthreads,chunk,t,tmax) 
            { 
                int tid = omp_get_thread_num(); 
                int i, j, ir;
                if (tid == 0)
                {
                    nthreads = omp_get_num_threads();
                    // printf("Starting matrix multiple example with %d threads\n",nthreads);
                    // printf("Initializing matrices...\n");
                }

                /*** Initialize matrices ***/
                #pragma omp for schedule (static, chunk) collapse(2)
                for (i=0; i<NRA; i++)
                {   
                    for (j=0; j<NCA; j++)
                    { 
                        ir =j*NRA+i; 
                        a[ir]= 1.0; 
                    }
                }
                #pragma omp for schedule (static, chunk) collapse(2)
                for (i=0; i<NCA; i++)
                {   
                    for (j=0; j<NCB; j++)
                    {  
                        ir = j*NCA+i; 
                        b[ir] = 1.0;
                    }
                }
                #pragma omp for schedule (static, chunk) collapse(2)
                for (i=0; i<NRA; i++)
                {    
                    for (j=0; j<NCB; j++)
                    { 
                        ir=j*NRA+i; 
                        c[ir]= 0.0;
                    }
                }
                /*** Do matrix multiply sharing iterations on outer loop ***/
                /*** Display who does which iterations for demonstration purposes ***/

                matrixcalc(a,b,c,chunk);
                if(t!=tmax-1)
                {
                    #pragma omp for schedule (static, chunk) collapse(2)
                    for(i=0;i<NRA;i++)
                    {    
                        for(j=0;j<NCB;j++)
                        {
                            ir=j*NRA+i;
                            c[ir]=0.0;
                        }
                    } 
                }
            }//end parallel region

            for(i1=0;i1<NRA;i1++)
            {
                for(j1=0;j1<NCB;j1++)
                {
                    ir3=j1*NRA+i1; 
                    if(c[ir3]!=12.20000&&c[ir3]!=0.0)
                    {
                        printf("%lf\n",c[ir3]);
                        p+=1;
                    } 
                }
            }

        }//end t
        printf("finalp\t%d\n",p);
        for(i1=0;i1<NRA;i1++)
        {
            for(j1=0;j1<NCB;j1++)
            {
                ir3=j1*NRA+i1;
                printf("%lf\t",c[ir3]);
            }
            printf("\n");
        }
    }

    void matrixcalc (double *a, double *b, double *c, int chunk)
    {
        int i,j,k,ir,ir1,ir2;

        //printf("Thread %d starting matrix multiply...%d\n",tid,chunk);
        double r = 1.0;
        #pragma omp for schedule (static, chunk) collapse(3)
        for (i=0; i<NRA; i++)
        {
            for(j=0; j<NCB; j++)
            {
                for (k=0; k<NCA; k++)
                {
                    ir=j*NRA+i;
                    ir1=k*NRA+i;
                    ir2=j*NCA+k;
                    c[ir] += a[ir1] * b[ir2];
                }
            }
        }
        #pragma omp for schedule (static, chunk) collapse(2)
        for(i=0;i<NRA;i++)
        {
            for(j=0;j<NCB;j++)
            {
                ir=j*NRA+i;
                c[ir]+=r*2.0;
            }
        }
        #pragma omp single
        {
            double h;
            h = 0.1;
            h = 2.0*h;
            for(i=0;i<NRA;i++)
            {
                for(j=0;j<NCB;j++)
                {
                    ir=j*NRA+i;
                    c[ir]+=h;
                }
            }
        }
    }

Upvotes: 0

Views: 202

Answers (1)

Zulan

Reputation: 22660

The issue is a race condition on ir. Since it is defined outside of the loop, it is implicitly shared, so all threads write to the same variable concurrently. You could force it to be private, but it is better to declare variables as locally as possible. That makes reasoning about OpenMP code much easier:

    #pragma omp parallel for schedule (static,2) collapse(2)
    for(int i=0;i<NRA;i++)
    {
        for(int j=0;j<NCA;j++)
        {
            int ir = j*NRA+i;
            a[ir] = 1;
        }
    }
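
If you prefer to keep ir defined in main, here is a minimal sketch of the other option mentioned above: add a data-sharing clause. The iteration variables i and j of the collapsed loops are made private automatically, so only ir needs it.

    /* Sketch of the alternative: ir stays declared in main, but each
       thread gets its own private copy inside the parallel loop. */
    #pragma omp parallel for schedule (static,2) collapse(2) private(ir)
    for(i=0;i<NRA;i++)
    {
        for(j=0;j<NCA;j++)
        {
            ir = j*NRA+i;
            a[ir] = 1;
        }
    }

Declaring the variable locally, as in the first snippet, is still the cleaner choice.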

As commented by Jorge Bellón, there are other issues in your code with respect to redundant barriers and efficiency.
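
One way to address the efficiency point, as a rough sketch of my own rather than the exact change suggested in that comment: fork the thread team once, outside the t loop, and rely on the implicit barriers at the end of the work-sharing constructs instead of explicit barrier directives and a fork/join on every iteration.

    /* Rough sketch (my own, based on the efficiency remark above):
       one parallel region for the whole t loop; the implicit barriers
       of "for" and "single" provide all the synchronization needed. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define NRA 10
    #define NCA 10

    int main(void)
    {
        int p = 0;                               /* shared failure counter */
        int *a = malloc(sizeof(int) * NRA * NCA);

        omp_set_num_threads(5);

        #pragma omp parallel shared(a, p)
        for (int t = 0; t < 100000; t++)
        {
            /* set every element to 1; implicit barrier at the end */
            #pragma omp for schedule(static, 2) collapse(2)
            for (int i = 0; i < NRA; i++)
                for (int j = 0; j < NCA; j++)
                    a[j * NRA + i] = 1;

            /* one thread verifies; implicit barrier at the end */
            #pragma omp single
            {
                for (int i = 0; i < NRA; i++)
                    for (int j = 0; j < NCA; j++)
                        if (a[j * NRA + i] != 1)
                            p += 1;
            }

            /* reset for the next pass; implicit barrier again */
            #pragma omp for schedule(static, 2) collapse(2)
            for (int i = 0; i < NRA; i++)
                for (int j = 0; j < NCA; j++)
                    a[j * NRA + i] = 0;
        }

        printf("p is %d\n", p);
        free(a);
        return 0;
    }

The single construct keeps the check sequential, and its implicit barrier separates the verification from the reset, so p is only ever touched by one thread at a time.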

Upvotes: 2
