user76284

Reputation: 1328

Possible race condition in OpenMP parallel for loop

I'm using OpenMP to parallelize a for loop. This program uses the Gurobi C API.

GRBupdatemodel(model);

#pragma omp parallel for
for (int batch = 0; batch < n_batch; ++batch)
{

    int cind[1 + n_max];
    for (int j = 0; j < n_max; ++j)
    {
        cind[1 + j] = n_sum + j;
    }

    double cval[1 + n_max];
    cval[0] = 1;

    GRBmodel* model_copy = GRBcopymodel(model);
    for (int i = 0; i < n_sum; ++i)
    {
        cind[0] = i;
        for (int k = 0; k < n_min; ++k)
        {
            for (int j = 0; j < n_max; ++j) 
            {
                cval[1 + j] = -*((double*) PyArray_GetPtr(array, (long []) {batch, i, j, k}));
            }
            GRBaddconstr(model_copy, 1 + n_max, cind, cval, GRB_LESS_EQUAL, 0, NULL);
        }
    }
    GRBoptimize(model_copy);
    GRBgetdblattrarray(model_copy, GRB_DBL_ATTR_X, 0, n_sum, values[batch]);
    GRBgetdblattrarray(model_copy, GRB_DBL_ATTR_X, n_sum, n_max, max_strategy[batch]);
    GRBgetdblattrarray(model_copy, GRB_DBL_ATTR_PI, 1, n_sum * n_min, (double *) min_strategies[batch]);
    GRBfreemodel(model_copy);
}

Each iteration of this loop writes to a different section of the arrays values, max_strategy, and min_strategies.

Removing #pragma omp parallel for yields correct results. Adding it yields incorrect results, including nan values. Thus I suspect there is a race condition in my loop, but I haven't been able to find it. Does anyone know what might be going wrong? I only see two types of writes in my loop body: writes to the iteration-local arrays cind and cval, and writes to disjoint per-batch slices of values, max_strategy, and min_strategies.

Upvotes: 1

Views: 430

Answers (1)

Max Linke

Reputation: 1735

https://www.gurobi.com/documentation/8.1/refman/py_env2.html

According to the docs, GRBcopymodel does not create fully independent copies: each copy still shares the parent model's Gurobi environment (GRBenv). Models attached to the same environment share state, so concurrent calls on them can read and write the same memory, which is not thread-safe. If you ensure each copy has its own environment, it should work.
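One way to give each thread its own environment is to serialize the base model to disk once, then re-read it inside each iteration with a freshly loaded environment. A sketch (untested, error handling omitted; the filename is a placeholder):

```c
/* Serialize the base model once, before the parallel loop. */
GRBwrite(model, "base_model.lp");

#pragma omp parallel for
for (int batch = 0; batch < n_batch; ++batch)
{
    GRBenv   *env_local   = NULL;
    GRBmodel *model_local = NULL;

    /* Fresh environment per iteration: no Gurobi state is shared
       between threads. */
    GRBloadenv(&env_local, NULL);
    GRBreadmodel(env_local, "base_model.lp", &model_local);

    /* ... add the per-batch constraints and optimize as before ... */

    GRBfreemodel(model_local);
    GRBfreeenv(env_local);
}
```

Loading an environment per iteration has noticeable overhead; if that matters, an alternative is to create one environment per thread up front and reuse it across that thread's iterations.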

I see you are calling this code from Python. If the only reason you use C is multithreaded computation, I would recommend pure Python code with joblib or dask to get the same result. There the race condition would surface immediately, because models sharing an environment cannot be serialized across workers. See this support question

Upvotes: 2

Related Questions