Reputation: 153
I have some code that I want to parallelise using OpenMP. The program demonstrates a function that performs polynomial interpolation. My first thought was to parallelise a for loop inside one of the many functions in the code. The serial version of this loop's function goes like this:
void polint (double xa[], double ya[], int n, double x, double *y, double *dy) {
    int i, m, ns = 1;
    double den, dif, dift, ho, hp, w;
    double *c, *d;

    dif = fabs(x - xa[1]);
    c = vector(1, n);
    d = vector(1, n);
    for (i = 1; i <= n; i++) {
        dift = fabs(x - xa[i]);
        if (dift < dif) {
            ns = i;
            dif = dift;
        }
        c[i] = ya[i];
        d[i] = ya[i];
    }
    *y = ya[ns--];
    for (m = 1; m < n; m++) {
        for (i = 1; i <= n - m; i++) {
            ho = xa[i] - x;
            hp = xa[i + m] - x;
            w = c[i + 1] - d[i];
            den = ho - hp;
            den = w / den;
            d[i] = hp * den;
            c[i] = ho * den;
        }
        *y += (*dy = (2 * ns < (n - m) ? c[ns + 1] : d[ns--]));
    }
    free_vector(d, 1, n);
    free_vector(c, 1, n);
}
And now, using OpenMP, I built this code:
void polint (double xa[], double ya[], int n, double x, double *y, double *dy) {
    int i, m, ns = 1;
    double den, dif, dift, ho, hp, w;
    double *c, *d;
    int nth;

    dif = fabs(x - xa[1]);
    c = vector(1, n);
    d = vector(1, n);
    for (i = 1; i <= n; i++) {
        dift = fabs(x - xa[i]);
        if (dift < dif) {
            ns = i;
            dif = dift;
        }
        c[i] = ya[i];
        d[i] = ya[i];
    }
    #pragma omp parallel
    {
        *y = ya[ns--];
        nth = omp_get_num_threads();
        for (m = 1; m < n; m++) {
            #pragma omp for
            for (i = 1; i <= n - m; i++) {
                #pragma omp critical
                {
                    ho = xa[i] - x;
                    hp = xa[i + m] - x;
                    w = c[i + 1] - d[i];
                    den = ho - hp;
                    den = w / den;
                    d[i] = hp * den;
                    c[i] = ho * den;
                }
            }
            #pragma omp critical
            *y += (*dy = (2 * ns < (n - m) ? c[ns + 1] : d[ns--]));
        }
        free_vector(d, 1, n);
        free_vector(c, 1, n);
    }
}
The program compiles correctly, but the problem comes when I run it. I get the following error message:
How can I solve this? I first thought that the array was being overstepped, so I added the critical regions so that multiple threads wouldn't access the array at the same time. However, this didn't solve the problem. Thanks in advance!
Upvotes: 1
Views: 210
Reputation: 9489
Aside from the obvious freeing of the c and d pointers inside the parallel region (every thread calls free_vector, so the vectors are freed multiple times, which generates the crash you experience), there are plenty of problems in your code. To better point them out, I rewrote your sequential function with more sensible variable declarations and a correct indentation. Here is what it gives me:
void polint( double xa[], double ya[], int n, double x, double *y, double *dy ) {
double *c = vector( 1, n );
double *d = vector( 1, n );
double dif = fabs( x - xa[1] );
int ns = 1;
for ( int i = 1; i <= n; i++ ) {
double dift = fabs( x - xa[i] );
if ( dift < dif ) {
ns = i;
dif = dift;
}
c[i] = ya[i];
d[i] = ya[i];
}
*y = ya[ns--];
for ( int m = 1; m < n; m++ ) {
for ( int i = 1; i <= n - m; i++ ) {
double ho = xa[i] - x;
double hp = xa[i + m] - x;
double den = ( c[i + 1] - d[i] ) / ( ho - hp );
d[i] = hp * den;
c[i] = ho * den;
}
*dy = 2 * ns < ( n - m ) ? c[ns + 1] : d[ns--];
*y += *dy;
}
free_vector( d, 1, n );
free_vector( c, 1, n );
}
Now, reading the code becomes simpler (although it remains quite obfuscated) and a few issues become apparent:

- Do your indices really need to run from 1 to n rather than from 0 to n-1? If so, that's unusual (and error prone). If not, you probably have some problems in your loops (the first i loop notably).
- dy's value is rewritten at every iteration of the m loop. Since it is an output parameter, its final value will only be the one corresponding to the last value of m. It is not an issue per se, but it seems suspicious... Are you sure it is correct?
- You can see from the computation of den and the subsequent update of c that c[i] depends on c[i+1]. And this, from a parallelisation perspective, is a big problem. It means that, the way the code is written right now, the i loop cannot be parallelised. That doesn't close the door on parallelisation altogether, but it means there will be more work to do than just putting in some OpenMP directives. The stripped-down example just below this list shows why.
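To see concretely why that dependency forbids a naive #pragma omp for, here is a stripped-down, hypothetical example (not your code) with the same read-the-right-neighbour-then-overwrite-it pattern:

#include <stdio.h>

int main( void ) {
    double a[6] = { 0, 1, 2, 3, 4, 5 };   /* index 0 unused, as in your code */

    /* Each iteration reads a[i+1] BEFORE iteration i+1 overwrites it,
       so the serial result is a left shift. */
    for ( int i = 1; i <= 4; i++ )
        a[i] = a[i + 1];

    /* Put "#pragma omp parallel for" on the loop above, and a thread running
       iteration i+1 may overwrite a[i+1] before iteration i has read it,
       silently producing a different result: a write-after-read dependency. */
    for ( int i = 1; i <= 5; i++ )
        printf( "%g ", a[i] );
    printf( "\n" );   /* serial output: 2 3 4 5 5 */
    return 0;
}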
Now, aside from these issues, if we take a look at your parallelised version, there are plenty of other problems:

- You didn't declare private the data that should be: at least m, ho, hp, w and den should be declared private. Moreover, I strongly suspect that *y should be handled with a reduction(+) and *dy with lastprivate (although the exact way of doing that properly would need some finer tuning). The sketch just below this list shows what these clauses look like.
- Probably to compensate for the missing private declarations, you enclosed the full body of your parallel loop in a critical region, essentially re-serialising it. Unfortunately, that is both ineffective and wrong since, as we saw, the order of the iterations matters, and a critical region does not preserve it...
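To make the vocabulary concrete, here is a minimal sketch of what those data-sharing clauses look like, on a made-up loop body (since, as we saw, your actual i loop cannot be parallelised as it stands). The local scalars sum and last are hypothetical stand-ins for *y and *dy, because reduction and lastprivate apply to variables, not to dereferenced pointers:

double sum = 0.0, last = 0.0;

#pragma omp parallel for reduction( +: sum ) lastprivate( last )
for ( int i = 1; i <= n; i++ ) {
    double t = xa[i] - x;   /* declared inside the loop body: automatically private */
    sum += t;               /* reduction(+): per-thread partial sums, combined at the end */
    last = t;               /* lastprivate: keeps the value from the last iteration (i == n) */
}
*y  = sum;
*dy = last;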
So in conclusion, I strongly encourage you to do a few things:

- Sort out your sequential code first (the indexing and the handling of dy notably), and declare your variables as late and as locally as possible.
- Remove the dependency between iterations of the i loop on c. That can easily be done by keeping two copies of it (one corresponding to its value at the previous iteration of the m loop, and one corresponding to its current value); the sketch below shows the idea.
- Take care of properly declaring private, reduction, etc. when adding your OpenMP directives. The good news is that if you did as I advised and delayed the declarations as much as possible, you will likely have very few of them to look after, since most declarations will be inside the parallel region, making the corresponding variables automatically private, as they should be.

With that, you should be able to get a correct and (hopefully) effective parallel version of your code.
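For illustration, here is a minimal sketch of what the m and i loops could become once the dependency on c is broken by double buffering. It assumes a second scratch vector c_old of the same size (a name I made up for this sketch), and it is a starting point rather than a tuned implementation:

for ( int m = 1; m < n; m++ ) {
    /* Snapshot c before the sweep: the parallel loop below then only reads
       c_old (never c) and writes c[i] and d[i] at its own index, so the
       iterations of the i loop are independent and can be shared among
       threads while still computing the same values as the serial code. */
    for ( int i = 1; i <= n; i++ )
        c_old[i] = c[i];

    #pragma omp parallel for
    for ( int i = 1; i <= n - m; i++ ) {
        double ho  = xa[i] - x;
        double hp  = xa[i + m] - x;
        double den = ( c_old[i + 1] - d[i] ) / ( ho - hp );
        d[i] = hp * den;
        c[i] = ho * den;
    }

    /* Kept sequential: ns and the accumulation into *y depend on the
       iteration order of the m loop, and keeping this outside the parallel
       loop also sidesteps the reduction/lastprivate question here. */
    *dy = 2 * ns < ( n - m ) ? c[ns + 1] : d[ns--];
    *y += *dy;
}

Note that opening a parallel region at every iteration of the m loop has a cost; once this works, hoisting the parallel region out of the m loop (with an orphaned #pragma omp for inside and #pragma omp single around the sequential parts) would be the next refinement.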
Upvotes: 3