Noz.i.kor

Reputation: 3747

Floating point exception in OpenMP parallel for

One of the files in my project has a for loop that I tried to parallelize with an OpenMP parallel for. When I ran it, I got a floating point exception. I couldn't reproduce the error in a separate test program, but I could reproduce it in the same file by replacing the loop with a dummy parallel region (the original for loop had some detailed array computations, hence the dummy code):

#pragma omp parallel for
for(i=0; i<8; i++)
{
  puts("hello world");
}

I still got the same error. Here's the gdb output:

    Program received signal SIGFPE, Arithmetic exception.
    [Switching to Thread 0x7ffff4c44710 (LWP 18912)]
    0x0000000000402fd4 in allocate_2D_matrix.omp_fn.0 (.omp_data_i=0x0) at main.c:119
    119     #pragma omp parallel for

By trial and error, I solved the problem by adding a schedule clause to the OpenMP construct:

#pragma omp parallel for schedule(dynamic)
for(i=0; i<8; i++)
{
  puts("hello world");
}

and it worked just fine. I could replicate this entire behaviour on 2 different systems (gcc 4.4.5 on 64-bit Linux Mint and gcc 4.5.0 on 64-bit openSUSE). Would anyone have any ideas as to what might have caused it? I strongly suspect it is related to my program, since I couldn't reproduce the error separately, but I don't know where to look. The problem is solved, of course, but I am curious. If need be, I can post the entire original function where I see this behaviour.

Upvotes: 0

Views: 2007

Answers (2)

Bruno Levy

Reputation: 11

I had the same issue. It seems to happen when unsigned ints are used as loop iteration variables. Here is an example that has the problem, followed by the fix:

/* the following code was generating a FPE: */

unsigned int m = A->m ;
unsigned int i,ij ;
NLCoeff* c = NULL ;
NLRowColumn* Ri = NULL;

#pragma omp parallel for private(i,ij,c,Ri) 
for(i=0; i<m; i++) {
    Ri = &(A->row[i]) ;       
    y[i] = 0 ;
    for(ij=0; ij<Ri->size; ij++) {
        c = &(Ri->coeff[ij]) ;
        y[i] += c->value * x[c->index] ;
    }
}

/* and this one does not: */

int m = (int)(A->m) ;
int i,ij ;
NLCoeff* c = NULL ;
NLRowColumn* Ri = NULL;

#pragma omp parallel for private(i,ij,c,Ri) 
for(i=0; i<m; i++) {
    Ri = &(A->row[i]) ;       
    y[i] = 0 ;
    for(ij=0; ij<(int)(Ri->size); ij++) {
        c = &(Ri->coeff[ij]) ;
        y[i] += c->value * x[c->index] ;
    }
}
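
For anyone who wants to try the signed-index version in isolation, here is a minimal self-contained sketch. The struct layouts (Coeff, Row, SparseMatrix) and the function name spmv_signed are hypothetical stand-ins that I made up for illustration; they are not the real NLCoeff and NLRowColumn definitions. The per-iteration pointers are declared inside the loop body, so each thread gets its own copy without a private clause.

```c
/* Hypothetical minimal types, loosely modeled on the names used above.
   They are assumptions for this sketch, not the real library structs. */
typedef struct { unsigned int index; double value; } Coeff;
typedef struct { unsigned int size; Coeff *coeff; } Row;
typedef struct { unsigned int m; Row *row; } SparseMatrix;

/* Computes y = A * x using signed loop counters, as in the working version. */
void spmv_signed(const SparseMatrix *A, const double *x, double *y)
{
    int m = (int)A->m;   /* cast the unsigned row count once, up front */
    int i;

    #pragma omp parallel for
    for (i = 0; i < m; i++) {
        /* Variables declared inside the loop body are private to each thread. */
        const Row *Ri = &A->row[i];
        double sum = 0.0;
        int ij;

        for (ij = 0; ij < (int)Ri->size; ij++) {
            const Coeff *c = &Ri->coeff[ij];
            sum += c->value * x[c->index];
        }
        y[i] = sum;
    }
}
```

Compiled with gcc -fopenmp the outer loop is parallelized; without the flag the pragma is ignored and the function runs serially with the same result.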

Upvotes: 1

Anycorn

Reputation: 51505

Most likely puts isn't thread-safe. Put it in a critical section and see what happens.
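
Something along these lines would test that guess. It is only a sketch: the function name print_hello_serialized and the printed counter are made up for illustration, and the critical section simply ensures that one thread at a time calls puts.

```c
#include <stdio.h>

/* Hypothetical helper: prints n lines from a parallel loop, with the call to
   puts serialized by a critical section. Returns the number of lines printed. */
int print_hello_serialized(int n)
{
    int printed = 0;   /* shared counter, only touched inside the critical section */
    int i;

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        #pragma omp critical
        {
            puts("hello world");
            printed++;
        }
    }
    return printed;
}
```

If the exception still shows up with the critical section in place, puts is probably not the culprit.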

Upvotes: 1
