jvriesem

Reputation: 1972

Erratic behavior with compiled legacy code using ifort

How I wish I had a minimum working example for this!

I'm doing a bunch of linear algebra using the HSL libraries. I've turned on every debugging flag I can think of.

My supposedly deterministic code rarely runs to completion. Most of the time it aborts with an indexing error, and the error differs between my two machines:

On my workstation (Mac OS 10.7.5 and ifort 12):

forrtl: severe (408): fort: (3): Subscript #1 of the array W has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
libintlc.dylib     0000000103C83E04  Unknown               Unknown  Unknown
libintlc.dylib     0000000103C8259E  Unknown               Unknown  Unknown
libifcore.dylib    00000001031FBDA1  Unknown               Unknown  Unknown
libifcore.dylib    000000010316BA4E  Unknown               Unknown  Unknown
libifcore.dylib    000000010316BFB3  Unknown               Unknown  Unknown

On my laptop (Mac OS 10.10.5 and ifort 16):

forrtl: severe (408): fort: (3): Subscript #1 of the array A has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
libifcore.dylib    000000010ABDCC96  Unknown               Unknown  Unknown
Uniform2DSimplifi  00000001068851EE  _ma48bd_                 1461  ma48d.f
Uniform2DSimplifi  000000010693619C  _solve_sparse_mat         142  solve_sparse_matrix_d.f90
Uniform2DSimplifi  000000010693A7D8  _scale_and_solve_         128  scale_and_solve_sparse_matrix_d.f90
Uniform2DSimplifi  000000010685740D  _calc_simplified_         598  calc_simplified_equations_B.f90
Uniform2DSimplifi  0000000106832176  _MAIN__                   161  uniform_2D_simplified_B.f90
Uniform2DSimplifi  000000010683175E  Unknown               Unknown  Unknown

(You may notice that these are actually two different errors, even though I haven't changed a line of code between them.)

My code runs successfully about 70% of the time with the newer version of ifort on my laptop, but only about 20% of the time with the older version of ifort on my workstation. Oddly, the successful runs often come right after a fresh compilation; after one success, it usually gives that error on every later run. One time it worked, failed the second time, then worked the third time. (Sometimes on my laptop, it works for the first 2-3 runs, but throws an error the fourth time.)

My own code is entirely deterministic: it sets up linear systems and calls routines to solve them. It also calls the HSL routines, which evidently call MKL. I would assume that both HSL and MKL are deterministic -- that is, identical inputs produce identical outputs. (They don't call RAND() or do file I/O....) Still, I'm not sure.

Update:

I looked up line 1461 of ma48d.f:

      CALL MA50BD(NR,NC,NZB,JOB5,A(K+1),IRN(K+1),KEEP(IPTRD+J1),
     +               CNTL5,ICNTL5,IW(J1),IQB,NP,LA-NEWNE-KK,
     +               A(NEWNE+KK+1),IRN(NEWNE+KK+1),KEEP(IPTRL+J1),
     +               KEEP(IPTRU+J1),W,IW(M+1),INFO5,RINFO5)

On my laptop, it's complaining because K has the value -1, so A(K+1) becomes A(0) and triggers the bounds error; in successful runs K is 0. What's bizarre about this is that I'm giving these routines the exact same inputs, and the code appears to be deterministic, so they should execute the exact same lines...yet they don't.

My question:

What could be causing this erratic behavior?

So far, I've thought of the following possibilities:

Upvotes: 1

Views: 325

Answers (2)

wallyk

Reputation: 57804

In my experience, compilers are far more reliable than programmers. That is, I would suspect the program of having a programming error unless it can be proved that bad code was generated.

This kind of error is almost certainly due to using an uninitialized value. Look for a variable that is not explicitly set to some value before it is used. For example:

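program x
implicit none
integer :: i, arr(10)

! i is deliberately never initialized, so its starting value is arbitrary
do while (i <= 10)
   arr(i) = 0
   i = i + 1
enddo

print *, arr
end

Sometimes this code will set all the elements to zero. Other times it won't change a thing. And if i happens to start at zero or below, the first store lands outside the array, which is the same kind of subscript error reported in the question.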
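For comparison, here is a minimal sketch of the same loop with the counter set explicitly. It assumes the intent was to zero all ten elements; the program name x_fixed and the -1 sentinel are only illustrative.

```fortran
program x_fixed
   implicit none
   integer :: i, arr(10)

   arr = -1            ! sentinel, so untouched elements would be visible
   i = 1               ! explicit starting value; the original left i undefined
   do while (i <= 10)
      arr(i) = 0
      i = i + 1
   end do

   print *, arr        ! all ten elements are 0 on every run
end program x_fixed
```

With i defined before the loop, the result no longer depends on whatever happened to be in memory.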


A directly related, but more subtle lack-of-initialization error occurs in this logic:

program y
implicit none
integer :: sum, i, arrA(10), arrB(10)
real :: ave(2)

do i = 1, 10
    arrA(i) = i * 343
    arrB(i) = i * 121
enddo

sum = 0
do i = 1, 10
    sum = sum + arrA(i)
enddo
ave(1) = sum / 10.0

! sum is deliberately not reset here, so it still holds the arrA total
do i = 1, 10
    sum = sum + arrB(i)
enddo
ave(2) = sum / 10.0

print *, 'Averages are', ave
end

No compiler warning will show up for failing to reinitialize sum, though this sort of error is reproducible and deterministic.
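One way to make this mistake harder to commit is to keep the accumulator local to a helper that sets it on every call. The following is only a sketch under that assumption; the names y_fixed, average, and total are hypothetical and not taken from the question's code.

```fortran
program y_fixed
   implicit none
   integer :: i, arrA(10), arrB(10)
   real :: ave(2)

   do i = 1, 10
      arrA(i) = i * 343
      arrB(i) = i * 121
   end do

   ave(1) = average(arrA)
   ave(2) = average(arrB)
   print *, 'Averages are', ave

contains

   real function average(v)
      integer, intent(in) :: v(:)
      integer :: total, k

      ! Assign in an executable statement. Writing "integer :: total = 0"
      ! would imply SAVE and initialize the variable only once.
      total = 0
      do k = 1, size(v)
         total = total + v(k)
      end do
      average = real(total) / real(size(v))
   end function average

end program y_fixed
```

Here the two averages come out as 1886.5 and 665.5, because the running total from the first array can no longer leak into the second.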

Upvotes: 3

pch

Reputation: 72

I cannot add a comment - hence the answer.

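You can also try -ftrapuv, which initializes stack variables to an unusual value. If you are using Intel 15 or higher you can set -init=snan, which initializes SAVEd variables to a signaling NaN. Note that a NaN can only be stored in REAL and COMPLEX variables, so this option will not flag an uninitialized integer index.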
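As a rough sketch, these flags could be tried first on a small, deliberately broken test program before turning them on for the real code. The file name, program, and variable names below are hypothetical. Adding -fpe0 assumes you want the signaling NaN to trap instead of propagating quietly, and which variables each flag covers differs between compiler versions, so check the documentation for yours.

```fortran
! Hypothetical file: uninit_demo.f90
! Possible invocations:
!   ifort -g -traceback -check all -ftrapuv uninit_demo.f90
!   ifort -g -traceback -fpe0 -init=snan uninit_demo.f90
program uninit_demo
   implicit none
   real :: x(3)

   x = [1.0, 2.0, 3.0]
   call rescale(x)
   print *, x

contains

   subroutine rescale(v)
      real, intent(inout) :: v(:)
      real :: factor          ! deliberately never assigned

      v = v * factor          ! result is undefined without the flags
   end subroutine rescale

end program uninit_demo
```

If the flags cause this toy program to stop or print obviously wrong values, you know they are active in your build, and the same options can then be applied to the full project.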

Upvotes: 1
