Nathan Folly
Nathan Folly

Reputation: 37

inconsistent: process abort or segmentation fault or running fine when calling LAPACK's SGESV subroutine

I am getting strange reactions to my Fortran95 code from my machine and do not know what is going wrong. Here's the situation:

I am trying to get acquainted with LAPACK and wrote a shamefully simple 1-D "FEM" program just to see how to use LAPACK:

program bla
! Solving the easiest of all FE static cases: one-dimensional, axially loaded elastic rod. composed of 2-noded elements



implicit none


integer :: nelem, nnodes, i,j, info
real, parameter :: E=2.1E9, crossec=19.634375E-6, L=1., F=10E3
real :: initelemL
real, allocatable :: A(:,:)
real, allocatable :: b(:), u(:)
integer, allocatable :: ipiv(:)



print *,'Number of elements?'
read *,nelem

nnodes=nelem+1


allocate(A(nnodes,nnodes),u(nnodes),b(nnodes), ipiv(nnodes))
initelemL=L/nelem



A(1,1)=1
do i=2, nnodes
A(1,i)=0
end do

do i=2,nnodes
do j=1,nnodes
A(i,j)=0
end do
A(i,i)=1
A(i,i-1)=-1
end do

b(1)=0   !That's the BC of zero-displacement of the first node
do i=2,nnodes
b(i)=((F/crossec)/E +1)*initelemL
end do



!calling the LAPACK subroutine:


call SGESV(nnodes,nnodes, A, nnodes, ipiv, b, nnodes, info)
print *,info
print *,b

end program bla

I'm on a mac so in order to include LAPACK i compile with: gfortran -fbacktrace -g -Wall -Wextra -framework accelerate bla.f95

without a warning.

When I run the code, the strange things happen:

If I put in 2 as number of elements, I get the answer "b" as expected.

If I put in 5, I am getting a segmentation fault:

Number of elements?
5    

           0

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x10bd3eff6
#1  0x10bd3e593
#2  0x7fff98001f19
#3  0x7fff93087d62
#4  0x7fff93085dd1
#5  0x7fff930847e3
#6  0x7fff93084666
#7  0x7fff93083186
#8  0x7fff9696c63f
#9  0x7fff96969393
#10  0x7fff969693b4
#11  0x7fff9696967a
#12  0x7fff96991bb2
#13  0x7fff969ba80e
#14  0x7fff9699efb4
#15  0x7fff9699f013
#16  0x7fff9698f3b9
#17  0x10bdc7cee
#18  0x10bdc8fd6
#19  0x10bdc9936
#20  0x10bdc0f42
#21  0x10bd36c40
#22  0x10bd36d20
Segmentation fault: 11

If I put in 50, I get the answer but THEN the program fails although there is nothing more to do for it:

Number of elements?
50
           0
   0.00000000       2.48505771E-02   4.97011542E-02   7.45517313E-02   9.94023085E-02  0.124252886      0.149103463      0.173954040      0.198804617      0.223655194      0.248505771      0.273356348      0.298206925      0.323057503      0.347908080      0.372758657      0.397609234      0.422459811      0.447310388      0.472160965      0.497011542      0.521862149      0.546712756      0.571563363      0.596413970      0.621264577      0.646115184      0.670965791      0.695816398      0.720667005      0.745517612      0.770368218      0.795218825      0.820069432      0.844920039      0.869770646      0.894621253      0.919471860      0.944322467      0.969173074      0.994023681       1.01887429       1.04372489       1.06857550       1.09342611       1.11827672       1.14312732       1.16797793       1.19282854       1.21767914       1.24252975    
a.out(1070,0x7fff7aa05300) malloc: *** error for object 0x7fbdd9406028: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0x106c61ff6
#1  0x106c61593
#2  0x7fff98001f19
Abort trap: 6

This is reproducible. But: if I put another 'print' statement somewhere in the code, the numbers (here: 2,5,50) change. I'm probably committing a rookie mistake here but I am currently feeling rather helpless as it works sometimes and I am not sure how to interpret the Backtrace.

My ideas are currently:

  1. Some really stupid mistake in using SGESV
  2. LAPACK library somehow broken
  3. Some hardware issue with my memory.

Has anybody experienced something like this before and could offer any advice on what is going?

Thanks in advance, cheers,

N.F.

Upvotes: 0

Views: 246

Answers (1)

Mike Kinghan
Mike Kinghan

Reputation: 61610

You are defining b as a 1-dimensional real array and allocating it on the heap. You are passing it as the 6th parameter to SGESV.

The documentation of SGESV defines the 6th parameter as a 2-dimensional real array:

\param[in,out] B
\verbatim
          B is REAL array, dimension (LDB,NRHS)
          On entry, the N-by-NRHS matrix of right hand side matrix B.
          On exit, if INFO = 0, the N-by-NRHS solution matrix X.
\endverbatim

SGESV consequently writes into memory locations that it believes to lie in a 2D array addressed by b when in fact they might, luckily, happen to lie in the 1D array you have actually passed, or unluckily in who-knows-what other part of your program's memory layout. So it's vandalizing itself. The damage done will be unpredictable and will vary according to your input parameter, which determines the expected and actual size of the mis-allocated array.

Comparing your code with the documented interface of SGESV, you appear to be confusing the parameters B and IPIV.

Upvotes: 1

Related Questions