Allen Zhang

Reputation: 121

How to debug the error "Program received signal SIGSEGV: Segmentation fault"

I am running a Fortran executable, and I get this error:

 set_nml_output Echo NML values to log file only
 Trying to open namelist log dart_log.nml
 PE 0: initialize_mpi_utilities:  Running with            8  MPI processes.

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Then I tried using gdb to find out more, and it reports:

[New LWP 9883]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Failed to read a valid object file image from memory.
Core was generated by `./filter'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00002af8e021390c in netcdf::nf90_open (
    path=<error reading variable: value requires 57959040 bytes, which is more than max-value-size>, mode=0, 
    ncid=<error reading variable: Cannot access memory at address 0x7ffe439346b0>, 
    chunksize=<error reading variable: Cannot access memory at address 0x0>, 
    cache_size=<error reading variable: Cannot access memory at address 0x7ffe43934530>, 
    cache_nelems=<error reading variable: Cannot access memory at address 0x7ffe43934528>, 
    cache_preemption=<error reading variable: Cannot access memory at address 0x7ffe439345a0>, 
---Type <return> to continue, or q <return> to quit---
    comm=<error reading variable: Cannot access memory at address 0x7ffe439345a8>, 
    info=<error reading variable: Cannot access memory at address 0x7ffe439345b0>, 
    _path=<error reading variable: Cannot access memory at address 0x7ffe439345b8>) at netcdf4_file.f90:39
39  netcdf4_file.f90: No such file or directory.
(gdb) bt
#0  0x00002af8e021390c in netcdf::nf90_open (
    path=<error reading variable: value requires 57959040 bytes, which is more than max-value-size>, mode=0, 
    ncid=<error reading variable: Cannot access memory at address 0x7ffe439346b0>, 
    chunksize=<error reading variable: Cannot access memory at address 0x0>, 
    cache_size=<error reading variable: Cannot access memory at address 0x7ffe43934530>, 
    cache_nelems=<error reading variable: Cannot access memory at address 0x7ffe43934528>, 
    cache_preemption=<error reading variable: Cannot access memory at address 0x7ffe439345a0>, 
    comm=<error reading variable: Cannot access memory at address 0x7ffe439345a8>, 
    info=<error reading variable: Cannot access memory at address 0x7ffe439345b0>, 
    _path=<error reading variable: Cannot access memory at address 0x7ffe439345b8>) at netcdf4_file.f90:39
Backtrace stopped: Cannot access memory at address 0x7ffe43934598

The code around netcdf4_file.f90:39 is as follows:

if (present(cache_size) .or. present(cache_nelems) .or. &
       present(cache_preemption)) then
     ret = nf_get_chunk_cache(size_in, nelems_in, preemption_in)
     if (ret .ne. nf90_noerr) then
        nf90_open = ret
        return
     end if
     if (present(cache_size)) then
        size_out = cache_size     ! line 39
     else
        size_out = size_in
     end if
     if (present(cache_nelems)) then
        nelems_out = cache_nelems
     else
        nelems_out = nelems_in
     end if

Is the netCDF version related to the problem, or does some setting need to be changed?

Can anyone suggest how to fix this problem? I am not quite familiar with these tools. Thanks in advance.

Upvotes: 4

Views: 5275

Answers (1)

chw21

Reputation: 8140

A segmentation fault is really hard to debug, but here are a few things I do:

Compile with debug symbols and run-time checks. The flags are compiler-dependent, but here are the ones for gfortran and Intel Fortran:

gfortran     ifort         effect
------------------------------------------------------
-g           -g            Includes debug symbols in the binary
-O0          -O0           Disables optimisation
-fbacktrace  -traceback    More informative stack trace
-Wall        -warn all     Enable all compile time warnings
-fcheck=all  -check all    Enable run time checks

With a little bit of luck, when your program crashes after being compiled this way, it will be easier to deduce what's going wrong.
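
As a rough sketch, the gfortran column of the table can be combined into a single compile command like the one below. The names demo.f90 and demo are placeholders, not files from the question; substitute your own sources and add whatever include and link flags your build already uses.

```shell
# Debug flags from the gfortran column of the table above.
FFLAGS="-g -O0 -fbacktrace -Wall -fcheck=all"

# Placeholder names (assumptions, not taken from the question).
SRC="demo.f90"
EXE="demo"

if command -v gfortran >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # $FFLAGS is left unquoted on purpose so each flag becomes its own argument.
    gfortran $FFLAGS -o "$EXE" "$SRC" && "./$EXE"
else
    # Compiler or source file not found: only show the command that would run.
    echo "gfortran $FFLAGS -o $EXE $SRC"
fi
```

With ifort, the same pattern should work with the flags from the second column: -g -O0 -traceback -warn all -check all.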

Upvotes: 6
