Umut Tabak

Reputation: 1942

program crashing under gdb but runs fine under valgrind

My program on the command line is crashing with a segmentation fault.

I limited the stack size and ran the program again and generated the core file to investigate the location of the problem. Inspecting the core under gdb did not help much, since the code seems to crash at an external library call: a Fortran library whose C interface I call from C++.

However the same program runs without crashing under Valgrind. I guess I have a memory leak somewhere, but maybe someone else has experienced this before. If I had a memory leak, though, Valgrind should have detected it as well.

Any ideas/help appreciated.

EDIT

After the comments from Employed Russian, I ran 'ulimit -c unlimited' to generate the core file and started gdb with this core file as

gdb ./my_binary core

and I got

GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1...done.

warning: core file may not match specified executable file.
[New LWP 18769]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./symmetric_level1'.
Program terminated with signal 11, Segmentation fault.
#0  _int_malloc (av=0x2af4a23cf720, bytes=4600) at malloc.c:3865
3865    malloc.c: No such file or directory.
(gdb) x/i $pc
=> 0x2af4a2096c2b <_int_malloc+2363>:   mov    %rdx,(%rax,%rdx,1)
(gdb) where
#0  _int_malloc (av=0x2af4a23cf720, bytes=4600) at malloc.c:3865
#1  0x00002af4a2098f95 in __GI___libc_malloc (bytes=4600) at malloc.c:2924
#2  0x00002af49db3459f in for__spec_align_alloc () from /home/utabak/external_libraries/intel/composer_xe_2013.0.079/compiler/lib/intel64/libifcore.so.5
#3  0x00002af49db34411 in for_allocate () from /home/utabak/external_libraries/intel/composer_xe_2013.0.079/compiler/lib/intel64/libifcore.so.5
#4  0x00000000004c21a4 in dmumps_301_ ()
#5  0x0000000000516e6c in dmumps_ ()
#6  0x000000000043cf91 in dmumps_f77_ ()
#7  0x000000000041beb9 in dmumps_c ()
#8  0x0000000000410dac in Solvers::LinearSolverMumps::solve (this=0x3, B=..., X=...) at /home/utabak/thesis/C++_Repos/vibrosys/src/LinearSolverMumps.cc:377
#9  0x000000000040cdb5 in main (argc=3, argv=0xe644f0) at symmetric_level1.cc:253

Update with Valgrind output

*dmumps_c is the function that is being called*

==29798== Conditional jump or move depends on uninitialised value(s)
==29798==    at 0x41BA25: dmumps_c (in /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1)
==29798==    by 0x411066: Solvers::LinearSolverMumps::~LinearSolverMumps() (LinearSolverMumps.cc:420)
==29798==    by 0x40D676: main (symmetric_level1.cc:329)
==29798== 
==29798== Use of uninitialised value of size 8
==29798==    at 0x41BA2C: dmumps_c (in /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1)
==29798==    by 0x411066: Solvers::LinearSolverMumps::~LinearSolverMumps() (LinearSolverMumps.cc:420)
==29798==    by 0x40D676: main (symmetric_level1.cc:329)
==29798== 
==29798== Use of uninitialised value of size 8
==29798==    at 0x41BA35: dmumps_c (in /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1)
==29798==    by 0x411066: Solvers::LinearSolverMumps::~LinearSolverMumps() (LinearSolverMumps.cc:420)
==29798==    by 0x40D676: main (symmetric_level1.cc:329)
==29798== 
==29798== Conditional jump or move depends on uninitialised value(s)
==29798==    at 0x41BA42: dmumps_c (in /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1)
==29798==    by 0x411066: Solvers::LinearSolverMumps::~LinearSolverMumps() (LinearSolverMumps.cc:420)
==29798==    by 0x40D676: main (symmetric_level1.cc:329)
==29798== 
==29798== Use of uninitialised value of size 8
==29798==    at 0x41BA42: dmumps_c (in /home/utabak/thesis/C++/numericTests/symmetric_solver_test/symmetric_level1)
==29798==    by 0x411066: Solvers::LinearSolverMumps::~LinearSolverMumps() (LinearSolverMumps.cc:420)
==29798==    by 0x40D676: main (symmetric_level1.cc:329)
==29798== 
==29798== 
==29798== HEAP SUMMARY:
==29798==     in use at exit: 3,289,886 bytes in 16 blocks
==29798==   total heap usage: 294,609 allocs, 294,593 frees, 864,048,760 bytes allocated
==29798== 
==29798== LEAK SUMMARY:
==29798==    definitely lost: 0 bytes in 0 blocks
==29798==    indirectly lost: 0 bytes in 0 blocks
==29798==      possibly lost: 0 bytes in 0 blocks
==29798==    still reachable: 3,289,886 bytes in 16 blocks
==29798==         suppressed: 0 bytes in 0 blocks
==29798== Reachable blocks (those to which a pointer was found) are not shown.
==29798== To see them, rerun with: --leak-check=full --show-reachable=yes
==29798== 
==29798== For counts of detected and suppressed errors, rerun with: -v
==29798== Use --track-origins=yes to see where uninitialised values come from
==29798== ERROR SUMMARY: 18220 errors from 158 contexts (suppressed: 2 from 2)

Upvotes: 2

Views: 4638

Answers (1)

Employed Russian

Reputation: 213516

I limited the stack size and ran the program again and generated the core file to investigate the location of the problem.

If your new limited stack is too small, you may have introduced an entirely new and unrelated problem.

crash at an external library call

A stack size that is too small could easily do that.

However the same program runs without crashing under Valgrind.

Does Valgrind report any errors? If yes, fix them. If not, note that Valgrind may change the execution environment or timing sufficiently to hide the problem.
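
For what it's worth, "Conditional jump or move depends on uninitialised value(s)" means the code read memory that was never written. One common way to end up there, and only a guess for this program, is handing a C interface a struct whose fields were left uninitialised. A minimal sketch of the usual fix follows; SolverParams and make_params are hypothetical stand-ins, not the real MUMPS types:

```cpp
#include <cstddef>

// Hypothetical stand-in for a C library's parameter struct.
// The field names are illustrative only, not the real MUMPS layout.
struct SolverParams {
    int job;
    int n;
    double* rhs;
};

// Value-initialise the struct so every field starts as zero or null,
// then set only the fields the call actually needs.
SolverParams make_params() {
    SolverParams p{};  // all members zero-initialised
    p.job = -1;        // assumed "initialise" request code, for illustration
    return p;
}
```

With a plain `SolverParams p;` as a local variable, the fields would hold indeterminate values, and any branch the library takes on them would show up in Valgrind exactly like the reports above. Running with --track-origins=yes, as Valgrind itself suggests, tells you where such a value was created.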

I guess I have a memory leak somewhere

Memory leaks rarely cause a program to crash.

Your approach to debugging this problem appears to be all wrong. You seem to be "debugging by coincidence".

Instead of changing "random" parts of the problem, try to understand it. First step: try to get the program to create a core file without changing anything else (except setting ulimit -c unlimited).

Once you have a core, look where the crash is. If you can't understand what you are seeing, then ask again, but provide details (the gdb output of the where and x/i $pc commands would help).
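
If you are unsure which limits a process actually inherits from your shell, a tiny standalone helper built on POSIX getrlimit can report them without touching the program being debugged. This is only a sketch; soft_limit is a name made up for illustration:

```cpp
#include <sys/resource.h>
#include <string>

// Returns the soft limit for `resource` (e.g. RLIMIT_CORE, RLIMIT_STACK)
// as text: a byte count, "unlimited", or "error" if the query fails.
std::string soft_limit(int resource) {
    struct rlimit lim{};
    if (getrlimit(resource, &lim) != 0) {
        return "error";
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(static_cast<unsigned long long>(lim.rlim_cur));
}
```

Calling soft_limit(RLIMIT_CORE) and soft_limit(RLIMIT_STACK) from a small main started in the same shell shows whether core dumps are enabled and whether the stack limit is still the reduced one. The values are in bytes, which may differ from the units your shell's ulimit prints.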

Update

You are crashing inside the malloc implementation. This is an almost sure sign of heap corruption (double free, freeing memory that was not malloc'ed, overflowing a heap allocation, etc.).

Valgrind should catch this bug and point you straight at it.
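
If you want a second check on a buffer you suspect, guard words placed after the usable area can show whether something wrote past its end. The following is a rough sketch, not a replacement for Valgrind; GuardedBuffer and fill are hypothetical names, and fill merely stands in for a library routine that writes into a caller-supplied array:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace {
constexpr std::size_t kGuardWords = 8;                    // extra words after the data
constexpr std::uint64_t kPattern = 0xDEADBEEFCAFEF00DULL; // guard fill pattern
}

// Hypothetical helper: a buffer of n words followed by guard words.
// An overrun of up to kGuardWords lands in the guard area and can be
// detected afterwards instead of silently damaging the heap.
class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t n)
        : size_(n), storage_(n + kGuardWords, kPattern) {}

    // Raw pointer to hand to the callee, as with a C interface.
    std::uint64_t* data() { return storage_.data(); }
    std::size_t size() const { return size_; }

    // True if no guard word has been overwritten.
    bool intact() const {
        for (std::size_t i = size_; i < storage_.size(); ++i) {
            if (storage_[i] != kPattern) return false;
        }
        return true;
    }

private:
    std::size_t size_;
    std::vector<std::uint64_t> storage_;
};

// Stand-in for a routine that writes `count` entries into `out`.
void fill(std::uint64_t* out, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) out[i] = i;
}
```

Calling fill(buf.data(), buf.size()) leaves buf.intact() true, while a count one larger than the size flips it to false. A real overflow into malloc's bookkeeping, by contrast, typically goes unnoticed until a later malloc or free crashes, which is what your backtrace looks like.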

Upvotes: 2
