user3616359

Reputation: 399

Valgrind+gdb debugging with MPI, error in library?

I am having a problem debugging with gdb and valgrind together. I run valgrind with the --vgdb option and then, in another session, start gdb and connect with the target remote command. However, errors seem to appear right at the start, during MPI initialization. I get warnings like these:

warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_btl_ofud.so": Invalid operation
warning: cannot close "/lib64/libosmcomp.so.3": Invalid operation
warning: cannot close "/lib64/librdmacm.so.1": Invalid operation
warning: cannot close "/lib64/libibverbs.so.1": Invalid operation
warning: cannot close "/lib64/libibumad.so.3": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_btl_openib.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_bfo.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_csum.so": Invalid operation
warning: cannot close "/usr/lib64/openmpi/lib/openmpi/mca_pml_v.so": Invalid operation

Then I get this error:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000007950277 in __libc_writev (fd=7, vector=0x9a40f90, count=3) at ../sysdeps/unix/sysv/linux/writev.c:50
50          result = INLINE_SYSCALL (writev, 3, fd, CHECK_N (vector, count), count);

The problem is that after I enter continue, gdb prints "Continuing.", but the program no longer seems to be executing. Before that, valgrind reported errors in the MPI library (PMPI_Init in /usr/lib64/openmpi/lib/libmpi.so.1.0.6), but I couldn't inspect them with gdb; I would constantly get:

Cannot access memory at address 0x39 
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-1.fc18.x86_64 krb5-libs-1.10.3-17.fc18.x86_64 libcom_err-1.42.5-1.fc18.x86_64 libesmtp-1.0.6-4.fc18.x86_64 libselinux-2.1.12-7.3.fc18.x86_64 openssl-libs-1.0.1e-37.fc18.x86_64 pcre-8.31-5.fc18.x86_64

It seems that there is an error in the MPI library, but since I am not a proficient gdb user, I am not 100% sure. Any suggestions as to what might be wrong?
Thanks in advance!
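
For reference, the setup described above can be sketched roughly as follows. This is only an illustration under assumptions: the program name ./my_mpi_app, the rank count, and the process ID 12345 are placeholders, and the two helper functions are hypothetical wrappers that just print the commands to type in each session. With --vgdb-error=0, each rank should wait for a debugger before it starts running, and each rank has its own PID to attach to.

```shell
#!/bin/sh
# Sketch only: ./my_mpi_app, the rank count and the PID are placeholders.

# Print the command that starts every MPI rank under Valgrind's gdbserver.
# --vgdb-error=0 asks each rank to wait for a debugger before running.
valgrind_launch_cmd() {
    nranks=$1
    prog=$2
    printf 'mpirun -np %s valgrind --vgdb=yes --vgdb-error=0 %s\n' "$nranks" "$prog"
}

# Print the gdb command that attaches to one rank, selected by its PID.
gdb_attach_cmd() {
    prog=$1
    pid=$2
    printf 'gdb -ex "target remote | vgdb --pid=%s" %s\n' "$pid" "$prog"
}

# Session 1: launch the ranks under valgrind.
valgrind_launch_cmd 2 ./my_mpi_app
# Session 2: attach gdb to one of the ranks.
gdb_attach_cmd ./my_mpi_app 12345
```

Valgrind prints the PID of each process in its startup banner (the ==PID== prefix), which is the value to pass to vgdb --pid.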

Upvotes: 0

Views: 503

Answers (1)

Kam

Reputation: 6008

First of all, why are you trying to use gdb and valgrind together? Find your bug using gdb, then find your memory leaks using valgrind after you've fixed your bugs.

Regarding gdb and signals: gdb intercepts all signals before they reach your application.

So if your application should not be receiving signals, you need to figure out why it is receiving one.

However, you can tell gdb not to stop on a given signal (SIGUSR1 in this example), like so:

gdb -p $prodid -x $file

$ cat $file
handle SIGUSR1 nostop
continue
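
A slightly fuller version of the same idea, as a minimal sketch: the file name gdb_signals.cmd is an assumption chosen for illustration, and the extra noprint and pass keywords are additions to the command above. They ask gdb not to announce the signal and to still deliver it to the program.

```shell
#!/bin/sh
# Sketch only: gdb_signals.cmd is a hypothetical file name.
cmdfile=gdb_signals.cmd

# Write a gdb command file; lines starting with '#' are comments in gdb too.
cat > "$cmdfile" <<'EOF'
# Do not stop or print on SIGUSR1, but still deliver it to the program.
handle SIGUSR1 nostop noprint pass
# Resume the program after attaching.
continue
EOF

# Attach to a running process (replace <pid>) and run the commands:
#   gdb -p <pid> -x gdb_signals.cmd
```

Running info signals inside gdb shows the current stop/print/pass setting for every signal, which is a quick way to check that the handle line took effect.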

Upvotes: 1
