Eenoku

Reputation: 2977

MPI debugging with GDB - No symbol "i" in current context

I need to debug my MPI application written in C. I want to attach GDB manually to the running processes, as recommended here (paragraph 6).

The problem is, when I try to print the value of the variable "i", I get this error:

No symbol "i" in current context.

The same happens with set var i=5. When I run info locals, it simply states "No locals."

I compile my code with the command

mpicc -o hello hello.c

and execute it with

mpiexec -n 2 ./hello

I've searched for this problem, but the usual advice is to avoid GCC optimization (-O) options. That doesn't help me, because I don't use any of them here and I'm compiling with mpicc. I've already tried declaring the variable "i" as volatile and running mpicc with -g and -O0, but nothing helps.


GDB output

GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1

Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3778
Reading symbols from /home/martin/Dokumenty/Programovani/mpi_trenink/hello...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpich.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpich.so.10
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /usr/lib/x86_64-linux-gnu/libmpl.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/x86_64-linux-gnu/libmpl.so.1
Reading symbols from /lib/x86_64-linux-gnu/librt.so.1...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/librt-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/librt.so.1
Reading symbols from /usr/lib/libcr.so.0...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libcr.so.0
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libpthread-2.19.so...done.
done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/x86_64-linux-gnu/libpthread.so.0
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libgcc_s.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libdl-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libdl.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libnss_files-2.19.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
0x00007f493e53c9a0 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
81  ../sysdeps/unix/syscall-template.S: No such file or directory.

My code

#include <stdio.h>
#include <mpi.h>

#include <unistd.h> // sleep()

int main(){
    MPI_Init(NULL, NULL);
    
    /* DEBUGGING STOP */

    int i = 0;
    while(i == 0){
        sleep(30);
    }

    int world_size;
    MPI_Comm_size( MPI_COMM_WORLD, &world_size );

    int process_id; // often called 'world_rank'
    MPI_Comm_rank( MPI_COMM_WORLD, &process_id );

    char processor_name[ MPI_MAX_PROCESSOR_NAME ];
    int name_len;
    MPI_Get_processor_name( processor_name, &name_len );

    printf("Hello! - sent from process %d running on processor %s.\n\
        Number of processors is %d.\n\
        Length of proc name is %d.\n\
        ***********************\n",
        process_id, processor_name, world_size, name_len);

    MPI_Finalize();
    return 0;
}

Upvotes: 2

Views: 4721

Answers (2)

Eenoku

Reputation: 2977

I've finally solved this. The point is that I had to select the right stack frame with the up command before trying to print the variable "i" or change its value.


Step-by-step solution

  1. Compile this code with mpicc -o hello hello.c -g -O0. Launch the program with mpiexec -n 2 ./hello.

  2. Find the process ID (PID).

    • I use the command ps -e | grep hello.
    • Another option is pstree -p (the -p flag makes it show PIDs).
    • Finally, the program itself can print its PID, obtained with the POSIX function getpid().

  3. Next, open a new terminal and launch GDB with the command gdb --pid debugged_process_id.

  4. Now, in the debugger, type bt. The output will be similar to this one:

    #0  0x00007f63667e09a0 in __nanosleep_nocancel ()
    at ../sysdeps/unix/syscall-template.S:81
    #1  0x00007f63667e0854 in __sleep (seconds=0)
    at ../sysdeps/unix/sysv/linux/sleep.c:137
    #2  0x00000000004009ec in main () at hello.c:20
    
  5. As we can see, frame #2 points to hello.c, so that is the frame we want to inspect. Type up 2 to move two frames up the stack. The output will be similar to this one:

    #2  0x00000000004009ec in main () at hello.c:20
    warning: Source file is more recent than executable.
    20          sleep(30);
    
  6. Finally, we can print all the local variables of this frame. Type info locals. The output:

    i = 0
    world_size = 0
    process_id = 0
    processor_name = "\000\000\000\000\000\000\000\000 5\026gc\177\000\000\200\306Η\377\177\000\000p\306Η\377\177\000\000.N=\366\000\000\000\000\272\005@\000\000\000\000\000\377\377\377\377\000\000\000\000%0`\236\060\000\000\000\250\361rfc\177\000\000x\n\026gc\177\000\000\320\067`\236\060\000\000\000\377\377\377\177\376\377\377\377\001\000\000\000\000\000\000\000\335\n@\000\000\000\000\000\377\377\377\377\377\377\377\377\000\000\000\000\000\000\000"
    name_len = 1718986550
    
  7. Now we can break out of the waiting loop with set var i=1 and continue debugging.
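
A variation on the waiting loop, offered only as a sketch: if the flag lives at file scope instead of inside main, GDB can resolve it from whichever frame it happens to stop in, and the process can print its own PID so there is no need to search for it. The names debugger_released and wait_for_debugger, and the max_polls parameter, are my own illustrative choices, not part of MPI or GDB.

```c
#include <stdio.h>
#include <unistd.h>   /* getpid(), sleep() */

/* Hypothetical file-scope flag. Because it is global, GDB can look it up
   from any selected frame; volatile forces a fresh read on every check. */
volatile int debugger_released = 0;

/* Hypothetical helper: print the PID to attach to, then poll the flag.
   max_polls <= 0 waits indefinitely; otherwise the wait is abandoned
   after max_polls polls. Returns 1 if the flag was set, 0 on timeout. */
int wait_for_debugger(int max_polls, unsigned int poll_seconds)
{
    fprintf(stderr, "PID %ld is waiting for a debugger\n", (long)getpid());
    fflush(stderr);

    for (int polls = 0; !debugger_released; ++polls) {
        if (max_polls > 0 && polls >= max_polls) {
            return 0;   /* gave up waiting */
        }
        sleep(poll_seconds);
    }
    return 1;           /* released, e.g. from GDB */
}
```

In the program above, a call such as wait_for_debugger(0, 30) placed right after MPI_Init would take the place of the while(i == 0) loop. After attaching, set var debugger_released = 1 should work even while the nanosleep frame is still selected, provided the program was built with -g.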

Upvotes: 4

Hristo Iliev

Reputation: 74475

Most likely, GDB interrupts the process while it is deep inside the implementation of the sleep(3) function. You can check that by first issuing the bt (backtrace) command:

(gdb) bt
#0  0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6
#1  0x00000030e0cac8b0 in sleep () from /lib64/libc.so.6
#2  0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9

i is not present in the frame of nanosleep:

(gdb) info locals
No symbol table info available.

Select the stack frame of the main function by issuing the frame x command (where x is the frame number, 2 in the example shown).

(gdb) f 2
#2  0x0000000000400795 in main (argc=1, argv=0x7fff64ae4688) at sleeper.c:9
9          while(i == 0) { sleep(30); }

i should be there now:

(gdb) info locals
i = 0

You might also need to change the active thread if GDB happens to attach to the wrong one. Many MPI libraries spawn additional threads, e.g. with Intel MPI:

(gdb) info threads
  3 Thread 0x7f8b9fada700 (LWP 39085)  0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
  2 Thread 0x7f8b9f0d9700 (LWP 39087)  0x00000030e0cdf1b3 in poll () from /lib64/libc.so.6
* 1 Thread 0x7f8ba1b51700 (LWP 39066)  0x00000030e0caca3d in nanosleep () from /lib64/libc.so.6

The thread marked with * is the one being examined. If some other thread is active, switch to the main one with the thread 1 command.

Upvotes: 5
