Dr.W

Reputation: 71

Both GDB and LLDB failing to reliably execute breakpoint commands in simple C file

As part of a research project, I am trying to write a gdb command file that outputs certain information on every line of code in arbitrary C source files until the program terminates. This seems easily accomplished with a while loop, outputting whatever data I want within the loop, and then calling "next" at the end of the loop. (I know I would want "step" to enter function calls; I'm not concerned about that at the moment.)

However, in addition to the data I output on every line, I also want to execute special commands at certain breakpoints. This seems easily accomplished with "command". However, I'm encountering a problem where the while loop and breakpoint commands won't both work.

Here is the extremely simple C file I'm working with for testing purposes:

int global;

int main() {
  int x;
  x=-1;
  global = 5;
  return(0);
}

I compile it with gcc -g -o simple simple.c. Then I run gdb -x commands.txt. If the contents of commands.txt are the following:

set confirm off

exec-file simple
file simple

set logging file gdb_output.txt
set logging on
set pagination off

#Special commands I want to execute on certain breakpoints
break 5
command
  echo COMMAND 1 ACTIVATED\n
end

break 6
command
  echo COMMAND 2 ACTIVATED\n
end

break 7
command
  echo COMMAND 3 ACTIVATED\n
end

run

next
next
next
continue

quit

...then the contents of gdb_output.txt are the following, as expected:

Breakpoint 1 at 0x4004da: file simple.c, line 5.
Breakpoint 2 at 0x4004e1: file simple.c, line 6.
Breakpoint 3 at 0x4004eb: file simple.c, line 7.

Breakpoint 1, main () at simple.c:5
5     x=-1;
COMMAND 1 ACTIVATED

Breakpoint 2, main () at simple.c:6
6     global = 5;
COMMAND 2 ACTIVATED

Breakpoint 3, main () at simple.c:7
7     return(0);
COMMAND 3 ACTIVATED
8   }
[Inferior 1 (process 29631) exited normally]

However, if I edit the command file to try to execute as a loop, replacing

next
next
next
continue

with

while true
  next
end

but leaving the rest of the script exactly the same, then the commands I specified for the breakpoints on lines 6 and 7 never execute, as evidenced by the contents of gdb_output.txt after running the modified command file:

Breakpoint 1 at 0x4004da: file simple.c, line 5.
Breakpoint 2 at 0x4004e1: file simple.c, line 6.
Breakpoint 3 at 0x4004eb: file simple.c, line 7.

Breakpoint 1, main () at simple.c:5
5     x=-1;
COMMAND 1 ACTIVATED

Breakpoint 2, main () at simple.c:6
6     global = 5;

Breakpoint 3, main () at simple.c:7
7     return(0);
8   }
__libc_start_main (main=0x4004d6 <main()>, argc=1, argv=0x7fffffffe128, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe118) at ../csu/libc-start.c:325
325 ../csu/libc-start.c: No such file or directory.
[Inferior 1 (process 29652) exited normally]
commands.txt:30: Error in sourced command file:
The program is not being run.

I know that the loop in its current form is problematic in that it will just keep calling "next" until the program terminates (so it never reaches "quit" at the bottom of the script), but that doesn't seem like it should stop the breakpoint commands from being run -- yet that is what appears to be happening. (If the breakpoint commands were being executed, I could condition my while loop to terminate once it hit breakpoints set before the C program's exit points.)

Is this a bug in GDB, or am I misunderstanding something? If this construction fundamentally won't work, then is there a way to execute a canned series of GDB commands on every step of a program run until the program terminates, while also executing commands specified at certain breakpoints -- or is this fundamentally impossible with a GDB script?

(My gdb version is 7.11.1 and if it matters, my OS is Linux.)


UPDATE

I decided to give lldb a shot and ran into some more perplexing issues (using the same C file as above, compiled with the same command). Here is my lldb script:

target create --no-dependents --arch x86_64 simple

breakpoint set --file simple.c --line 5
breakpoint command add
  script print "COMMAND 1 ACTIVATED"
DONE

breakpoint set --file simple.c --line 6
breakpoint command add
  script print "COMMAND 2 ACTIVATED"
DONE

breakpoint set --file simple.c --line 7
breakpoint command add
  script print "COMMAND 3 ACTIVATED"
DONE

run

frame variable x
continue

frame variable x
continue

frame variable x
continue

quit

This is exhibiting rather strange behavior. The above version hits the first breakpoint, executes the associated command, then ignores all the following breakpoints. If I comment out just the second breakpoint, its associated command, and the corresponding frame variable x, continue, then breakpoints 1 and 3 both get hit and their corresponding commands are executed. Commenting out only the 1st or 3rd breakpoint and its associated command and frame variable x, continue results in just the first uncommented breakpoint getting hit, and its associated command run. In short, it appears that having breakpoints on two consecutive lines of code causes all breakpoints after the first to be ignored.

Does anyone know what is going on here? Is there a way I can have a breakpoint on every line and have them all get hit? And is this problem in any way related to the gdb issues described above?

Upvotes: 5

Views: 637

Answers (2)

Simon Kissane

Reputation: 5258

For GDB, the answer is in the source code

This is a limitation of GDB's design. I can't find it documented anywhere, but it becomes clear if you study the source code. One comment (in GDB 12.1, breakpoint.c line 4485) is particularly instructive:

        /* In sync mode, when execute_control_command returns
           we're already standing on the next breakpoint.
           Breakpoint commands for that stop were not run, since
           execute_command does not run breakpoint commands --
           only command_line_handler does, but that one is not
           involved in execution of breakpoint commands.  So, we
           can now execute breakpoint commands.  It should be
           noted that making execute_command do bpstat actions is
           not an option -- in this case we'll have recursive
           invocation of bpstat for each breakpoint with a
           command, and can easily blow up GDB stack.  Instead, we
           return true, which will trigger the caller to recall us
           with the new stop_bpstat.  */

In other words, the C++ functions execute_control_command (cli-script.c line 697) and execute_command (top.c line 588) do not run breakpoint commands. Only the command_line_handler function (event-top.c line 757) does. As the name suggests, command_line_handler handles direct user input from the CLI (sourced files included). It calls the command_handler function (event-top.c line 582) to execute the input line. command_handler executes the command and then executes any breakpoint commands:

      execute_command (command, ui->instream == ui->stdin_stream);

      /* Do any commands attached to breakpoint we stopped at.  */
      bpstat_do_actions ();

By contrast, the subcommands executed inside a while loop are fed into execute_control_command_1 (cli-script.c line 512), which in turn calls execute_command. So the while command never calls bpstat_do_actions (breakpoint.c line 4521), and hence no breakpoint commands are executed inside a while loop.
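To make the difference between the two call paths concrete, here is a toy model in Python. It is only a sketch: the class ToyDebugger, the "repeat N <command>" stand-in for a while body, and the single pending slot are all invented for illustration and are not GDB's actual code. Only the function names execute_command, bpstat_do_actions and command_handler are borrowed from the source discussed above.

```python
class ToyDebugger:
    """Toy model of the two call paths described above (not GDB's code)."""

    def __init__(self, breakpoint_lines, start_line):
        self.breakpoints = set(breakpoint_lines)
        self.line = start_line
        self.pending = None  # stands in for GDB's stop_bpstat
        self.ran = []        # lines whose breakpoint commands have run

    def execute_command(self, command):
        """Run one command; this path never runs breakpoint commands."""
        if command == "next":
            self.line += 1
            # Each new stop replaces whatever was pending from the last one.
            self.pending = self.line if self.line in self.breakpoints else None
        elif command.startswith("repeat "):
            # Hypothetical stand-in for a `while` body: "repeat N <command>".
            _, count, body = command.split(" ", 2)
            for _ in range(int(count)):
                self.execute_command(body)  # no bpstat_do_actions in here
        else:
            raise ValueError(f"unknown toy command: {command}")

    def bpstat_do_actions(self):
        """Run the commands attached to the breakpoint we are stopped at."""
        if self.pending is not None:
            self.ran.append(self.pending)
            self.pending = None

    def command_handler(self, command):
        """Top-level input path: the command, then breakpoint commands."""
        self.execute_command(command)
        self.bpstat_do_actions()


# One top-level `next` per line: every breakpoint's commands run.
flat = ToyDebugger({5, 6, 7}, start_line=4)
for _ in range(3):
    flat.command_handler("next")
print(flat.ran)  # prints [5, 6, 7]

# The same three steps inside one compound command: earlier stops are lost.
looped = ToyDebugger({5, 6, 7}, start_line=4)
looped.command_handler("repeat 3 next")
print(looped.ran)  # prints [7]
```

In the toy, only the most recent stop stays pending, so the stops at lines 5 and 6 are overwritten before anything gets a chance to run their commands. That overwrite rule is an assumption of the model, not something confirmed here against GDB.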

Why did they do this?

I don't know why they chose this design; I can only speculate. I believe GDB has worked like this for a very long time. Here are some plausible reasons:

  • They wanted to avoid any possibility of infinite recursion and a stack overflow crash caused by a breakpoint command triggering a breakpoint, which then runs a breakpoint command, which triggers another breakpoint, and so on.
  • In a compound sequence of commands (such as the body of a while loop), running breakpoint commands may do something unexpected that breaks the subsequent commands in that sequence. Delaying breakpoint command execution until after the current command sequence completes may have been viewed as safer.
  • GDB has some history of bugs related to interactions between different command sequences, and delaying breakpoint command execution may have helped avoid some of those bugs or made them easier to fix.

A workaround

We need to find a command that calls bpstat_do_actions and that we can run from within the while loop. Is there one? If you look at the C++ source code behind the Python gdb.execute() function (execute_gdb_command), you will see that it does call bpstat_do_actions (python.c line 679). This suggests replacing the call to next inside your while loop with python gdb.execute("next").

I tried the test case in your question and got the same results as you, except that (for whatever reason) while true gave me the error No symbol "true" in current context, so I had to use while 1 instead.

So then I tried:

while 1
  python gdb.execute("next")
end

And it works as you expect:

Breakpoint 1, main () at simple.c:5
5         x=-1;
COMMAND 1 ACTIVATED

Breakpoint 2, main () at simple.c:6
6         global = 5;
COMMAND 2 ACTIVATED

Breakpoint 3, main () at simple.c:7
7         return(0);
COMMAND 3 ACTIVATED
8       }
__libc_start_call_main (main=main@entry=0x555555555129 <main>, argc=argc@entry=1, argv=argv@entry=0x7fffffffe3b8) at ../sysdeps/nptl/libc_start_call_main.h:74
74      ../sysdeps/nptl/libc_start_call_main.h: No such file or directory.
[Inferior 1 (process 1359017) exited normally]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
gdb.error: The program is not being run.
commands.txt:30: Error in sourced command file:
Error while executing Python code.

Instead of while 1, we want something like while $is_running, but I can't find any convenience variable or function like that defined by default. A solution to that issue is outside the scope of this question, but I provide one in my answer to another question.

Another solution, which is equivalent, doesn't run the next command via Python, but still uses Python to trigger the execution of pending breakpoint commands:

while 1
  next
  python gdb.execute("")
end

It turns out that gdb.execute() still runs pending breakpoint commands even if just given an empty string as the command to run.

You can define a user-defined command run-bp-commands:

define run-bp-commands
  python gdb.execute("")
end
document run-bp-commands
Run pending breakpoint commands now
end

And then the loop becomes:

while 1
  next
  run-bp-commands
end

This version is better in that it hides the hacky detail that python gdb.execute("") runs the pending breakpoint commands.

I think it would be great if GDB got a built-in command like run-bp-commands, because relying on hacky details like this is ugly. On the other hand, given the failure to run breakpoint commands in a while loop is quite possibly intentional (to prevent possible bugs), they might not want to encourage people to do this by providing such a command built-in.
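As for the missing $is_running condition mentioned earlier, one possible way around it is to move the loop itself into Python. The sketch below is not the solution referred to above and has not been checked against a real GDB session. The function name step_until_exit and its parameters are made up for illustration, and it assumes that gdb.selected_inferior().pid is 0 when no process is being debugged. The gdb module is passed in as an argument so the function can be exercised outside GDB with a stand-in object.

```python
def step_until_exit(gdb_api, per_line=None, max_steps=100000):
    """Issue `next` until the inferior is gone; return the number of steps.

    gdb_api  -- the `gdb` module when running inside GDB
    per_line -- optional callable run at each stop before stepping
    """
    steps = 0
    # Assumption: Inferior.pid is 0 when no process is being debugged.
    while gdb_api.selected_inferior().pid != 0 and steps < max_steps:
        if per_line is not None:
            per_line()  # per-line output goes here
        # Per the answer above, gdb.execute() also runs any pending
        # breakpoint commands after the command itself.
        gdb_api.execute("next")
        steps += 1
    return steps


# Inside a GDB command file, after `run`, this could be called as:
#   python
#   import gdb
#   step_until_exit(gdb)
#   end
```

Because the running check happens before each next, the loop should stop cleanly once the program exits instead of ending with "The program is not being run."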

Some final asides

  • An oddity that confused me at first is that the *.c files in the GDB source code actually contain C++. Usually, you would expect .c files to contain plain C, with C++ source files having a C++-specific extension such as .cc, .cpp or .cxx. I assume that at some point they decided to switch from C to C++ but found it easier not to rename all the source files.
  • I'm referring here to the GDB 12.1 source, since that's the version I'm currently using; I haven't upgraded to GDB 13.1 yet. I had a look at the source code of GDB 13.1. It includes a significant number of improvements in breakpoint handling compared to 12.1 (including fixes for various bugs, memory corruption issues among them). It doesn't look like any of those changes affect this behavior, but I can't say for sure since I haven't actually tried running GDB 13.1.
  • Note that I am linking to Beren Minor's unofficial mirror of the GDB source code on GitHub, not the official repository. I find GitHub easier to navigate than the official repo.
  • I have no idea about your problems with LLDB; I rarely use it. Hopefully, knowing the cause of the issue with GDB may give some clue as to the cause of the problems with LLDB, since it may well have copied some of GDB's design choices. And just as with GDB here, studying its source code is likely to reveal the answer.

Upvotes: 0

Dr.W

Reputation: 71

I still haven't figured out why gdb and lldb were acting the way they were, but I did devise an alternative approach to accomplish what I want. I wrote a script to communicate with lldb using two named pipes whereby the script's stdout is linked to lldb's stdin and vice-versa, so the script can send lldb commands (frame variable -L, bt, step, etc.) then get lldb's output and parse it. The script can of course loop all it wants, so this bypasses the problem where I couldn't get gdb or lldb command files to loop properly.
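For anyone wanting to try something similar, here is a minimal sketch of the idea in Python. It is not my actual script: it uses subprocess pipes instead of named pipes, and the class name PipedDebugger, the marker string, and the marker_command parameter are all made up for illustration. The marker is there because output read from a pipe has no obvious end. After each real command the script sends a second command that makes the debugger print a known line, then reads until that line shows up.

```python
import subprocess


class PipedDebugger:
    """Drive a line-oriented debugger through stdin/stdout pipes."""

    def __init__(self, argv, marker_command, marker="__STEP_DONE__"):
        # marker_command: a debugger command that prints `marker` on a line,
        # so we can tell where one command's output ends (hypothetical name).
        self.marker = marker
        self.marker_command = marker_command
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,  # line-buffered on our side of the pipe
        )

    def send(self, command):
        """Send one command and return the output lines it produced."""
        self.proc.stdin.write(command + "\n" + self.marker_command + "\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            # Stop at end of stream or at the line carrying the marker.
            if not line or self.marker in line:
                break
            lines.append(line.rstrip("\n"))
        return lines

    def close(self):
        """Close the debugger's stdin and wait for it to exit."""
        self.proc.stdin.close()
        self.proc.wait()
```

With GDB, something like PipedDebugger(["gdb", "-q", "simple"], marker_command="echo __STEP_DONE__\\n") might work as a starting point, though I have not verified that exact invocation. If the debugger echoes its input or prints a startup banner, the marker matching and the first send() will need adjusting. The calling script can then loop freely, sending step or frame variable and parsing each returned list of lines.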

Upvotes: 1
