c4757p

Reputation: 1808

What could make GDB refuse to break?

I'm at a loss here. I'm writing a compiler in C (as a hobby), compiling with GCC 4.6.1 on amd64 Linux 2.6.32 and debugging with GDB 7.3. Flags are "-Wall -Wextra -O0 -g", in addition to the usual -I and whatnot. I have a function whose purpose is to report a parse error, defined as follows:

void cerror_at (struct lex *lex, struct token *tok, const char *fmt, ...)

Other than being variadic, nothing weird. The problem is that GDB will NOT break at it. I've tried every way I can think of (breakpoint at the function, inside the function, before it's called, you name it), but as soon as my program is inside the function, I get messages like "warning: Error removing breakpoint 0" and GDB just lets the program finish. There's nothing wrong with it any more (I've since fixed the bug I was trying to find, and everything runs as it should), but I can't get into the function. Any ideas on what could cause this?

Edit: More information! GDB is setting the breakpoint at 0x403057. The function starts at 0x403025. Look at this part of the disassembly:

0x0000000000403053 <+46>:   test   %al,%al
0x0000000000403055 <+48>:   je     0x403077 <cerror_at+82>

At this point, it skips ahead to 0x403077 (past the breakpoint). I've verified that placing the breakpoint at an address before the "je" works, as well as at an address at or after 0x403077, the target of the jump, but not in between (where GDB is trying to place it). Why would GDB place the breakpoint in the middle of the function? Even GDB tells me that the function's address is, in fact, 0x403025.
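
For reference, here is a minimal sketch of a function with the same shape. The definitions of struct lex and struct token are placeholders (the real ones are not shown), and the cerror_last buffer is a hypothetical addition so the message can be inspected afterwards. One possible reading of the disassembly above: under the System V AMD64 calling convention, the caller of a variadic function passes in %al an upper bound on the number of vector registers used, and GCC of this era emits a prologue that tests %al and jumps over the code saving the %xmm registers when it is zero. GDB normally places a function breakpoint just after what it believes is the prologue, so a breakpoint that lands inside that skipped region would never be hit. This is an interpretation, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real struct lex and struct token are not shown. */
struct lex   { const char *filename; };
struct token { int line; };

/* Hypothetical: holds the last formatted message so it can be inspected later. */
char cerror_last[256];

/* Report a parse error at the given token, printf-style. */
void cerror_at(struct lex *lex, struct token *tok, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(lex != NULL && tok != NULL && fmt != NULL);

    /* Location prefix, e.g. "parse.c:12: error: ". */
    n = snprintf(cerror_last, sizeof cerror_last, "%s:%d: error: ",
                 lex->filename, tok->line);
    if (n < 0)
        n = 0;                              /* encoding error: drop the prefix */
    else if ((size_t)n >= sizeof cerror_last)
        n = (int)sizeof cerror_last - 1;    /* prefix was truncated */

    /* Append the caller's message after the prefix. */
    va_start(ap, fmt);
    vsnprintf(cerror_last + n, sizeof cerror_last - (size_t)n, fmt, ap);
    va_end(ap);

    fprintf(stderr, "%s\n", cerror_last);
}
```

A call such as cerror_at(&lx, &tk, "unexpected '%c'", ';') passes only integer arguments, so %al would be zero and the jump over the register-save code would be taken. Passing a double would set %al to a nonzero value and execute the code in between.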

Upvotes: 5

Views: 5170

Answers (4)

Nemo

Reputation: 71515

Based on your latest edit, I bet this is a compiler bug. Specifically, I bet there is a problem with the generation of DWARF debugging information, which is what GDB uses to map between addresses in the object code and files/procedures/line numbers in the source.

You might try experimenting with -gdwarf-4, -gdwarf-3, or even -gdwarf-2 to see if it makes any difference.

If you can reduce this to a simple test case, I suspect the gcc developers would accept it as a bug.

Of course, this could also be a bug in GDB. But given this behavior, I think that is less likely.

Upvotes: 0

Kevin

Reputation: 4727

As mentioned in the comments, you should try to understand what machine code the compiler generated.

In GDB, you can see the actual function code with disassemble cerror_at, which should give something like the following (x86_64; this example uses a function named dive):

(gdb) disassemble dive
Dump of assembler code for function dive:
   0x0000000000400504 <+0>:     push   %rbp
   0x0000000000400505 <+1>:     mov    %rsp,%rbp
   0x0000000000400508 <+4>:     sub    $0x10,%rsp
   0x000000000040050c <+8>:     mov    %edi,-0x4(%rbp)
   0x000000000040050f <+11>:    mov    -0x4(%rbp),%eax
   0x0000000000400512 <+14>:    add    $0x1,%eax
   0x0000000000400515 <+17>:    mov    %eax,%edi
   0x0000000000400517 <+19>:    callq  0x400504 <dive>
   0x000000000040051c <+24>:    leaveq
   0x000000000040051d <+25>:    retq
End of assembler dump.

Then check that the subroutine is actually called: stop execution a few instructions before your function call, then:

   (gdb) x/5i $pc
=> 0x400539 <main+27>:  mov    $0x1,%esi
   0x40053e <main+32>:  mov    %rax,%rdi
   0x400541 <main+35>:  callq  0x400408 <fwrite@plt>
   0x400546 <main+40>:  mov    $0x1,%edi
-->0x40054b <main+45>:  callq  0x400504 <dive> <------

The output will look slightly different on another CPU architecture, but you should be able to see a branch instruction pointing to the address of your function.

--

You can also run maint info breakpoints just before the point where you see "Error removing breakpoint 0" to find out where breakpoint 0 was set; that might help you understand what's wrong with it.

Upvotes: 0

Employed Russian

Reputation: 213375

This sounds like a bug in GDB. In particular, Error removing breakpoint 0 is very suspicious (it's a breakpoint GDB automatically inserted somewhere; user-inserted breakpoints have positive numbers).

You should probably try to create a reduced test case and file a bug report with the GDB developers.

Upvotes: 1

Jared

Reputation: 1897

Maybe I'm just dense, but the most common reasons I've come across for a debugger refusing to break are the simplest ones.

  1. The code that I'm trying to debug doesn't exactly match the code in the debugger.
  2. I forgot to compile that library with debugging options.

Be sure to check both of these any time you get a debugging error like this, even if you think neither is the problem.

Upvotes: 3
