Reputation: 774

What could cause this code to segfault?

In the following code, I'm seeing a segfault at the line that says signaling_thread_->Send(this, id, data);, which is being called from the destructor of the PeerConnectionProxy class.

bool PeerConnectionProxy::Send(uint32 id, talk_base::MessageData* data) {
  if (!signaling_thread_)
    return false;
  signaling_thread_->Send(this, id, data);
  return true;
}

Running in gdb, I get the segfault and this stack trace as soon as I do (gdb) step to that line:

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
#1  0xa782eed4 in webrtc::PeerConnectionProxy::Send (this=0xab889e80, id=6, data=0xbfffc1e8)
    at third_party/libjingle/source/talk/app/webrtc/peerconnectionproxy.cc:219
#2  0xa782e91a in ~PeerConnectionProxy (this=0xab889e80, __in_chrg=<value optimised out>)
    at third_party/libjingle/source/talk/app/webrtc/peerconnectionproxy.cc:145

...

Breaking just before that line, I check that, as expected, signaling_thread_ is non-null, as are this and data. I'm quite confused as to what could be causing a segfault there, or making the stack end up at 0x00000000. The code only segfaults on the code path through the destructor. The Send function is called from numerous other places with no problem.

Update 2011-12-08:

Stepping through with stepi and disassembly turned on, I get this:

0xa772eed2  219   signaling_thread_->Send(this, id, data);
   0xa772eea4 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+24>:     8b 45 08   mov    0x8(%ebp),%eax
   0xa772eea7 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+27>:     8b 40 0c   mov    0xc(%eax),%eax
   0xa772eeaa <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+30>:     8b 00  mov    (%eax),%eax
   0xa772eeac <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+32>:     83 c0 40   add    $0x40,%eax
   0xa772eeaf <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+35>:     8b 08  mov    (%eax),%ecx
   0xa772eeb1 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+37>:     8b 45 08   mov    0x8(%ebp),%eax
   0xa772eeb4 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+40>:     8d 70 04   lea    0x4(%eax),%esi
   0xa772eeb7 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+43>:     8b 45 08   mov    0x8(%ebp),%eax
   0xa772eeba <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+46>:     8b 40 0c   mov    0xc(%eax),%eax
   0xa772eebd <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+49>:     8b 55 10   mov    0x10(%ebp),%edx
   0xa772eec0 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+52>:     89 54 24 0c    mov    %edx,0xc(%esp)
   0xa772eec4 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+56>:     8b 55 0c   mov    0xc(%ebp),%edx
   0xa772eec7 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+59>:     89 54 24 08    mov    %edx,0x8(%esp)
   0xa772eecb <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+63>:     89 74 24 04    mov    %esi,0x4(%esp)
   0xa772eecf <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+67>:     89 04 24   mov    %eax,(%esp)
=> 0xa772eed2 <_ZN6webrtc19PeerConnectionProxy4SendEjPN9talk_base11MessageDataE+70>:     ff d1  call   *%ecx

ecx is 0x0, so that's what's making it segfault, but I still don't understand what's going on. The other code for the line doesn't appear to touch ecx, unless I'm just reading it wrong.

Upvotes: 3

Answers (3)

Karl Bielefeldt

Reputation: 49118

%ecx holds the virtual table entry for the Send method, which has been zeroed out for some reason. The most common cause in a destructor is that signaling_thread_ is being deleted before the call to PeerConnectionProxy::Send. Another possibility is a buffer overrun in a previous call to a signaling_thread_ method, which is overwriting the virtual table entry. A third possibility is a buffer overrun in the destructor that's overwriting the signaling_thread_ pointer. If you post the code to your destructor, we might be able to narrow it down.
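To make the disassembly easier to follow, here is a rough sketch of what a virtual call boils down to, written with a hand-rolled table of function pointers. Everything in it (FakeThread, FakeVTable, DispatchSend, kSendSlot) is a hypothetical stand-in rather than a libjingle type, and the slot index assumes the 4-byte pointers of the 32-bit trace in the question. Real compilers are free to lay out vtables differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins, not libjingle types: a hand-rolled "vtable"
// that mimics what a compiler typically generates for a virtual call.
struct FakeThread;
using SendFn = int (*)(FakeThread* self, void* handler, uint32_t id, void* data);

// Byte offset 0x40 with 4-byte pointers corresponds to slot index 16.
constexpr std::size_t kSendSlot = 0x40 / 4;

struct FakeVTable {
    SendFn slots[kSendSlot + 1];
};

struct FakeThread {
    const FakeVTable* vptr;  // first word of the object
};

// A trivial "Send" implementation that just echoes the id.
static int RealSend(FakeThread*, void*, uint32_t id, void*) {
    return static_cast<int>(id);
}

// Mirrors the instruction sequence in the question:
//   mov (%eax),%eax    -> load the vptr from the object
//   add $0x40,%eax     -> move to the Send slot
//   mov (%eax),%ecx    -> load the function pointer
//   call *%ecx         -> jump through it
// Returns -1 instead of crashing when the slot holds a null pointer.
int DispatchSend(FakeThread* t, void* handler, uint32_t id, void* data) {
    assert(t != nullptr);
    SendFn fn = t->vptr->slots[kSendSlot];
    if (fn == nullptr) return -1;  // the real code would jump to address 0 here
    return fn(t, handler, id, data);
}

// A table whose Send slot is populated, as for a live object.
FakeVTable MakeLiveTable() {
    FakeVTable v{};
    v.slots[kSendSlot] = &RealSend;
    return v;
}

// A table whose slots are all zero, as if the memory had been wiped.
FakeVTable MakeZeroedTable() {
    return FakeVTable{};
}
```

With a live table, DispatchSend reaches RealSend. With a zeroed table, the loaded pointer is null, which is the situation call *%ecx hits in the trace.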

Upvotes: 1

Chris Dodd

Reputation: 126418

Most likely cause is that signaling_thread_ is a dangling pointer -- it used to point at something, but that something has been delete'd, leaving a pointer that isn't null, but will likely cause a crash if you try to do anything with it (such as calling the Send method on it).

Since you say this is being called from the destructor, it's quite possible that the delete call occurred earlier in the same destructor...
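As a minimal sketch of the ordering issue, assuming the proxy owns the object it sends to (the question doesn't show whether that is true of signaling_thread_), the destructor below uses the worker first and releases it afterwards. Worker, Proxy and RunProxyLifetime are made-up names for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical types for illustration; these are not the libjingle classes.
struct Worker {
    explicit Worker(std::vector<std::string>* log) : log_(log) {}
    virtual ~Worker() { log_->push_back("worker destroyed"); }
    virtual void Send(int id) { log_->push_back("send " + std::to_string(id)); }
    std::vector<std::string>* log_;
};

class Proxy {
public:
    explicit Proxy(std::vector<std::string>* log) : worker_(new Worker(log)) {}

    ~Proxy() {
        // Use the worker first...
        if (worker_) worker_->Send(6);
        // ...and only then release it. With a raw pointer, doing the
        // delete before the Send would leave a non-null but dangling
        // pointer, and the virtual call would read freed memory.
        worker_.reset();
    }

private:
    std::unique_ptr<Worker> worker_;
};

// Creates and destroys a Proxy, returning the order of events.
std::vector<std::string> RunProxyLifetime() {
    std::vector<std::string> log;
    {
        Proxy p(&log);
    }
    return log;
}
```

The log returned by RunProxyLifetime records the send before the worker's destruction. A null check alone cannot catch the reversed order when a raw pointer is used, because delete does not reset the pointer.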

Upvotes: 3

Aaron McDaid

Reputation: 27153

Whenever I have a segfault that involves destructors, it is often solved by making the destructors virtual. Don't worry if this doesn't make sense to you yet.

When an object is to be destroyed, first the destructor will be called, then an attempt will be made to free the object. Often, the destructor itself is entirely successful, but the attempt to free memory fails due to an obscure issue which I will attempt to describe later. (This answer references the relevant part of the C++ standard.)

Are you sure that the segfault happens during the destructor? Or perhaps it happens immediately after the destructor completes. Can you put in a printf at the end of the relevant destructor, please?

I'm going to assume that the destructor function itself is successful and that the error happens immediately after the destructor, during the attempt to free the memory.

Consider this struct:

struct A {
    int x;
};
A a;

Here, clearly &a == &(a.x)

And B inherits from A:

struct B : public A {
};
B b;

Again, &b == &(b.x)

But if virtual methods are involved, things get tricky.

struct C : public B {
   virtual void foo() {}
};
C c;

Now, &c != &(c.x). This is because the first true entry of c is (compiler-dependent) actually a pointer to a vtable, which lists the location of functions such as foo(). Now imagine the following code:

{
    A * p = new C;
    delete p;
}

The statement delete p thinks it is dealing with an object of type A, but it's actually dealing with an object of type C. Because A has no virtual destructor, this is undefined behavior. In practice the destructor may appear to operate correctly, but the attempt to call free(p) will be wrong because it's not using the correct address. It's like int *p = (int*)malloc(100); free(p+1).

If in doubt, give a class a virtual destructor whenever you might inherit from it and add virtual functions in the subclass.
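A small sketch of that advice, using made-up names (Base, Derived, DeleteThroughBase): once the base class has a virtual destructor, deleting through a base pointer runs the derived destructor and releases the whole object.

```cpp
#include <cassert>

struct Base {
    int x = 0;
    // Virtual, so that delete through a Base* dispatches to the
    // destructor of the most derived type.
    virtual ~Base() {}
};

struct Derived : Base {
    explicit Derived(bool* flag) : destroyed(flag) {}
    ~Derived() override { *destroyed = true; }  // records that it ran
    virtual void foo() {}
    bool* destroyed;
};

// Allocates a Derived, deletes it through a Base*, and reports
// whether the Derived destructor was executed.
bool DeleteThroughBase() {
    bool derived_dtor_ran = false;
    Base* p = new Derived(&derived_dtor_ran);
    delete p;  // well-defined because ~Base() is virtual
    return derived_dtor_ran;
}
```

DeleteThroughBase() returns true here. Without the virtual keyword on ~Base(), the same delete would be undefined behavior.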

Upvotes: 1
