Malvineous

Reputation: 27340

Finding where a shared_ptr's reference count is incremented

I have some code that leaks memory because of cyclic references among its shared_ptr instances. (A cycle arises when two shared_ptr instances point to objects that each hold an internal shared_ptr to the other. Neither object is ever destroyed, because each is still in use by the other, which causes the leak. In some cases a single object holds a shared_ptr that refers back to itself.)

Running the code through Valgrind helps because it tells me where the memory was originally allocated, but that is not where the cyclic reference originates. I need to find every place where a specific shared pointer (the one Valgrind complains about) has its reference count incremented, since one of those will have to be changed to a weak_ptr to solve the problem.

How can I select a specific shared_ptr and get a list of all source lines where its reference count was incremented?

I'm running under Linux with GCC/GDB and Valgrind, but platform-neutral solutions would be welcomed.

Here is some sample code to demonstrate the problem:

#include <boost/shared_ptr.hpp>

struct Base {
    int i;
};
struct A: public Base {
    int a;
    boost::shared_ptr<Base> ptrInA;
};
struct B: public Base {
    int b;
    boost::shared_ptr<Base> ptrInB;
};

int main(void)
{
    boost::shared_ptr<A> a(new A);   // Line 17
    boost::shared_ptr<B> b(new B);
    a->ptrInA = b;                   // Line 19
    b->ptrInB = a;
    return 0;
}

When run under Valgrind, it says:

HEAP SUMMARY:
    in use at exit: 96 bytes in 4 blocks
  total heap usage: 4 allocs, 0 frees, 96 bytes allocated

96 (24 direct, 72 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
   at 0x4C2A4F0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x40099A: main (test.cpp:17)

LEAK SUMMARY:
   definitely lost: 24 bytes in 1 blocks
   indirectly lost: 72 bytes in 3 blocks

I'm looking for a solution that would point me to lines 19-20 in the source file as possible causes of the cycle, so I can examine the code and make a decision about whether it needs to be changed.

Upvotes: 6

Views: 3081

Answers (3)

Michal Fapso

Reputation: 1322

This builds on @dandan78's approach. Here is a more detailed example using the GDB command line, which sets a watchpoint that triggers whenever the shared_ptr's reference count changes.

main.cpp:

#include <iostream>
#include <memory>

using namespace std;

#define DBG(msg) std::cout << msg << std::endl;

class A {
    public:
        A(int i) {
            mI = i;
            DBG("A() this:"<<this<<" i:"<<mI);
        }
        ~A() {
            DBG("~A() this:"<<this<<" i:"<<mI);
        }
    private:
        int mI = 0;
};

int main() {
    std::shared_ptr<A> p1(new A(0x12345678));
    DBG("p1 use_count:"<<p1.use_count());
    {
        auto p2 = p1;
        DBG("p1 use_count:"<<p1.use_count());
        DBG("p2 use_count:"<<p2.use_count());
        auto p3 = p1;
        DBG("p1 use_count:"<<p1.use_count());
        DBG("p2 use_count:"<<p2.use_count());
        DBG("p3 use_count:"<<p3.use_count());
    }
    DBG("p1 use_count:"<<p1.use_count());
    return 0;
}

Makefile:

CXXFLAGS = -O0 -ggdb

main: main.cpp
    $(CXX) $(CXXFLAGS) -o $@ $<

Program's output:

A() this:0x6c6fb0 i:305419896
p1 use_count:1
p1 use_count:2
p2 use_count:2
p1 use_count:3
p2 use_count:3
p3 use_count:3
p1 use_count:1
~A() this:0x6c6fb0 i:305419896

Compile and run gdb (don't paste the # comments into gdb):

make
gdb main 2>&1 | tee out.log

GDB session:

(gdb) b main.cpp:23   # right after the p1 initialization
(gdb) r
Thread 1 hit Breakpoint 1, main () at main.cpp:23
(gdb) x/2xg &p1
0x62fe00:       0x0000000000fd4a10      0x0000000000fd4a50
# The first pointer points to the target A object, the second points to the reference counter
# Inspect the refcount data:
(gdb) x/4xw 0x0000000000fd4a50
0xfd4a50:       0x00405670      0x00000000      0x00000003      0x00000001
# The third integer is use_count of the shared_ptr, which can be printed by:
(gdb) x/1xw 0x0000000000fd4a50 + 8
0xfd4a58:       0x00000001

# Add a watchpoint for the use_count address
(gdb) watch *(int*)(0x0000000000fd4a50 + 8)
Hardware watchpoint 2: *(int*)(0x0000000000fd4a50 + 8)
# Add commands for the new watchpoint 2:
(gdb) commands 2
bt             # backtrace
c              # continue
end            # end of the handler script

(gdb) c        # Continue the program

Now you can inspect the out.log file and analyze all backtraces where the use_count changed.

The gdb watchpoint can also be added directly, without looking up the address by hand:

watch *(*((int**)(&p1) + 1) + 2)
                   ^--------------- the shared_ptr variable
                         ^--------- +1 pointer to the right (+8 bytes in a 64-bit program)
                              ^---- +2 integers to the right (+8 bytes)

If you compile with optimizations, the shared_ptr variable may be optimized out. In that case, print the address of the shared_ptr object from your code and paste it into your gdb session:

std::cout << "p1:" << (void*)&p1 << std::endl;
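
Before trusting those offsets in a watchpoint, it may be worth checking them from inside the program. The following is only a sketch: peek_use_count is a hypothetical helper, not part of any library, and it assumes libstdc++ on a 64-bit target, where the second pointer-sized word of a std::shared_ptr is the control block pointer and use_count sits 8 bytes into that block. On any other implementation it returns -1 instead of guessing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical helper: reads use_count straight out of the control block,
// following the same path as the GDB expression
//   *(*((int**)(&p1) + 1) + 2)
// ASSUMPTION: libstdc++ layout on a 64-bit target. Returns -1 otherwise.
template <typename T>
int peek_use_count(const std::shared_ptr<T>& p) {
#if defined(__GLIBCXX__) && __SIZEOF_POINTER__ == 8
    static_assert(sizeof(std::shared_ptr<T>) == 2 * sizeof(void*),
                  "unexpected shared_ptr size");
    // words[0]: pointer to the managed object, words[1]: pointer to the control block
    std::uintptr_t words[2];
    std::memcpy(words, static_cast<const void*>(&p), sizeof(words));
    if (words[1] == 0) return -1;  // an empty shared_ptr has no control block
    // Skip the 8-byte vtable pointer at the start of the control block.
    int count = 0;
    std::memcpy(&count, reinterpret_cast<const char*>(words[1]) + 8, sizeof(count));
    return count;
#else
    (void)p;
    return -1;  // layout unknown: do not guess
#endif
}
```

If peek_use_count(p1) agrees with p1.use_count() in your build, the address arithmetic used for the watchpoint should apply as well. If it returns -1 or a different value, the offsets need to be looked up in your standard library's headers first.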

Upvotes: 4

dandan78

Reputation: 13894

While Yochai Timmer's approach is fine for smaller projects, I recently had to work on a rather large codebase that used shared_ptr, the Boost variant, all over the place. This was on Windows and the UI was done on top of MFC, with the main CWinApp-derived application class pointed to by a shared_ptr. And it was never getting destructed, resulting in a whole slew of destructors not getting called and some nasty behavior as a result.

After trying a variety of leak detectors, I solved the problem by getting the debugger to break on the first line where the offending shared_ptr was accessed, then searching the relevant headers until I found exactly where the reference counter was. I then added a memory breakpoint to the address of the reference counter and had the VS debugger break on every increment/decrement until the value of the ref ctr failed to drop back to its 'normal' value.

In my case I knew this shared_ptr should not have a ref count > 2 after the application's initialization was completed. When it hit 3 and never dropped back to 2 again, I knew I'd found the leak. And the memory breakpoint only needed to be hit about 1000 times or so...

Yes, I'm sure there are better ways of tracking down memory leaks involving shared_ptr but if all else fails, there's always the brute force approach of watching the reference counter. The details will, of course, depend on your shared_ptr implementation and how your application is organized.
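
Where a memory breakpoint is not available, a rough platform-neutral variant of the same idea is to sample use_count() at a few checkpoints and see where it first exceeds the value you expect. This is a sketch under assumptions, not what I actually ran: UseCountSample, use_count_log, sample_use_count, SAMPLE_USE_COUNT and first_sample_above are all made-up names. It only narrows the search to the code between two checkpoints; it does not give the exact line of the increment.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical record of one checkpoint: where it was taken and the count seen there.
struct UseCountSample {
    std::string file;
    int line;
    long count;
};

// Hypothetical global log of all samples taken so far.
inline std::vector<UseCountSample>& use_count_log() {
    static std::vector<UseCountSample> log;
    return log;
}

// Works with any pointer type that has use_count(), e.g. std::shared_ptr or boost::shared_ptr.
template <typename Ptr>
void sample_use_count(const Ptr& p, const char* file, int line) {
    use_count_log().push_back({file, line, p.use_count()});
}

// Records the current file and line together with the pointer's use_count.
#define SAMPLE_USE_COUNT(p) sample_use_count((p), __FILE__, __LINE__)

// Returns the index of the first sample whose count exceeds `limit`, or -1 if none does.
inline int first_sample_above(long limit) {
    const std::vector<UseCountSample>& log = use_count_log();
    for (std::size_t i = 0; i < log.size(); ++i) {
        if (log[i].count > limit) return static_cast<int>(i);
    }
    return -1;
}
```

In my case the limit would have been 2: sprinkle SAMPLE_USE_COUNT(app) after each initialization step, then look at the file and line of the sample returned by first_sample_above(2) and of the one just before it.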

Upvotes: 2

Yochai Timmer

Reputation: 49261

You have a design error. And as such, you need to use design debugging tools.

Get a pen and a piece of paper.
Draw a rectangle for each class type. Draw an arrow from every class that holds a shared_ptr to the class it holds.
If the arrows form a cycle, that's your problem.

Now, each arrow, or link, is created somewhere via a shared_ptr assignment.
Look at the suspicious arrows, those that close the cycle, and see if they are released properly.
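
To illustrate what changing the closing arrow means in code, here is a minimal sketch using std::shared_ptr and std::weak_ptr (the Boost equivalents behave the same way). Node and loop_is_released are hypothetical names made up for this example. A weak_ptr probe reports whether the first object was destroyed once the local owners went out of scope.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type standing in for the classes in the diagram.
struct Node {
    std::shared_ptr<Node> strong_peer;  // owning arrow
    std::weak_ptr<Node> weak_peer;      // non-owning arrow
};

// Builds a two-node loop and returns true if the first node was destroyed
// after the local shared_ptrs went out of scope.
inline bool loop_is_released(bool close_with_weak) {
    std::weak_ptr<Node> probe;  // observes `a` without keeping it alive
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        probe = a;
        a->strong_peer = b;       // a owns b
        if (close_with_weak) {
            b->weak_peer = a;     // closing arrow does not own a
        } else {
            b->strong_peer = a;   // closing arrow owns a: a and b keep each other alive
        }
    }
    const bool released = probe.expired();
    // Break any remaining cycle by hand so that this sketch itself does not leak.
    if (std::shared_ptr<Node> survivor = probe.lock()) {
        survivor->strong_peer.reset();
    }
    return released;
}
```

With the closing arrow held as a weak_ptr, loop_is_released(true) reports that the object was destroyed. With a shared_ptr on both arrows, loop_is_released(false) reports that it was still alive. The same weak_ptr probe can be dropped into a unit test for the real classes.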

Upvotes: 0
