Reputation: 57615
I'm currently designing a object structure for a game, and the most natural organization in my case became a tree. Being a great fan of smart pointers I use shared_ptr
's exclusively. However, in this case, the children in the tree will need access to it's parent (example -- beings on map need to be able to access map data -- ergo the data of their parents.
The direction of owning is of course that a map owns it's beings, so holds shared pointers to them. To access the map data from within a being we however need a pointer to the parent -- the smart pointer way is to use a reference, ergo a weak_ptr
.
However, I once read that locking a weak_ptr
is a expensive operation -- maybe that's not true anymore -- but considering that the weak_ptr
will be locked very often, I'm concerned that this design is doomed with poor performance.
Hence the question:
What is the performance penalty of locking a weak_ptr? How significant is it?
Upvotes: 34
Views: 14785
Reputation: 16940
Using/dereferencing a shared_ptr is almost like accessing raw ptr, locking a weak_ptr is a perf "heavy" operation compared to regular pointer access, because this code has to be "thread-aware" to work correctly in case if another thread triggers release of the object referenced by the pointer. At minimum, it has to perform some sort of interlocked/atomic operation that by definition is much slower than regular memory access.
As usual, one way to see what's going on is to inspect generated code:
#include <memory>
class Test
{
public:
void test();
};
void callFuncShared(std::shared_ptr<Test>& ptr)
{
if (ptr)
ptr->test();
}
void callFuncWeak(std::weak_ptr<Test>& ptr)
{
if (auto p = ptr.lock())
p->test();
}
void callFuncRaw(Test* ptr)
{
if (ptr)
ptr->test();
}
Accessing through shared_ptr and raw pointer is almost identical. shared_ptr
needs to load the referenced value first requiring one extra load than the raw version.
callFuncShared:
callFuncRaw:
callFuncWeak:
Calling through weak_ptr
produces 10x more code and at best it has to go through locked compare-exchange, which by itself will take more than 10x CPU time than dereferencing raw or shared_ptr:
Only if the shared counter isn't zero, only then it can load the pointer to actual object and use it (by calling the object, or creating a shared_ptr
).
Upvotes: 12
Reputation: 311
For my own project, I was able to improve performance dramatically by adding
#define BOOST_DISABLE_THREADS
before any boost includes.
This avoids the spinlock/mutex overhead of weak_ptr::lock which in my project was
a major bottleneck. As the project is not multithreaded wrt boost, I could do this.
Upvotes: 12
Reputation: 106609
From the Boost 1.42 source code (<boost/shared_ptr/weak_ptr.hpp>
line 155):
shared_ptr<T> lock() const // never throws
{
return shared_ptr<element_type>( *this, boost::detail::sp_nothrow_tag() );
}
ergo, James McNellis's comment is correct; it's the cost of copy-constructing a shared_ptr
.
Upvotes: 20