Reputation: 981
std::shared_ptr
is guaranteed to be thread-safe. I don't know what mechanism the typical implementations use to ensure this, but surely it must have some overhead. And that overhead would be present even in the case that your application is single-threaded.
Is the above the case? And if so, does that means it violates the principle of "you don't pay for what you don't use", if you aren't using the thread-safety guarantees?
Upvotes: 10
Views: 2297
Reputation: 158539
If we check out cppreference page for std::shared_ptr they state the following in the Implementation notes section:
To satisfy thread safety requirements, the reference counters are typically incremented and decremented using std::atomic::fetch_add with std::memory_order_relaxed.
It is interesting to note an actual implementation, for example the libstdc++ implementation document here says:
For the version of shared_ptr in libstdc++ the compiler and library are fixed, which makes things much simpler: we have an atomic CAS or we don't, see Lock Policy below for details.
The Selecting Lock Policy section says (emphasis mine):
There is a single _Sp_counted_base class, which is a template parameterized on the enum __gnu_cxx::_Lock_policy. The entire family of classes is parameterized on the lock policy, right up to __shared_ptr, __weak_ptr and __enable_shared_from_this. The actual std::shared_ptr class inherits from __shared_ptr with the lock policy parameter selected automatically based on the thread model and platform that libstdc++ is configured for, so that the best available template specialization will be used. This design is necessary because it would not be conforming for shared_ptr to have an extra template parameter, even if it had a default value. The available policies are:
[...]
3._S_Single
This policy uses a non-reentrant add_ref_lock() with no locking. It is used when libstdc++ is built without --enable-threads.
and further says (emphasis mine):
For all three policies, reference count increments and decrements are done via the functions in ext/atomicity.h, which detect if the program is multi-threaded. If only one thread of execution exists in the program then less expensive non-atomic operations are used.
So at least in this implementation you don't pay for what you don't use.
Upvotes: 12
Reputation: 73530
At least in the boost code on i386, boost::shared_ptr
was implemented using an atomic CAS operation. This meant that while it has some overhead, it is quite low. I'd expect any implementation of std::shared_ptr
to be similar.
In tight loops in high performance numerical code I found some speed-ups by switching to raw pointers and being really careful. But for normal code - I wouldn't worry about it.
Upvotes: 4