Reputation: 7015
I'm implementing a custom iterator for a non-STL container type and came across the following behaviour, which at this stage seems a bit unexpected to me.
It seems that defining an "empty" dtor results in a significant performance hit. Why?
To try to get to the bottom of this I implemented a simple iterator for std::vector, so that I could compare performance directly with the standard STL iterator. For the sake of a fair test I simply copied a simplified implementation from "vector.hpp" and experimented with adding an extra "empty" dtor:
template <typename _Myvec>
class my_slow_iterator // not inheriting from anything!!
{
public :
typename _Myvec::pointer _ptr; // pointer to vector element (typename is required for the dependent type)
/* All of the standard stuff - essentially from "vector.hpp" */
/* An additional empty dtor */
~my_slow_iterator () {}
};
I then modified std::vector so that I could make it return my new iterator type and used the following to benchmark - sort a vector of 2000000 random integers, averaged over three runs:
std::vector<int> vec;
// fill via rand();
int tt = clock();
std::sort(vec.begin(), vec.end());
tt = clock() - tt; // elapsed time in ms
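For anyone wanting to reproduce the timing, here is a minimal self-contained sketch of the harness above. The helper names make_random_vector and time_sort_ms are hypothetical (they are not in my original test), and it uses the stock std::vector iterator rather than my modified one. On MSVC, CLOCKS_PER_SEC is 1000, so the raw clock() difference is already in ms; scaling by CLOCKS_PER_SEC just makes that explicit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <ctime>
#include <vector>

// Hypothetical helper: build a vector of n pseudo-random integers.
std::vector<int> make_random_vector(std::size_t n, unsigned seed)
{
    std::srand(seed);
    std::vector<int> vec(n);
    for (std::size_t i = 0; i < n; ++i)
        vec[i] = std::rand();
    return vec;
}

// Hypothetical helper: sort the vector in place and return the time
// reported by clock(), converted to milliseconds.
double time_sort_ms(std::vector<int>& vec)
{
    const std::clock_t start = std::clock();
    std::sort(vec.begin(), vec.end());
    const std::clock_t stop = std::clock();
    return 1000.0 * static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_sort_ms(vec) on a fresh make_random_vector(2000000, seed) three times and averaging the results corresponds to the measurement described above.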
I obtained the following results (VS2010, Release build, _ITERATOR_DEBUG_LEVEL 0 etc):
my_slow_iterator with the empty dtor removed: 560 ms.
my_slow_iterator with the empty dtor included: 900 ms.
It appears that the empty dtor adds roughly 340 ms in this case, i.e. the run time increases by about 60% (equivalently, removing the dtor saves approximately 40%).
Obviously, if the dtor is empty then why have it, but I was expecting that simple "empty" functions like this would be inlined and optimised away at compile time. If this isn't the case then I'd like to understand what's going on in case this type of issue has ramifications in more complex cases.
EDIT: compiled with O2 optimisations.
EDIT: digging a bit further, it seems that a similar effect occurs with the copy ctor. Originally (and in the tests above) my_slow_iterator has no copy-ctor defined, so it uses the compiler-generated default.
If I define the following copy-ctor (which doesn't do any more than what I expect the compiler generated one would do):
my_slow_iterator (
const my_slow_iterator<_Myvec> &_src
) : _ptr(_src._ptr) {}
I see the following results for the same test as above:
my_slow_iterator, dtor removed, copy-ctor included: 690 ms.
my_slow_iterator, dtor included, copy-ctor included: 980 ms.
This is a further (although not as drastic) performance hit.
Why/how are the compiler-generated default functions so much more efficient? Do user-defined ctors/dtors implicitly do something in the background?
Upvotes: 5
Views: 1135
Reputation: 249333
I recall experiencing something similar with GCC (-O3) on Linux. The code for a user-defined destructor, despite being empty and in the header file, was emitted, whereas the compiler-generated default destructor yielded no instructions. This perplexed me, and ultimately I made the code work with no explicit destructor (though at the cost of no longer being able to add an assert() in it, which is why the empty one was desirable: it wasn't empty in debug builds).
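One way to keep a debug-only check without declaring a destructor in release builds is to guard the destructor with NDEBUG. This is only a sketch: checked_iterator, is_consistent, with_empty_dtor and without_dtor are hypothetical names made up for illustration. The static_asserts show the one language-level difference I am sure of, namely that a user-provided destructor, even an empty one, makes a type non-trivially destructible. Whether that accounts for the timings in the question is an assumption, not something I have verified.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical iterator-like wrapper; all names here are illustrative.
template <typename T>
class checked_iterator
{
public:
    explicit checked_iterator(T* ptr = nullptr) : _ptr(ptr) {}
    T& operator*() const { return *_ptr; }

#ifndef NDEBUG
    // Debug builds only: user-provided destructor carrying a check.
    // In release builds (NDEBUG defined) no destructor is declared,
    // so the compiler-generated one is used.
    ~checked_iterator() { assert(is_consistent()); }
#endif

private:
    bool is_consistent() const { return true; }  // placeholder check
    T* _ptr;
};

// Two otherwise identical types, differing only in the empty destructor.
struct with_empty_dtor { int* p; ~with_empty_dtor() {} };
struct without_dtor    { int* p; };

static_assert(!std::is_trivially_destructible<with_empty_dtor>::value,
              "a user-provided destructor is never trivial");
static_assert(std::is_trivially_destructible<without_dtor>::value,
              "the implicit destructor of a pointer-only struct is trivial");
```

Note that every translation unit using such a class has to agree on whether NDEBUG is defined, otherwise the class definition differs between them.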
Upvotes: 2