Reputation: 23
I need to store sorted elements contiguously in memory, so I thought about std::vector and boost::flat_set. I've tried both, and I checked their performance, and althought back insertion is a little faster with std::vector, front insertion is way faster with boost::flat_set. Here's my test code :
#include <iostream>
#include <vector>
#include <boost/container/flat_set.hpp>
#include <boost/chrono.hpp>
#include <windows.h>
// Just a basic movable object for my tests
class Component
{
public :
Component( void ) :
id( 0 ),
name( "default" ),
data( 100 )
{
}
Component( uint32_t id ) :
id( id ),
name( "default" ),
data( 100 )
{
}
Component( Component&& component ) throw() :
id( std::move( component.id ) ),
name( std::move( component.name ) ),
data( std::move( component.data ) )
{
}
Component& operator=( Component&& component ) throw()
{
id = std::move( component.id );
name = std::move( component.name );
data = std::move( component.data );
return ( *this );
}
uint32_t get_id( void ) const
{
return ( id );
}
private :
uint32_t id;
std::string name;
std::vector< uint32_t > data;
};
// This object can be sorted
inline bool operator<( const Component& component1, const Component& component2 )
{
return ( component1.get_id() < component2.get_id() );
}
#define COMP_NB 1000000
int main( void )
{
/*******************************/
/* Test vector insertion speed */
/*******************************/
std::vector< Component > vector;
vector.reserve( COMP_NB + 1 );
std::cout << "Push back components in the vector: ";
auto startTime = boost::chrono::steady_clock::now();
// Back insertion
for ( uint32_t i = 0; i < COMP_NB; ++i )
{
vector.push_back( Component( i + 1 ) );
}
auto thisTime = boost::chrono::steady_clock::now();
std::cout << boost::chrono::duration_cast< boost::chrono::milliseconds >( thisTime - startTime ).count() << "ms" << std::endl;
std::cout << "Insert one component at the beginning of the vector: ";
startTime = boost::chrono::steady_clock::now();
// Front insertion (all components are shifted)
vector.insert( vector.begin(), Component( 0 ) );
thisTime = boost::chrono::steady_clock::now();
std::cout << boost::chrono::duration_cast< boost::chrono::milliseconds >( thisTime - startTime ).count() << "ms" << std::endl;
/*********************************/
/* Test flat_set insertion speed */
/*********************************/
boost::container::flat_set< Component > flat_set;
flat_set.reserve( COMP_NB + 1 );
std::cout << "Push back components in the flat_set: ";
startTime = boost::chrono::steady_clock::now();
// Back insertion
for ( uint32_t i = 0; i < COMP_NB; ++i )
{
flat_set.insert( Component( i + 1 ) );
}
thisTime = boost::chrono::steady_clock::now();
std::cout << boost::chrono::duration_cast< boost::chrono::milliseconds >( thisTime - startTime ).count() << "ms" << std::endl;
std::cout << "Insert one component at the beginning of the flat_set: ";
startTime = boost::chrono::steady_clock::now();
// Front insertion (all components are shifted)
flat_set.insert( Component( 0 ) );
thisTime = boost::chrono::steady_clock::now();
std::cout << boost::chrono::duration_cast< boost::chrono::milliseconds >( thisTime - startTime ).count() << "ms" << std::endl;
system( "PAUSE" );
return ( 0 );
}
And the output :
Push back components in the vector: 852ms
Insert one component at the beginning of the vector: 59ms
Push back components in the flat_set: 912ms
Insert one component at the beginning of the flat_set: 16ms
Front insertion is 3.6x faster with flat_set! So I ran another test, cause I wanted to see if my move functions were used, and I found something strange. Here's the new code :
#include <iostream>
#include <vector>
#include <boost/container/flat_set.hpp>
#include <boost/chrono.hpp>
#include <windows.h>
// Just a basic movable object for my tests
class Component
{
public :
Component( void ) :
id( 0 ),
name( "default" ),
data( 100 )
{
std::cout << "Default constructor" << std::endl;
}
Component( uint32_t id ) :
id( id ),
name( "default" ),
data( 100 )
{
std::cout << "Custom constructor" << std::endl;
}
Component( Component&& component ) throw() :
id( std::move( component.id ) ),
name( std::move( component.name ) ),
data( std::move( component.data ) )
{
std::cout << "Move constructor" << std::endl;
}
Component& operator=( Component&& component ) throw()
{
std::cout << "Move assignment operator" << std::endl;
id = std::move( component.id );
name = std::move( component.name );
data = std::move( component.data );
return ( *this );
}
uint32_t get_id( void ) const
{
return ( id );
}
private :
uint32_t id;
std::string name;
std::vector< uint32_t > data;
};
// This object can be sorted
inline bool operator<( const Component& component1, const Component& component2 )
{
return ( component1.get_id() < component2.get_id() );
}
#define COMP_NB 1
int main( void )
{
/*******************************/
/* Test vector insertion speed */
/*******************************/
std::vector< Component > vector;
vector.reserve( COMP_NB + 1 );
std::cout << "-- Push back one component in the vector: " << std::endl;
// Back insertion
for ( uint32_t i = 0; i < COMP_NB; ++i )
{
vector.push_back( Component( i + 1 ) );
}
std::cout << "-- Insert one component at the beginning of the vector: " << std::endl;
// Front insertion (the other component is shifted)
vector.insert( vector.begin(), Component( 0 ) );
std::cout << std::endl;
/*********************************/
/* Test flat_set insertion speed */
/*********************************/
boost::container::flat_set< Component > flat_set;
flat_set.reserve( COMP_NB + 1 );
std::cout << "-- Push back one component in the flat_set: " << std::endl;
// Back insertion
for ( uint32_t i = 0; i < COMP_NB; ++i )
{
flat_set.insert( Component( i + 1 ) );
}
std::cout << "-- Insert one component at the beginning of the flat_set: " << std::endl;
// Front insertion (the other component is shifted)
flat_set.insert( Component( 0 ) );
system( "PAUSE" );
return ( 0 );
}
The new output :
-- Push back one component in the vector:
Custom constructor
Move constructor
-- Insert one component at the beginning of the vector:
Custom constructor
Move constructor
Move constructor
Move assignment operator
Move assignment operator
-- Push back one component in the flat_set:
Custom constructor
Move constructor
-- Insert one component at the beginning of the flat_set:
Custom constructor
Move constructor
Move assignment operator
There's something strange here. Flat_set behaviour seems normal :
-- Insert one component at the beginning of the flat_set:
Custom constructor //Ok, I asked the creation of a new component
Move constructor //Ok, the flat_set need to shift the previous component in a new position
Move assignment operator //OK, the new component need to be moved to the front of the flat_set
But what about the vector?
-- Insert one component at the beginning of the vector:
Custom constructor //Ok, I asked the creation of a new component
Move constructor //Ok, the vector need to shift the previous component in a new position
Move constructor //Wait... what?
Move assignment operator //OK, the new component need to be moved to the front of the vector
Move assignment operator //Wtf?
I tried with more components, and the vector behaviour is the same : it keep doing all move operations twice. Why? Can it be avoided? If not, should I stick to flat_set (I'll sometimes need to shift my data)?
[Edit] : Here's the output with 10 components being inserted in back and 1 being inserted in front, and with id of components being constructed or moved : http://pastebin.com/SzT5M8yP
[Edit 2] : I did the same test with boost::container::vector, and the output (and the speed!) is identical to flag_set. Is it a problem with the Microsoft Implementation of vector? O.o
[Edit 3] : Problem submitted to Microsoft : https://connect.microsoft.com/VisualStudio/feedback/details/801205/std-vector-do-extra-operations-when-shifting-elements
Upvotes: 0
Views: 738
Reputation: 23
3 years later, this is what I get in my mailbox :
Greetings from Microsoft Connect!
This notification was generated for feedback item: std::vector do extra operations when shifting elements which you submitted at theMicrosoft Connect site.
Hi,
Thanks for reporting this bug. I've fixed it by thoroughly overhauling vector for correctness and performance, and this fix will be available in VS "15" RC. Previously, insertion/emplacement called rotate() in certain situations. Now we sequence our actions more intelligently, avoiding the need to rotate elements around.
Stephan T. Lavavej Principal Software Engineer, Visual C++ Libraries [email protected]
You may receive a general "Feedback Item Updated" notification as well, if any other changes were made by Microsoft.
Thank you for using Microsoft Connect!
Regards,
the Microsoft Connect Team
Please do not reply directly to this message, as it is generated from an unmonitored email account. If you have comments related to your Feedback, please enter it in the Comments section (post a comment to Microsoft) of your Feedback item by navigating to the Feedback item in the link above.
If you are having trouble accessing the Feedback link above, please go to the http://connect.microsoft.com/help/default.aspx page to report the issue, In your submission, please make sure to paste a copy the link above into the report.
Hurrah! It will be fixed soon. :D Granted it's a bit late, but it's still a good thing.
Upvotes: 0
Reputation: 171253
It's not doing all move operations twice, if your vector had more than one element in it you would see that only some operations happen twice.
To insert an rvalue at the beginning of a vector of N elements (with sufficient capacity) a typical approach is:
new (_M_data+N) T(_M_data[N-1]);
_M_size += 1;
for (int i = N-1; i > 0; --i)
_M_data[i] = std::move(_M_data[i-1]);
_M_data[0] = T(std::move(arg));
This means you will see one move construction, followed by (N-1) move assignments (in your case the vector only has one element so you see nothing in step 2) and then a move construction and a move assignment.
In step 3 the temporary is constructed because the same insertion logic is used for emplace
as insert
, so it actually does T(std::move(args)...)
with zero or more args, in the case where there is just one argument of type T
it would have been possible to just do _M_data[0] = std::move(arg);
and avoid a move construction.
I'm not sure why you see an extra move assignment at the end, GCC's standard library implementation doesn't do that and I'm not sure why it's needed. You could quite easily modify your code to print the identity of the objects being moved constructed/assigned, to see which elements are being moved twice, that might shed some more light on it.
Upvotes: 2
Reputation: 59553
My guess is that std::vector
is growing the size of the vector by allocating a new vector, copying the current singular value into the new vector, and then inserting the new value. Try removing the call to reserve
so that the memory management of std::vector
work for you.
Upvotes: -1