Reputation: 199
I have some simulation code I'm working on, and I've just gotten rid of all the low-hanging fruit as far as optimisation is concerned. The code now spends half its time pushing back onto vectors. (The size of the final vectors is known and I reserve appropriately.)
Essentially I'm rearranging one vector into a permutation of another, or populating the vector with random elements.
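Roughly, the pattern is something like this (a minimal sketch only; the names are illustrative, not my actual code):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Rearranging one vector into a permutation of another; 'perm' holds the
// target ordering and the original 'src' is left untouched.
std::vector<unsigned int> permuted(const std::vector<unsigned int>& src,
                                   const std::vector<std::size_t>& perm) {
    std::vector<unsigned int> out;
    out.reserve(src.size());              // final size is known up front
    for (std::size_t i = 0; i < perm.size(); ++i)
        out.push_back(src[perm[i]]);      // this is where the time goes
    return out;
}

// Populating a vector with random elements.
std::vector<unsigned int> random_fill(std::size_t n, std::mt19937& rng) {
    std::vector<unsigned int> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(static_cast<unsigned int>(rng()));
    return out;
}
```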
Is there any faster means of pushing back into a vector? Or pushing back/copying multiple elements?
std::vector<unsigned int, std::allocator<unsigned int> >::push_back(unsigned int const&)
Thanks in advance.
EDIT: Extra info: I'm running a release build with -O3; also, the original vector needs to be preserved.
Upvotes: 4
Views: 2858
Reputation: 71535
push_back on int is extremely efficient. So I would look elsewhere for optimization opportunities.
Nemo's first rule of micro-optimization: Math is fast; memory is slow. Creating a huge vector is very cache-unfriendly.
For example, instead of creating a permutation of the original vector, can you just compute which element you need as you need it and then access that element directly from the original vector?
Similarly, do you really need a vector of random integers? Why not just generate a random number when it is needed? (If you have to remember it for later, then go ahead and push it onto the vector then... But not before.)
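Concretely, both ideas look something like this (a sketch only; perm and the function names are just illustrative):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Instead of materialising a permuted copy, index through the original:
// element j of the "permuted view" is simply original[perm[j]].
unsigned int permuted_element(const std::vector<unsigned int>& original,
                              const std::vector<std::size_t>& perm,
                              std::size_t j) {
    return original[perm[j]];
}

// Instead of pre-building a vector of random ints, draw one on demand.
unsigned int next_random(std::mt19937& rng) {
    return static_cast<unsigned int>(rng());
}
```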
push_back on int is about as fast as it is going to get. I would bet you could barely notice the difference even if you got rid of the reserve (because the re-allocation does not happen often and is going to use a very fast bulk copy already). So you need to take a broader view to improve performance.
Upvotes: 2
Reputation: 393134
You can have a look at:

- move semantics
- EASTL (std::vector --> eastl::vector and some extra link objects)

I'd personally not expect much gain if the vector handling really is the bottleneck. I'd really look at parallelizing with (in order of preference):

- GNU libstdc++ parallel mode (CPPFLAGS+=-D_GLIBCXX_PARALLEL -fopenmp)
- OpenMP (#pragma omp parallel for; see the sketch below)
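A rough sketch of the OpenMP route for the permutation case (compile with -fopenmp; names are illustrative). Note that push_back itself is not thread-safe, so the destination is sized up front and each iteration writes its own slot:

```cpp
#include <cstddef>
#include <vector>

// Sketch only: parallel fill of a pre-sized vector; 'perm' and the function
// name are illustrative, not from the original code.
std::vector<unsigned int> permute_parallel(const std::vector<unsigned int>& src,
                                           const std::vector<std::size_t>& perm) {
    std::vector<unsigned int> dst(src.size());   // sized, not merely reserved
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(dst.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i)
        dst[i] = src[perm[i]];
    return dst;
}
```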
I must be forgetting stuff. O yeah, look here: http://www.agner.org/optimize/
Edit: I always forget the simplest things: Use memcpy/memmove for bulk appending POD elements to pre-allocated vectors.
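Something along these lines, assuming the source is a raw buffer of PODs; note resize() rather than reserve(), so the destination elements exist before memcpy writes over them:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch: append 'count' POD elements from a raw buffer in one bulk copy.
void bulk_append(std::vector<unsigned int>& dst,
                 const unsigned int* src, std::size_t count) {
    const std::size_t old_size = dst.size();
    dst.resize(old_size + count);
    std::memcpy(dst.data() + old_size, src, count * sizeof(unsigned int));
    // dst.insert(dst.end(), src, src + count) usually compiles down to the
    // same memmove for trivially copyable element types.
}
```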
Upvotes: 4
Reputation: 146940
If you have multiple vectors, you may be able to improve speed by allocating them contiguously using a custom allocator. Improving memory locality may well improve the running time of the algorithm.
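A minimal sketch of that idea with a hypothetical bump/arena allocator (no per-type alignment finesse, never reclaims memory until the arena goes away; purely illustrative):

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical arena ("bump") allocator: all vectors sharing one arena get
// their storage carved out of the same contiguous buffer.
template <typename T>
struct ArenaAllocator {
    typedef T value_type;

    char* buffer;           // start of the shared arena
    std::size_t capacity;   // arena size in bytes
    std::size_t* offset;    // bytes handed out so far (shared)

    ArenaAllocator(char* buf, std::size_t cap, std::size_t* off)
        : buffer(buf), capacity(cap), offset(off) {}

    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>& o)
        : buffer(o.buffer), capacity(o.capacity), offset(o.offset) {}

    T* allocate(std::size_t n) {
        // Round up so every allocation stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t bytes = (n * sizeof(T) + align - 1) & ~(align - 1);
        if (*offset + bytes > capacity) throw std::bad_alloc();
        T* p = reinterpret_cast<T*>(buffer + *offset);
        *offset += bytes;
        return p;
    }
    void deallocate(T*, std::size_t) {}  // arena is released all at once
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.buffer == b.buffer;
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}

int main() {
    alignas(std::max_align_t) static char arena[1 << 20];
    std::size_t used = 0;
    ArenaAllocator<unsigned int> alloc(arena, sizeof(arena), &used);

    // Both vectors draw from the same arena, so their storage ends up adjacent.
    std::vector<unsigned int, ArenaAllocator<unsigned int> > a(alloc), b(alloc);
    a.reserve(1000);
    b.reserve(1000);
}
```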
Upvotes: 1
Reputation: 385194
If you're pre-reserving space then your vector is as fast as an array. You cannot mathematically make it faster; stop worrying and move on to something else!
You may be experiencing a slowdown if you're running a "debug build", i.e. one where your standard library implementation has optimisations turned off and debug tracking info turned on.
Upvotes: 3
Reputation: 2363
If you are using a debug build of the STL, there is debug overhead (especially in iterators) in all STL calls.
I would advise replacing the STL vector with a regular array. If you are using trivially-copyable types, you can easily copy multiple elements with a single memcpy call.
Upvotes: -3