Reputation: 311
I have a float vector. As I process certain data, I push it back.I always know what the size will be while declaring the vector.
For the largest case, it is 172,490,752 floats. This takes about eleven seconds just to push_back everything.
Is there a faster alternative, like a different data structure or something?
Upvotes: 4
Views: 8418
Reputation: 31474
If you know the final size, then reserve() that size after you declare the vector. That way it only has to allocate memory once.
Also, you may experiment with using emplace_back() although I doubt it will make any difference for a vector of float
. But try it and benchmark it (with an optimized build of course - you are using an optimized build - right?).
Another option when you know the size in advance is to use std::array
.
Upvotes: 11
Reputation: 5421
I have two answers for you:
reserve
to allocate the storage beforehand can be quite helpful, but:push_back
(or emplace_back
) themselves have a performance penalty because during every call, they have to check whether the vector has to be reallocated. If you know the number of elements you will insert already, you can avoid this penalty by directly setting the elements using the access operator []
So the most efficient way I would recommend is:
Initialize the vector with the 'fill'-constructor:
std::vector<float> values(172490752, 0.0f);
Set the entries directly using the access operator:
values[i] = some_float;
++i;
Upvotes: 1
Reputation: 7443
The reason push_back
is slow is that it will need to copy all the data several times as the vector grows, and even when it doesn’t need to copy data it needs to check. Vectors grow quickly enough that this doesn’t happen often, but it still does happen. A rough rule of thumb is that every element will need to be copied on average once or twice; the earlier elements will need to be copied a lot more, but almost half the elements won’t need to be copied at all.
You can avoid the copying, but not the checks, by calling reserve
on the vector when you create it, ensuring it has enough space. You can avoid both the copying and the checks by creating it with the right size from the beginning, by giving the number of elements to the vector constructor, and then inserting using indexing as Tobias suggested; unfortunately, this also goes through the vector an extra time initializing everything.
If you know the number of floats at compile time and not just runtime, you could use an std::array
, which avoids all these problems. If you only know the number at runtime, I would second Mark’s suggestion to go with std::unique_ptr<float[]>
. You would create it with
size_t size = /* Number of floats */;
auto floats = unique_ptr<float[]>{new float[size]};
You don’t need to do anything special to delete this; when it goes out of scope it will free the memory. In most respects you can use it like a vector, but it won’t automatically resize.
Upvotes: 0
Reputation: 308256
The usual way of speeding up a vector
when you know the size beforehand is to call reserve
on it before using push_back
. This eliminates the overhead of reallocating memory and copying the data every time the previous capacity is filled.
Sometimes for very demanding applications this won't be enough. Even though push_back
won't reallocate, it still needs to check the capacity every time. There's no way to know how bad this is without benchmarking, since modern processors are amazingly efficient when a branch is always/never taken.
You could try resize
instead of reserve
and use array indexing, but the resize
forces a default initialization of every element; this is a waste if you know you're going to set a new value into every element anyway.
An alternative would be to use std::unique_ptr<float[]>
and allocate the storage yourself.
Upvotes: 2
Reputation: 45444
You could use a custom allocator which avoids default initialisation of all elements, as discussed in this answer, in conjunction with ordinary element access:
const size_t N = 172490752;
std::vector<float, uninitialised_allocator<float> > vec(N);
for(size_t i=0; i!=N; ++i)
vec[i] = the_value_for(i);
This avoids (i) default initializing all elements, (ii) checking for capacity at every push, and (iii) reallocation, but at the same time preserves all the convenience of using std::vector
(rather than std::unique_ptr<float[]>
). However, the allocator
template parameter is unusual, so you will need to use generic code rather than std::vector
-specific code.
Upvotes: 1
Reputation: 37597
::boost::container::stable_vector
Notice that allocating a contiguous block of 172 *4 MB might easily fail and requires quite a lot page joggling. Stable vector is essentially a list of smaller vectors or arrays of reasonable size. You may also want to populate it in parallel.
Upvotes: 1