Reputation: 1656
I have identified a bottleneck in my c++ code, and my goal is to speed it up. I am moving items from one vector to another vector if a condition is true.
In python, the pythonic way of doing this would be to use a list comprehension:
my_vector = [x for x in data_vector if x > 1]
I have hacked a way to do this in C++, and it is working fine. However, I am calling this millions of times in a while-loop and it is slow. I do not understand much about memory allocation, but I assume that my problem has to do with allocating memory over-and-over again using push_back
. Is there a way to allocate my memory differently to speed up this code? (I do not know how large my_vector
should be until the for
-loop has completed).
std::vector<float> data_vector;
// Put a bunch of floats into data_vector
std::vector<float> my_vector;
while (some_condition_is_true) {
my_vector.clear();
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Use my_vector to render graphics on the GPU, but do not change the elements of my_vector
// Change the elements of data_vector, but not the size of data_vector
}
Upvotes: 0
Views: 4926
Reputation: 73
Though there are various great solutions posted by others for your query, it seems there is still no much explanation for the memory allocation, which you do not much understand, so I would like to share my knowledge about this topic with you. Hope this helps.
Firstly, in C++, there are several types of memory: stack
, heap
, data segment
.
Stack
is for local variables. There are some important features associated with it, for example, they will be automatically deallocated, operation on it is very fast, its size is OS-dependent and small such that storing some KB of data in the stack
may cause an overflow of memory, et cetera.
Heap
's memory can be accessed globally. As for its important features, we have, its size can be dynamically extended if needed and its size is larger(much larger than stack
), operation on it is slower than stack
, manual deallocation of memory is needed (in nowadays's OS, the memory will be automatically freed in the end of program), et cetera.
Data segment
is for global and static variables. In fact, this piece of memory can be divided into even smaller parts, e.g. BBS.
In your case, vector
is used. In fact, the elements of vector
are stored into its internal dynamic array, that is an internal array with a dynamic array size. In the early C++, a dynamic array can be created on the stack
memory, however, it is no longer that case. To create a dynamic array, ones have to create it on heap
. Therefore, the elements of vector
are stored in an internal dynamic array on heap
. In fact, to dynamically increase the size of an array, a process namely memory reallocation
is needed. However, if a vector
user keeps enlarging his or her vector
, then the overhead cost of reallocation
cost will be high. To deal with it, a vector
would firstly allocate a piece of memory that is larger than the current need, that is allocating memory for potential future use. Therefore, in your code, it is not that case that memory reallocation
is performed every time push_back()
is called. However, if the vector
to be copied is quite large, the memory reserved for future use will be not enough. Then, memory allocation
will occur. To tackle it, vector.reserve()
may be used.
I am a newbie. Hopefully, I have not made any mistake in my sharing. Hope this helps.
Upvotes: 2
Reputation: 35745
If you are on Linux you can reserve memory for my_vector
to prevent std::vector
reallocations which is bottleneck in your case. Note that reserve will not waste memory due to overcommit, so any rough upper estimate for reserve value will fit your needs. In your case the size of data_vector
will be enough. This line of code before while
loop should fix the bottleneck:
my_vector.reserve(data_vector.size());
Upvotes: 1
Reputation: 15334
I doubt allocating my_vector
is the problem, especially if the while
loop is executed many times as the capacity of my_vector
should quickly become sufficient.
But to be sure you can just reserve capacity in my_vector
corresponding to the size of data_vector
:
my_vector.reserve(data_vector.size());
while (some_condition_is_true) {
my_vector.clear();
for (auto value : data_vector) {
if (value > 1)
my_vector.push_back(value);
}
}
Upvotes: 1
Reputation: 3355
Use std::copy_if
, and reserve data_vector.size()
for my_vector
initially (as this is the maximum possible number of elements for which your predicate could evaluate to true):
std::vector<int> my_vec;
my_vec.reserve(data_vec.size());
std::copy_if(data_vec.begin(), data_vec.end(), std::back_inserter(my_vec),
[](const auto& el) { return el > 1; });
Note that you could avoid the reserve
call here if you expect that the number of times that your predicate evaluates to true will be much less than the size of the data_vector
.
Upvotes: 7
Reputation: 1777
Run the code twice, first time only counting, how many new elements you will need. Then use reserve
to already allocate all the memory you need.
while (some_condition_is_true) {
my_vector.clear();
int newLength = 0;
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
newLength++;
my_vector.reserve(newLength);
for (i = 0; i < data_vector.size(); i++) {
if (data_vector[i] > 1) {
my_vector.push_back(data_vector[i]);
}
}
// Do stuff with my_vector and change data_vector
}
Upvotes: 1