Reputation: 11
I have code like this:
#include <iostream>
#include <chrono>
#include <time.h>
#include <vector>
#include <array>

int main()
{
    using Array2d4 = std::array<std::array<double, 2>, 4>;
    using Array2d3 = std::array<std::array<double, 2>, 3>;
    Array2d4 B;
    B.fill({});
    std::vector<Array2d3> B1;
    Array2d3 T1;
    //B1.reserve(400000);
    auto start1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < 100000; i++) {
        for (std::size_t j = 0; j < 4; ++j)
        {
            T1[0][0] = B[j][0];
            T1[0][1] = 0;
            T1[1][0] = 0;
            T1[1][1] = B[j][1];
            T1[2][0] = B[j][0];
            T1[2][1] = B[j][1];
            B1.push_back(T1);
        }
    }
    auto finish1 = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> elapsed1 = finish1 - start1;
    std::cout << "Elapsed time: " << elapsed1.count() << " s\n";
}
My gut feeling is that this is slow, as the elapsed time shows:
Elapsed time: 0.020
I learned that it should be faster if you preallocate before using push_back. However, even when I use the reserve function (commented out in the code), I get:
Elapsed time: 0.018
This is still not fast enough. Any way to speed it up? It is also acceptable to change the structure.
Upvotes: 1
Views: 253
Reputation: 475
I ran your code with and without reserve:
Best of 3 without reserve: Elapsed time: 0.0218509 s
Best of 3 with reserve: Elapsed time: 0.00879622 s
As far as I can see, it gets roughly twice as fast, which makes sense considering that push_back increases the capacity to 1.5 times the previous one when the allocated memory is not sufficient (the growth factor is implementation-specific; 1.5 is assumed here). Since log(400000) / log(1.5) ≈ 31.8, there are roughly 31 unnecessary reallocations without reserve, each of which copies all elements stored so far.
All in all, this means that many unnecessary copies of the double arrays are made.
Meanwhile, your original code does 400000 * 6 = 2400000 assignments to double values. If each array copy only copied the 2 contained double values, then it would add up to roughly 172000 double value copies, so at least the two seem to be of comparable magnitude according to this approximation.
Also keep in mind that my approximation underestimates the number of reallocations needed without reserve. Capacity is always an integer, so each reallocation allocates only int(capacity * 1.5) elements, which is less than or equal to capacity * 1.5.
Have you recompiled the code before running the tests?
Upvotes: 2