Reputation: 1587
I have the following minimal working example, which illustrates how I am currently loading in a set of numbers from a data file "info.txt":
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
int main() {
    double temp_var;
    vector<double> container_var;
    ifstream test("info.txt");
    while (test >> temp_var)
    {
        container_var.push_back(temp_var);
    }
    cout << container_var[0] << endl;
    return 0;
}
The file "info.txt" contains numbers of the form
1.0
2.1
3.6
...
I am probably going to load in 50,000-100,000 numbers (maybe even more), so I am interested in doing this efficiently. Is there something fundamental that I have missed in my example that may slow down the loading process?
Upvotes: 0
Views: 241
Reputation: 408
First, you need to read the data, for which you can:
a. open the file and read from it
b. allocate memory and copy the contents of the file into it
c. memory map the file
Depending on the size of the file, I would say that c is the best option, because you avoid the cost of allocating and copying the data, and it's much faster than naively reading from the file.
Second, you need to parse the contents. Apparently the best way to do this is a hand-rolled loop; see http://tinodidriksen.com/2011/05/28/cpp-convert-string-to-double-speed/ for more details. I did try this myself and it's the way to go for large files.
Third, you need to preallocate the buffer in which you store the result, in order to minimize allocations.
Of course, you need to measure performance, find the hotspots, eliminate them, then rinse and repeat.
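As a rough sketch of options a/b combined with the second and third points: the code below reads the whole file into one buffer, preallocates the result, and parses in a single loop. It uses std::strtod rather than the hand-rolled digit parser from the linked article, so it is not the fastest variant. The names parse_doubles, load_doubles and expected_count are made up for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Parse whitespace-separated doubles from an in-memory buffer.
// expected_count is only a hint used to preallocate the vector.
std::vector<double> parse_doubles(const std::string& text,
                                  std::size_t expected_count = 0) {
    std::vector<double> values;
    values.reserve(expected_count);

    const char* cursor = text.c_str();  // null-terminated, as strtod requires
    char* end = nullptr;
    for (;;) {
        const double v = std::strtod(cursor, &end);  // skips leading whitespace
        if (end == cursor) break;  // nothing converted: end of data or bad token
        values.push_back(v);
        cursor = end;
    }
    return values;
}

// Read the whole file into one string, then parse it.
// Returns an empty vector if the file cannot be opened.
std::vector<double> load_doubles(const std::string& path,
                                 std::size_t expected_count = 0) {
    std::ifstream in(path, std::ios::binary);
    const std::string text((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return parse_doubles(text, expected_count);
}
```

For the file in the question this would be called as load_doubles("info.txt", 100000). Note that std::strtod honours the current C locale, and that parsing stops at the first token that is not a number.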
Upvotes: 1
Reputation: 2016
When you add a lot of elements to a std::vector, the vector grows as you add them. Each time the vector grows, all of its data usually needs to be copied to a new buffer. You can give the vector a lot of space before you add the elements, to keep the number of growing and copying operations lower:
std::vector<int> v(5000);
Note that the above creates a vector that already holds 5000 elements (value-initialized to 0), so any push_back would append after them. To reserve capacity without adding elements, call std::vector::reserve() after construction:
std::vector<int> v;
v.reserve(10000); // ensure the vector has a capacity of at least 10k elements
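Applied to the loop from the question, that might look like the sketch below. The function name read_doubles and the expected_count parameter are made up for illustration; the count is only a hint, and the vector still grows normally if the file holds more numbers.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for the in-memory usage mentioned below
#include <vector>

// Read whitespace-separated doubles from any input stream.
// expected_count is a capacity hint, not a limit.
std::vector<double> read_doubles(std::istream& in, std::size_t expected_count) {
    std::vector<double> values;
    values.reserve(expected_count);  // one allocation up front, no elements added
    double v;
    while (in >> v) {
        values.push_back(v);  // no reallocation until expected_count is exceeded
    }
    return values;
}
```

With a file this would be used as std::ifstream test("info.txt"); followed by read_doubles(test, 100000). Passing a std::istringstream instead makes it easy to try without a file.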
While I think that this is the actual problem, the problem could also be in the line cout << container_var[0] << endl. std::endl flushes the stream's buffer, so it is usually slow. A third reason could be that the std::cout stream is synced with the C stdio file APIs. While synced, the iostreams library does not do its own buffering and forwards every operation to the C stdio layer. You can disable this syncing (before any other I/O is done) with:
std::cout.sync_with_stdio(false);
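A small sketch of the std::endl point, for the case where many values are printed: write '\n' inside the loop and flush once at the end. The helper name print_all is made up for illustration.

```cpp
#include <iostream>
#include <sstream>  // only needed for the in-memory usage mentioned below
#include <vector>

// Print one value per line without flushing after every line.
void print_all(std::ostream& out, const std::vector<double>& values) {
    for (double v : values) {
        out << v << '\n';  // newline only; std::endl would also flush here
    }
    out.flush();  // a single explicit flush once everything is written
}
```

In main this would be called as print_all(std::cout, container_var), after turning off the stdio syncing as shown above. Passing a std::ostringstream instead lets you inspect the output as a string.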
Upvotes: 1
Reputation: 57688
If you know the quantity of numbers ahead of time, you can tell std::vector to preallocate the space. This will make the push_back function more efficient.
Other optimization techniques include memory-mapped files and double buffering.
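A minimal sketch of the memory-mapped variant, assuming a POSIX system (open, fstat, mmap); it will not build on Windows as written. The function name load_doubles_mmap is made up for illustration. Because a mapping is not null-terminated, each token is copied into a small string before std::strtod is called; a hand-rolled parser working directly on the mapped bytes would avoid that copy.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

#include <fcntl.h>     // open (POSIX)
#include <sys/mman.h>  // mmap, munmap (POSIX)
#include <sys/stat.h>  // fstat (POSIX)
#include <unistd.h>    // close (POSIX)

// Map the file read-only and parse whitespace-separated doubles from it.
// Returns an empty vector if the file is missing, empty, or cannot be mapped.
std::vector<double> load_doubles_mmap(const char* path) {
    std::vector<double> values;

    const int fd = open(path, O_RDONLY);
    if (fd < 0) return values;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {  // mmap rejects length 0
        close(fd);
        return values;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);

    void* map = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (map == MAP_FAILED) return values;

    const char* p = static_cast<const char*>(map);
    const char* const end = p + size;
    std::string token;
    while (p != end) {
        while (p != end && std::isspace(static_cast<unsigned char>(*p))) ++p;
        const char* start = p;
        while (p != end && !std::isspace(static_cast<unsigned char>(*p))) ++p;
        if (start == p) break;   // only trailing whitespace was left
        token.assign(start, p);  // null-terminated copy of one token
        // Tokens that are not numbers are stored as 0.0 by strtod.
        values.push_back(std::strtod(token.c_str(), nullptr));
    }

    munmap(map, size);
    return values;
}
```

Whether this beats plain stream reading for a file of this size is something to measure rather than assume.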
Upvotes: 1