Veritas

Reputation: 2210

Why is ifstream::read much faster than using iterators?

There are many ways to read a file into a string. Two common ones are using ifstream::read to read directly into the string's buffer, and using std::istreambuf_iterator along with std::copy_n:

Using ifstream::read:

std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
in.read(&contents[0], contents.size());

Using std::copy_n:

std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
std::copy_n(std::istreambuf_iterator<char>(in),
            contents.size(),
            contents.begin());

Many benchmarks show that the first approach is much faster than the second one (on my machine, using g++-4.9, it is about 10 times faster with both -O2 and -O3), and I was wondering what the reason for this difference in performance might be.

Upvotes: 7

Views: 1659

Answers (1)

Sebastian Redl

Reputation: 71989

read performs the iostream setup (the sentry that is part of every iostream operation) only once and then, typically, makes a single call to the OS that reads directly into the buffer you provided.
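To make the bulk path concrete, here is a minimal sketch of the read-based approach wrapped in a function. The name read_whole is hypothetical (it is not from the question), and the sketch assumes a seekable stream whose size fits in memory. It also checks gcount(), since read may deliver fewer characters than tellg() reported, for example on a text-mode stream that translates line endings.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads the whole content of a seekable stream
// with a single call to read(), i.e. one sentry setup for the bulk transfer.
std::string read_whole(std::istream& in) {
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();   // -1 if the stream cannot report a position
    in.seekg(0, std::ios::beg);
    if (size <= 0) return {};

    std::string contents(static_cast<std::size_t>(size), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(contents.size()));

    // Keep only what was actually read; this may be less than `size`.
    assert(in.gcount() >= 0);
    contents.resize(static_cast<std::size_t>(in.gcount()));
    return contents;
}
```

The same function accepts a std::ifstream, for example `std::ifstream in{"./filename.txt"}; auto s = read_whole(in);`.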

The iterator works by repeatedly extracting a single char with operator>>. Because of the buffer size, this might mean more OS calls, but more importantly it also means repeated setting up and tearing down of the iostream sentry, which might mean a mutex lock, and usually means a bunch of other stuff. Furthermore, operator>> is a formatted operation, whereas read is unformatted, which is additional setup overhead on every operation.

Edit: Tired eyes saw istream_iterator instead of istreambuf_iterator. Of course istreambuf_iterator does not do formatted input. It calls sgetc and sbumpc on the streambuf directly, one character at a time. That is still a lot of calls, and they go through the stream's internal buffer, which is probably smaller than the entire file.
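The difference can be sketched at the streambuf level. The two functions below are illustrative only (copy_per_char and copy_bulk are hypothetical names, not library functions): the first does roughly the per-character work that copying through an istreambuf_iterator implies, the second asks the buffer for everything in one sgetn call, which is the kind of bulk transfer read relies on. Both return the same data; only the number of calls differs.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <streambuf>
#include <string>

// Per-character extraction: one sbumpc() call per character,
// approximating what an istreambuf_iterator-based copy does.
std::string copy_per_char(std::streambuf& sb, std::size_t n) {
    using traits = std::char_traits<char>;
    std::string out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const traits::int_type c = sb.sbumpc();
        if (traits::eq_int_type(c, traits::eof())) break;  // ran out of input
        out.push_back(traits::to_char_type(c));
    }
    return out;
}

// Bulk extraction: a single sgetn() call for up to n characters.
std::string copy_bulk(std::streambuf& sb, std::size_t n) {
    std::string out(n, '\0');
    const std::streamsize got = sb.sgetn(&out[0], static_cast<std::streamsize>(n));
    assert(got >= 0);
    out.resize(static_cast<std::size_t>(got));  // keep only what was delivered
    return out;
}
```

With a file stream, the streambuf is available as `*in.rdbuf()`. How much slower the per-character version is depends on the library and on whether the compiler can inline the calls, so the factor of 10 from the question should not be read as a general constant.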

Upvotes: 2
