vizcayno

Reputation: 1233

Does increased buffering improve the top speed of ifstream.getline() in C++?

Before using MSVC++ input.getline() to read a very big (3 GB) delimited text file, I wanted to optimize the speed by increasing the size of the input buffer:

    ifstream input("in1.txt");
    input.rdbuf()->pubsetbuf(NULL, 1024 * 1024);

However, when executing the code, the speed did not improve, so I would like to know:

Regards.

Upvotes: 1

Views: 1586

Answers (5)

Dmitri Bouianov

Reputation: 538

You can try using memory mapped file functionality provided by the OS or, if memory is not an issue, try reading the whole file into memory before processing.
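
As a rough illustration of the "read the whole file into memory" option, here is a minimal sketch. The names slurp_file and for_each_line are made up for this example, and it assumes the file fits in RAM (for a 3 GB file that means a 64-bit build with enough free memory).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: reads the entire file into one std::string with a
// single large read. Returns an empty string if the file cannot be opened.
inline std::string slurp_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open at end to learn the size
    if (!in) return {};
    const std::streamoff size = in.tellg();
    if (size <= 0) return {};
    std::string data(static_cast<std::size_t>(size), '\0');
    in.seekg(0, std::ios::beg);
    in.read(&data[0], size);
    data.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was actually read
    return data;
}

// Hypothetical helper: calls on_line(ptr, len) for every line in the buffer,
// with the '\n' excluded. A final line without '\n' is reported too.
template <typename Fn>
std::size_t for_each_line(const char* data, std::size_t size, Fn&& on_line) {
    assert(data != nullptr || size == 0);
    std::size_t count = 0;
    const char* p = data;
    const char* end = data + size;
    while (p < end) {
        const void* hit = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        if (!hit) {  // last line has no terminator
            on_line(p, static_cast<std::size_t>(end - p));
            ++count;
            break;
        }
        const char* nl = static_cast<const char*>(hit);
        on_line(p, static_cast<std::size_t>(nl - p));
        ++count;
        p = nl + 1;
    }
    return count;
}
```

The callback receives pointers into the buffer rather than copies, so no per-line allocation happens. Whether this beats getline() on a given machine still has to be measured.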

Upvotes: 0

usr

Reputation: 171246

You will get the absolute fastest performance by using CreateFile and ReadFile. Open the file with FILE_FLAG_SEQUENTIAL_SCAN.

Read with a buffer size that is a power of two. Only benchmarking can determine the best size. I have seen it be 8K once. Another time I found it to be 8M! This varies wildly.

It depends on the size of the CPU cache, on the efficiency of OS read-ahead and on the overhead associated with doing many small reads.

Memory mapping is not the fastest way. It has more overhead because you can't control the block size and the OS needs to fault in all pages.
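
To make the "only benchmarking can determine this" point concrete, here is a small timing sketch. It is not the CreateFile/ReadFile code itself: it uses portable std::fread as a stand-in so it can run anywhere, and the names ReadTiming and time_buffer_sizes are invented for the example. On Windows the same loop could be built around ReadFile.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical result record for one candidate buffer size.
struct ReadTiming {
    std::size_t buffer_size;         // bytes requested per read call
    unsigned long long bytes_read;   // total bytes read from the file
    double seconds;                  // wall-clock time for the full pass
};

// Reads the whole file once per candidate size and records the elapsed time.
// Sizes of zero are skipped; the loop stops early if the file cannot be opened.
inline std::vector<ReadTiming> time_buffer_sizes(const char* path,
                                                 const std::vector<std::size_t>& sizes) {
    assert(path != nullptr);
    std::vector<ReadTiming> results;
    for (std::size_t size : sizes) {
        if (size == 0) continue;
        std::FILE* f = std::fopen(path, "rb");
        if (!f) break;
        // Ask stdio not to add its own buffer, so our chunk size is the one being measured.
        std::setvbuf(f, nullptr, _IONBF, 0);
        std::vector<char> buf(size);
        unsigned long long total = 0;
        const auto start = std::chrono::steady_clock::now();
        for (;;) {
            const std::size_t got = std::fread(buf.data(), 1, buf.size(), f);
            total += got;
            if (got < buf.size()) break;  // short read: end of file or error
        }
        const auto stop = std::chrono::steady_clock::now();
        std::fclose(f);
        results.push_back({size, total, std::chrono::duration<double>(stop - start).count()});
    }
    return results;
}
```

Something like time_buffer_sizes("in1.txt", {8192, 65536, 1048576, 8388608}) would give one timing per size. Keep in mind that the first pass warms the OS file cache, so later passes can look faster for reasons unrelated to the buffer size; repeat the runs or vary the order before trusting the numbers.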

Upvotes: 0

yves Baumes

Reputation: 9036

Did you consider the mmap() system call?

The mmap() function shall establish a mapping between a process' address space and a file, shared memory object, or typed memory object. The format of the call is as follows:

    pa = mmap(addr, len, prot, flags, fildes, off);

man page

MapViewOfFile is the Windows equivalent.

    LPVOID WINAPI MapViewOfFile(
        __in  HANDLE hFileMappingObject,
        __in  DWORD  dwDesiredAccess,
        __in  DWORD  dwFileOffsetHigh,
        __in  DWORD  dwFileOffsetLow,
        __in  SIZE_T dwNumberOfBytesToMap
    );
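
For the POSIX side, a minimal sketch of how the mmap() call above might be used to scan a file could look like this. The function name count_newlines_mmap is made up for illustration, and the code assumes a POSIX system; on Windows the same idea would go through CreateFileMapping and MapViewOfFile instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps the file read-only and counts '\n' bytes.
// Returns -1 if the file cannot be opened or mapped.
inline long long count_newlines_mmap(const char* path) {
    assert(path != nullptr);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length
    const std::size_t len = static_cast<std::size_t>(st.st_size);
    void* pa = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (pa == MAP_FAILED) return -1;

    const char* p = static_cast<const char*>(pa);
    const char* end = p + len;
    long long lines = 0;
    while (p < end) {
        const void* hit = std::memchr(p, '\n', static_cast<std::size_t>(end - p));
        if (!hit) break;
        ++lines;
        p = static_cast<const char*>(hit) + 1;
    }
    munmap(pa, len);
    return lines;
}
```

Pages are faulted in as the scan touches them, so the file is never copied into a user buffer. Mapping a 3 GB file in one piece needs a 64-bit address space; a 32-bit process would have to map it in windows using the offset argument.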

Upvotes: 3

Ben Voigt

Reputation: 283803

"I wanted to optimize the speed"

Get rid of fstream. iostreams in general are a horrible bottleneck.
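
One possible way to act on this, sketched under assumptions: read the file in large blocks with C stdio and split lines by hand. The name read_lines_chunked and the callback shape are invented for the example, and whether it actually outruns getline() on a particular setup is something to measure rather than assume.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in blocks of `chunk_size` bytes and calls
// on_line(ptr, len) for each line, '\n' excluded. Returns the number of lines,
// or -1 if the file cannot be opened.
template <typename Fn>
long long read_lines_chunked(const char* path, std::size_t chunk_size, Fn&& on_line) {
    assert(path != nullptr && chunk_size > 0);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    std::vector<char> buf(chunk_size);
    std::string carry;  // start of a line that continues in the next block
    long long lines = 0;
    std::size_t got = 0;
    while ((got = std::fread(buf.data(), 1, buf.size(), f)) > 0) {
        const char* p = buf.data();
        const char* end = p + got;
        while (p < end) {
            const char* nl = static_cast<const char*>(
                std::memchr(p, '\n', static_cast<std::size_t>(end - p)));
            if (!nl) {                 // line continues past this block
                carry.append(p, end);
                break;
            }
            if (carry.empty()) {       // whole line lies inside this block: no copy
                on_line(p, static_cast<std::size_t>(nl - p));
            } else {                   // finish a line started in an earlier block
                carry.append(p, nl);
                on_line(carry.data(), carry.size());
                carry.clear();
            }
            ++lines;
            p = nl + 1;
        }
    }
    if (!carry.empty()) {              // final line without a trailing '\n'
        on_line(carry.data(), carry.size());
        ++lines;
    }
    std::fclose(f);
    return lines;
}
```

With a block size such as 1024 * 1024, only lines that straddle a block boundary get copied; every other line is handed to the callback directly from the read buffer.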

Upvotes: -1

111111

Reputation: 16168

The thing about buffering is that it works at many levels: you have library-level (ifstream) buffering, OS-level buffering and hardware-level buffering. Changing the size of any one of those can have a major impact on performance or none at all.

What is true is that the 'logic' of the program is going to be much faster than the I/O.

Personally, unless the bottleneck is serious, I would leave it be.

Upvotes: 0
