Jacques

Reputation: 71

Improving performance of ifstream in C++

I apologize if this question is a bit vague or just plain stupid; I am still very much a novice.

I need to extract information from a web log file in C++. The string manipulation is relatively easy; accessing the data in a timely fashion isn't. This is what I am doing currently:

string str;

ifstream fh("testlog.log",ios::in);

while (getline(fh,str))
{
    // extract the useful data from str here
}

From here I get the useful data from the string. This works fine for a log file with 100 entries, but takes forever on a log file with a million or more entries. Any help would be greatly appreciated.

Upvotes: 5

Views: 3763

Answers (4)

sehe

Reputation: 392833

@Errata:

Are you sure that your code would be faster than, say:

std::ifstream in("test.txt");
in.unsetf(std::ios::skipws);
std::string contents;
std::copy(
        std::istream_iterator<char>(in),
        std::istream_iterator<char>(),
        std::back_inserter(contents));

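Also, the OP wants linewise access. Note that std::istream_iterator<std::string> yields whitespace-separated tokens rather than whole lines, and it needs skipws left enabled, so the following applies the predicate to each token; true line-by-line iteration needs an element type whose operator>> calls std::getline. For a single-word marker that occurs once per line, the count still matches the number of matching lines:

std::ifstream in("test.txt");
size_t count = std::count_if(
        std::istream_iterator<std::string>(in),
        std::istream_iterator<std::string>(),
        &is_interesting);
std::cout << "Interesting log lines: " << count << std::endl;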

Of course, you need to define a predicate, e.g.:

static bool is_interesting(const std::string& line)
{ 
    return std::string::npos != line.find("FATAL");
}

Upvotes: 1

Jacques

Reputation: 71

After wasting hours and hours of my time, I compiled the same code in Quincy2005 instead of Microsoft Visual Studio. The result was dramatic: from a 40 min execution time to 1 min. The same improvement can be accomplished in Microsoft Visual Studio by passing a pointer to the file handle to the getline function. On a Linux-based system it takes about 40 sec to execute. I cursed Microsoft for a good 40 min for wasting my time.

Upvotes: 2

Errata

Reputation: 640

Here is the fastest way I found to read a whole file:

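// std::ios::ate opens the file positioned at its end (std::ios::end is a seek
// direction, not an open mode); binary mode makes tellg() match the byte count.
std::ifstream file("test.txt", std::ios::in | std::ios::binary | std::ios::ate);

std::size_t fileSize = file.tellg();

std::vector<char> buffer(fileSize);

file.seekg(0, std::ios::beg);

file.read(buffer.data(), fileSize);

std::string str(buffer.begin(), buffer.end());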

Yet, if your file is really that big, I strongly suggest processing it as a stream instead.

Upvotes: 1

Mark B

Reputation: 96233

I really suspect that I/O is hurting you more than ifstream here. Have you checked to see that you're actually CPU bound? Most likely you're having disk and cache locality issues.

There may not be a lot you can do in that case.

If it is CPU bound have you profiled to see where the CPU time is going?

Upvotes: 2
