Reputation: 71
I apologize if this question is a bit vague or just plain stupid; I am still very much a novice.
I need to extract information from a web log file in C++. The string manipulation is relatively easy; accessing the data in a timely fashion isn't. This is what I am currently doing:
std::string str;
std::ifstream fh("testlog.log", std::ios::in);
while (std::getline(fh, str)) {
    // extract the useful data from str here
}
From here I extract the useful data from the string. This works fine for a log file with 100 entries, but takes forever on a log file with a million or more entries. Any help would be greatly appreciated.
Upvotes: 5
Views: 3763
Reputation: 392833
@Errata:
Are you sure that your code would be faster than, say:
std::ifstream in("test.txt");
in.unsetf(std::ios::skipws);
std::string contents;
std::copy(
    std::istream_iterator<char>(in),
    std::istream_iterator<char>(),
    std::back_inserter(contents));
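Also, the OP wants to examine the log piece by piece, which can conveniently be done with a stream iterator. Note that std::istream_iterator<std::string> yields whitespace-delimited tokens rather than whole lines, and skipws has to stay enabled here, otherwise extraction stops after the first token:
std::ifstream in("test.txt");
// skipws is left at its default (on): operator>> for std::string needs it
// to step over the whitespace between tokens.
size_t count = std::count_if(
    std::istream_iterator<std::string>(in),
    std::istream_iterator<std::string>(),
    &is_interesting);
std::cout << "Interesting log tokens: " << count << std::endl;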
Of course, you also need to define a predicate, e.g.:
static bool is_interesting(const std::string& line)
{
    return std::string::npos != line.find("FATAL");
}
Upvotes: 1
Reputation: 71
After wasting hours and hours of my time, I compiled the same code in Quincy2005 instead of Microsoft Visual Studio. The result was dramatic: execution time dropped from 40 min to 1 min. The same improvement can be accomplished in Microsoft Visual Studio by passing a pointer to the file handle to the getline function. On a Linux-based system it takes about 40 sec to execute. I cursed Microsoft for a good 40 min for wasting my time.
Upvotes: 2
Reputation: 640
Here is the fastest way I have found to read a whole file:
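// std::ios::ate opens the file positioned at its end, so tellg() reports the size;
// binary mode keeps that size equal to the number of bytes read() delivers.
std::ifstream file("test.txt", std::ios::in | std::ios::binary | std::ios::ate);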
std::size_t fileSize = file.tellg();
std::vector<char> buffer(fileSize);
file.seekg(0, std::ios::beg);
file.read(buffer.data(), fileSize);
std::string str(buffer.begin(), buffer.end());
Yet, if your file is really that big, I strongly suggest processing it as a stream instead.
Upvotes: 1
Reputation: 96233
I really suspect that I/O is hurting you more than ifstream here. Have you checked that you're actually CPU bound? Most likely you're having disk and cache locality issues.
There may not be a lot you can do in that case.
If it is CPU bound, have you profiled to see where the CPU time is going?
Upvotes: 2