Reputation: 119
I have just started learning C++, and my current project is meant to extend my knowledge of working with files, splitting strings, and finally running a regexp on a variable-length string.
The problem:
I have a logfile which contains data like
<date> <time> <username> (<ip:port>) <uuid> - #<id> "<varchar text>"
e.g:
10.03.2016 07:40:38: blacksheep (127.0.0.1:54444) #865 "(this can have text
over several lines
without ending marker"
10.03.2016 07:40:38: blacksheep (127.0.0.1:54444) #865 "A new line, just one without \n"
So I started with the following, but I am stuck on how to get the lines containing \n into the string. How can this be solved the right way, without unnecessary steps like splitting several times, and how can I define where a complete line (even if it has some \n within it) stops?
With fin.ignore(80, '\n'); the \n characters are ignored, but this means that I will only get one line: a short text before the # and a very large string after it :-|
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

std::vector<std::string> split(std::string str, char seperator) {
    std::vector<std::string> result;
    std::string::size_type token_offset = 0;
    std::string::size_type seperator_offset = 0;
    while (seperator_offset != std::string::npos) {
        seperator_offset = str.find(seperator, seperator_offset);
        std::string::size_type token_length;
        if (seperator_offset == std::string::npos) {
            token_length = seperator_offset;
        } else {
            token_length = seperator_offset - token_offset;
            seperator_offset++;
        }
        std::string token = str.substr(token_offset, token_length);
        if (!token.empty()) {
            result.push_back(token);
        }
        token_offset = seperator_offset;
    }
    return result;
}

int main(int argc, char **argv) {
    std::fstream fin("input.dat");
    while (!fin.eof()) {
        std::string line;
        getline(fin, line, ';');
        fin.ignore(80, '\n');
        std::vector<std::string> strs = split(line, ',');
        for (int i = 0; i < strs.size(); ++i) {
            std::cout << strs[i] << std::endl;
        }
    }
    fin.close();
    return 0;
}
Regards Blacksheep
Upvotes: 0
Views: 116
Reputation: 118340
There is no canned C++ library function for swallowing input like that. std::getline reads the next line of text, up until the next newline character (by default). That's it. std::getline does not do any further examination of the input beyond that.
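To illustrate that behaviour, here is a minimal sketch. The helper name physical_lines is made up for this example; it simply collects whatever std::getline returns with its default '\n' delimiter, so a quoted text that spans several lines comes back in pieces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the pieces std::getline produces when it
// splits `text` at every '\n' (its default delimiter).
std::vector<std::string> physical_lines(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);  // the '\n' itself is consumed, not stored
    }
    return lines;
}
```

Feeding it one log entry whose quoted text contains a newline yields two separate strings, which is why the entry has to be reassembled by hand.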
I will suggest the following approach for you.
1. Initialize a buffer representing the entire logical line just read.
2. Read the next line of input, using std::getline(), and append the line to the buffer.
3. Count the number of quote characters in the buffer.
4. Is the number of quotes even? Stop. If the quote character count is odd, append a newline to the buffer, then go back to step 2 and read another line of input.
Some obvious optimizations are possible here, of course, but this should be a good start.
Upvotes: 2