Reputation: 149
I have tried both the commented and uncommented versions of this code:
string separator1(""); //dont let quoted arguments escape themselves
string separator2(",\n"); //split on comma and newline
string separator3("\"\'"); //let it have quoted arguments
escaped_list_separator<char> els(separator1, separator2, separator3);
tokenizer<escaped_list_separator<char>> tok(str);//, els);
for (tokenizer<escaped_list_separator<char>>::iterator beg = tok.begin();beg!= tok.end(); ++beg) {
next = *beg;
boost::trim(next);
cout << counter << " " << next << endl;
counter++;
}
to separate a file which has the following format:
12345, Test Test, Test
98765, Test2 test2, Test2
This is the output:
0 12345
1 Test Test
2 Test
98765
3 Test2 test2
4 Test2
I am not sure where the problem is, but what I need is for the number 3 to appear before 98765.
Upvotes: 3
Views: 1326
Reputation: 392911
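It looks to me like you are parsing, not splitting.
Using a parser generator would be superior, IMO:
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
int main() {
    // Read std::cin without skipping whitespace so that newlines reach the parser
    boost::spirit::istream_iterator f(std::cin >> std::noskipws), l;
    std::vector<std::string> columns;
    // A field is one or more characters other than ',', '\r' or '\n';
    // fields are separated by an end-of-line or a comma
    qi::parse(f, l, +~qi::char_(",\r\n") % (qi::eol | ','), columns);
    size_t n = 0;
    for (auto& tok : columns) { std::cout << n++ << "\t" << tok << "\n"; }
}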
Prints
0 12345
1 Test Test
2 Test
3 98765
4 Test2 test2
5 Test2
Frankly, I think it's superior because it will allow you to write
phrase_parse(f, l, (qi::int_ >> *(',' >> +~qi::char_("\r\n,"))) % qi::eol, qi::blank...);
and get proper parsing of the data types, whitespace skipping, etc. for "free".
Upvotes: 0
Reputation: 3305
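You forgot the newline separator. With the els argument commented out, tokenizer uses a default-constructed escaped_list_separator, which splits on commas only, so "Test\n98765" comes out as a single token. Put the newline in the separator list, string separator2(",\n");, and pass els to the tokenizer: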
#include <iostream>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>
using namespace std;
using namespace boost;
int main() {
string str = "TEst,hola\nhola";
string separator1(""); //dont let quoted arguments escape themselves
string separator2(",\n"); //split on comma and newline
string separator3("\""); //let it have quoted arguments
escaped_list_separator<char> els(separator1, separator2, separator3);
tokenizer<escaped_list_separator<char>> tok(str, els);
int counter = 0;
string next;
for (tokenizer<escaped_list_separator<char>>::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
next = *beg;
boost::trim(next);
cout << counter << " " << next << endl;
counter++;
}
return 0;
}
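For comparison, the same comma-and-newline split can be sketched without Boost. This is only an illustration under stated assumptions: split_fields is a hypothetical helper name (not a Boost or standard function), it does no quote or escape handling (unlike escaped_list_separator), and it trims spaces, tabs and carriage returns from each field itself.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost or the standard library.
// Splits `input` on any character found in `delims` and trims spaces,
// tabs and carriage returns from both ends of every field.
// Empty fields are kept; there is no quote or escape handling.
std::vector<std::string> split_fields(const std::string& input,
                                      const std::string& delims = ",\n") {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        // Position of the next delimiter, or npos when none is left
        const std::string::size_type end = input.find_first_of(delims, start);
        std::string field = (end == std::string::npos)
                                ? input.substr(start)
                                : input.substr(start, end - start);

        // Trim surrounding whitespace
        const std::string::size_type first = field.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            field.clear();
        } else {
            const std::string::size_type last = field.find_last_not_of(" \t\r");
            field = field.substr(first, last - first + 1);
        }
        fields.push_back(field);

        if (end == std::string::npos) {
            break;
        }
        start = end + 1;  // continue after the delimiter
    }
    return fields;
}
```

Calling split_fields on the two sample lines from the question gives six fields, with 98765 at index 3. Note that a trailing newline in the input would add one empty field at the end.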
Upvotes: 2