Reputation: 2862
I am using Boost.Tokenizer to split a string. It works fine for strings like "1,2,3", but when there are two or more consecutive separators, for example "1,,3,4", it returns "1", "3", "4".
Is there a way to make the tokenizer return an empty string "" instead of skipping it?
Upvotes: 4
Views: 1437
Reputation: 51871
Boost.Tokenizer's char_separator class provides the option to output an empty token or to skip ahead with its empty_tokens parameter. It defaults to boost::drop_empty_tokens, matching the behavior of strtok(), but can be told to output empty tokens by providing boost::keep_empty_tokens.
For example, with the following program:
#include <iostream>
#include <string>

#include <boost/foreach.hpp>
#include <boost/tokenizer.hpp>

int main()
{
    std::string str = "1,,3,4";
    typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
    boost::char_separator<char> sep(
        ",",                       // dropped delimiters
        "",                        // kept delimiters
        boost::keep_empty_tokens); // empty token policy

    BOOST_FOREACH(std::string token, tokenizer(str, sep))
    {
        std::cout << "<" << token << "> ";
    }
    std::cout << std::endl;
}
The output is:
<1> <> <3> <4>
Upvotes: 6
Reputation: 8469
I assume you are using the split function from Boost's string algorithms, as below:
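// Assumes <boost/algorithm/string.hpp> and <boost/foreach.hpp> are included,
// with using-directives for namespaces std and boost.
string text = "1,,3,4";
list<string> tokenList;
split(tokenList, text, is_any_of(","));
BOOST_FOREACH(string t, tokenList)
{
    cout << t << "." << endl;
}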
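If you look carefully at the split prototype, you will notice the optional token-compression parameter at the end. It defaults to token_compress_off, which keeps empty tokens; token_compress_on merges adjacent separators and so drops them.
So pass an explicit token_compress_off as the last parameter to make the intent clear, and it will be OK: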
split(tokenList, text, is_any_of(","), token_compress_off);
Upvotes: 4