Reputation: 861
I am trying to split a string and put the tokens into a vector.
However, I also want to keep an empty token whenever there are consecutive delimiters.
For example:
string mystring = "::aa;;bb;cc;;c";
I would like to tokenize this string on the delimiters : and ; but wherever two delimiters are adjacent, such as :: and ;;, I would like to push an empty string into my vector.
So my desired output for this string is:
"" (empty)
aa
"" (empty)
bb
cc
"" (empty)
c
Also, my requirement is not to use the Boost library.
Could anyone give me an idea?
Thanks.
Here is code that tokenizes a string but does not include the empty tokens:
void Tokenize(const string& str, vector<string>& tokens, const string& delimiters)
{
    // Skip delimiters at beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the start of the token.
    string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (string::npos != pos || string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters. Note the "not_of"
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter.
        pos = str.find_first_of(delimiters, lastPos);
    }
}
Upvotes: 7
Views: 2371
Reputation: 48675
I have a version using iterators:
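#include <algorithm>
#include <string>
#include <vector>

std::vector<std::string> split_from(const std::string& s
    , const std::string& d, unsigned r = 20)
{
    std::vector<std::string> v;
    v.reserve(r);
    auto pos = s.begin();
    auto end = pos;
    while(end != s.end())
    {
        // First delimiter at or after pos, or s.end() if none is left
        end = std::find_first_of(pos, s.end(), d.begin(), d.end());
        v.emplace_back(pos, end);
        // Step over the delimiter, but never advance past s.end()
        if(end != s.end())
            pos = end + 1;
    }
    return v;
}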
Using your interface:
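void Tokenize(const std::string& s, std::vector<std::string>& tokens
    , const std::string& delims)
{
    auto pos = s.begin();
    auto end = pos;
    while(end != s.end())
    {
        // First delimiter at or after pos, or s.end() if none is left
        end = std::find_first_of(pos, s.end(), delims.begin(), delims.end());
        tokens.emplace_back(pos, end);
        // Step over the delimiter, but never advance past s.end()
        if(end != s.end())
            pos = end + 1;
    }
}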
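To check the iterator approach against the sample string from the question, here is a minimal self-contained sketch. The name split_keep_empty is hypothetical, and the loop is arranged slightly differently from split_from above: it always stores the piece before testing for the end, so an empty input yields one empty token rather than none. Treat it as an illustration, not a drop-in replacement.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: split s on any character in delims, keeping empty tokens.
std::vector<std::string> split_keep_empty(const std::string& s,
                                          const std::string& delims)
{
    std::vector<std::string> out;
    auto pos = s.begin();
    while (true)
    {
        // First delimiter at or after pos, or s.end() if none is left.
        auto end = std::find_first_of(pos, s.end(), delims.begin(), delims.end());
        out.emplace_back(pos, end);   // empty string when pos == end
        if (end == s.end())
            break;                    // last piece stored; do not step past the end
        pos = end + 1;                // continue just after the delimiter
    }
    return out;
}
```

Calling split_keep_empty("::aa;;bb;cc;;c", ":;") gives eight tokens: two empty strings for the leading ::, then aa, an empty string, bb, cc, an empty string, and c.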
Upvotes: 2
Reputation: 65770
You can make your algorithm work with some simple changes. First, don't skip delimiters at the beginning. Then, instead of skipping delimiters in the middle of the string, just increment the position by one. Also, your npos check should ensure that both positions are not npos, so it should be && instead of ||.
void Tokenize(const string& str, vector<string>& tokens, const string& delimiters)
{
    // Start at the beginning
    string::size_type lastPos = 0;
    // Find position of the first delimiter
    string::size_type pos = str.find_first_of(delimiters, lastPos);
    // While we still have string to read
    while (string::npos != pos && string::npos != lastPos)
    {
        // Found a token (possibly empty), add it to the vector
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Step past this one delimiter instead of skipping a whole run of them
        lastPos = pos + 1;
        // Find the position of the next delimiter
        pos = str.find_first_of(delimiters, lastPos);
    }
    // Push the last token: everything after the final delimiter
    tokens.push_back(str.substr(lastPos));
}
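One thing to be aware of: on the sample string this Tokenize produces eight tokens, because the leading :: yields two empty strings (one before the first : and one between the two), while the output listed in the question has seven. If the intent is an empty token only between adjacent delimiters, and none for a delimiter at the very start or end of the input, a variant along these lines might work. That reading of the question is an assumption, and the name tokenize_inner_empty is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical variant: keeps an empty token between adjacent delimiters, but
// (assumption) emits none for a delimiter at the very start or end of str.
std::vector<std::string> tokenize_inner_empty(const std::string& str,
                                              const std::string& delimiters)
{
    std::vector<std::string> tokens;
    std::string::size_type lastPos = 0;
    std::string::size_type pos;
    while ((pos = str.find_first_of(delimiters, lastPos)) != std::string::npos)
    {
        // Skip only the empty piece in front of a leading delimiter.
        if (pos != 0)
            tokens.push_back(str.substr(lastPos, pos - lastPos));
        lastPos = pos + 1;
    }
    // Skip only the empty piece after a trailing delimiter (or an empty input).
    if (lastPos < str.size())
        tokens.push_back(str.substr(lastPos));
    return tokens;
}
```

With "::aa;;bb;cc;;c" and the delimiters ":;" this returns an empty string, aa, an empty string, bb, cc, an empty string, and c, which is the list given in the question.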
Upvotes: 5