Loki

Reputation: 77

C++ splitting string by delimiters and keeping the delimiters in result

I'm looking for a way to split a string by multiple delimiters using regex in C++ without losing the delimiters in the output. The delimiters should be kept, in order, alongside the split parts. For example:

Input

aaa,bbb.ccc,ddd-eee;

Output

aaa , bbb . ccc , ddd - eee ;

I've found some solutions for this, but all of them are in C# or Java. I'm looking for a C++ solution, preferably without using Boost.

Upvotes: 4

Views: 5965

Answers (2)

Michael Urman

Reputation: 15905

You could build your solution on top of the example for regex_iterator. If, for example, you know your delimiters are comma, period, semicolon, and hyphen, you could use a regex that captures either a delimiter or a series of non-delimiters:

([.,;-]|[^.,;-]+)

Drop that into the sample code and you end up with something like this:

#include <iostream>
#include <string>
#include <regex>

int main ()
{
  // the following two lines are edited; the remainder are directly from the reference.
  std::string s ("aaa,bbb.ccc,ddd-eee;");
  std::regex e ("([.,;-]|[^.,;-]+)");   // matches delimiters or consecutive non-delimiters

  std::regex_iterator<std::string::iterator> rit ( s.begin(), s.end(), e );
  std::regex_iterator<std::string::iterator> rend;

  while (rit!=rend) {
    std::cout << rit->str() << std::endl;
    ++rit;
  }

  return 0;
}

Try substituting in any other regular expressions you like.
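If the pieces are needed in a container rather than printed, the same loop can be wrapped in a small function. The following is only a sketch: the name split_keep_delims is hypothetical and not part of the standard library, and the caller is assumed to pass a regex that matches either one delimiter or a run of non-delimiters, as above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not a standard function): collects every match of
// token_re in text, in order. With a pattern such as "[.,;-]|[^.,;-]+"
// each match is either a single delimiter or a run of non-delimiters.
std::vector<std::string> split_keep_delims(const std::string& text,
                                           const std::regex& token_re)
{
    std::vector<std::string> parts;
    // A default-constructed sregex_iterator is the end-of-sequence iterator.
    for (std::sregex_iterator it(text.begin(), text.end(), token_re), end;
         it != end; ++it) {
        parts.push_back(it->str());   // text of the whole match
    }
    return parts;
}
```

Calling split_keep_delims("aaa,bbb.ccc,ddd-eee;", std::regex("[.,;-]|[^.,;-]+")) should yield the ten pieces aaa , bbb . ccc , ddd - eee ; in that order.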

Upvotes: 12

Avinash Raj

Reputation: 174874

For your case, splitting the input string at every word boundary \b, except the one at the start of the string, will give you the desired output.

(?!^)\b


OR

(?<=\W)(?!$)|(?!^)(?=\W)


  • (?<=\W)(?!$) matches the position immediately after a non-word character, except at the end of the string.

  • | means OR.

  • (?!^)(?=\W) matches the position immediately before a non-word character, except at the start of the string.

In a C++ string literal, escape each backslash one more time (for example "\\b"), or use a raw string literal.
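One caveat for standard C++: the default ECMAScript grammar of std::regex has no lookbehind, so the (?<=\W) variant needs a different engine. Splitting at every interior word boundary produces the maximal runs of word characters and of non-word characters, so a match-based pattern can give the same pieces without zero-width splitting. The sketch below swaps in that technique. The helper name split_at_word_boundaries is hypothetical, and \w+|\W+ is an assumed equivalent of the boundary split, not the pattern from the answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the maximal runs of word characters (\w+)
// and of non-word characters (\W+) in text, in order. These are the same
// pieces that a split at each interior \b would produce.
std::vector<std::string> split_at_word_boundaries(const std::string& text)
{
    // Raw string literal, so the backslashes need no extra escaping.
    static const std::regex run_re(R"(\w+|\W+)");
    std::vector<std::string> parts;
    for (std::sregex_iterator it(text.begin(), text.end(), run_re), end;
         it != end; ++it) {
        parts.push_back(it->str());
    }
    return parts;
}
```

As with \b, adjacent delimiters stay together: an input such as a,,b comes back as a followed by ,, followed by b, not as four pieces.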

Upvotes: 2
