roschach

Reputation: 9336

Avoid empty elements in match when optional substrings are not present

I am trying to create a regex that matches the strings returned by the diff terminal command.

These strings start with a decimal number, optionally followed by a comma and another number, then a mandatory character (a, c, or d), then another mandatory decimal number, optionally followed by a comma and a number as before.

Examples:

27a27
27a27,30
28c28
28,30c29,31
1d1
1,10d1

I am trying to extract all the groups separately, with the optional ones captured without the comma.

I am doing this in C++:

#include <iostream>
#include <regex>
#include <string>

int main(int argc, char* argv[])
{
  std::string t = "47a46";
  std::string result;
  std::regex re2("(\\d+)(?:,(\\d+))?([acd])(\\d+)(?:,(\\d+))?");
  std::smatch match;

  // Search once and reuse the result below.
  bool found = std::regex_search(t, match, re2);
  std::cout << match.size() << std::endl;   // full match plus capture groups
  std::cout << match.str(0) << std::endl;   // the full match

  if (found)
  {
      // size_t avoids a signed/unsigned comparison with match.size()
      for (std::size_t i = 1; i < match.size(); i++)
      {
          result = match.str(i);
          std::cout << i << ":" << result << " ";
      }
      std::cout << std::endl;
  }

  return 0;
}

The variable t holds the string I want to parse. My regular expression

(\\d+)(?:,(\\d+))?([acd])(\\d+)(?:,(\\d+))?

works, but for strings that lack the optional substrings (such as 47a46), the match variable contains empty elements at the positions of the missing substrings.

For example, in the program above the elements of match (preceded by their index) are:

1:47 2: 3:a 4:46 5: 

Elements 2 and 5 correspond to the optional substrings, which are not present in this case, so I would like match to omit them and give:

1:47 2:a 3:46 

How can I do it?

Upvotes: 0

Views: 145

Answers (1)

Dmitry

Reputation: 1293

I think the best RE for you would be like this:

std::regex re2(R"((\d+)(?:,\d+)?([a-z])(\d+)(?:,\d+)?)");

This way only the required groups are captured. The optional parts are still matched, but they sit in non-capturing groups, so they never appear in match.

Output:

4
47a46
1:47 2:a 3:46 

Note: the argument to re2 is written as a C++11 raw string literal, so the backslashes do not need to be doubled.
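To make the point about the C++11 notation concrete, here is a small sketch that writes the same pattern both ways. The function names raw_pattern and escaped_pattern are made up for illustration and are not part of the answer; both should return the same string.

```cpp
#include <string>

// Hypothetical helpers, for illustration only.

// Raw string literal: everything between R"( and )" is taken verbatim,
// so \d is written with a single backslash.
std::string raw_pattern()
{
    return R"((\d+)(?:,\d+)?([a-z])(\d+)(?:,\d+)?)";
}

// Ordinary string literal: each backslash has to be escaped as \\.
std::string escaped_pattern()
{
    return "(\\d+)(?:,\\d+)?([a-z])(\\d+)(?:,\\d+)?";
}
```

Either string can be passed to the std::regex constructor; the raw form is just easier to read.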

EDIT: simplified RE a bit
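One caveat: with non-capturing groups the optional numbers are dropped even when they are present (28,30c29,31 gives only 28, c, 29). If you want to keep them when they exist and skip them when they don't, a possible alternative is to keep the five-group pattern from the question and test the matched flag of each sub-match. This is only a sketch; the helper name matched_groups is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns only the
// capture groups that actually took part in the match, in order.
std::vector<std::string> matched_groups(const std::string& line)
{
    // Same pattern as in the question, written as a raw string literal.
    static const std::regex re(R"((\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?)");

    std::vector<std::string> groups;
    std::smatch match;
    if (std::regex_search(line, match, re))
    {
        for (std::size_t i = 1; i < match.size(); ++i)
        {
            // sub_match::matched is false for an optional group that was skipped.
            if (match[i].matched)
                groups.push_back(match[i].str());
        }
    }
    return groups;
}
```

For 47a46 this yields 47, a, 46, and for 28,30c29,31 it yields all five pieces. Note that once the empty slots are removed the position no longer tells you which optional number you got (1,10d1 and 27a27,30 both give four elements), so you may have to locate the letter first if that matters.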

Upvotes: 1
