wrongElk

Reputation: 91

C++11 Regex submatches

I have the following code to extract the left and right parts from a string of the form

[3->1],[2->2],[5->3]

My code looks like the following

#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main()
{
    regex expr("([[:d:]]+)->([[:d:]]+)"); 
    string input = "[3->1],[2->2],[5->3]";

    const std::sregex_token_iterator end;
    int submatches[] = { 1, 2 };
    string left, right;

    for (std::sregex_token_iterator itr(input.begin(), input.end(), expr, submatches); itr != end;)
    {
        left    = ((*itr).str()); ++itr;
        right   = ((*itr).str()); ++itr;

        cout << left << "      " << right << endl;
    }
}

Output will be

3      1
2      2
5      3

Now I am trying to extend it so that the first part can be a string instead of a number. For example, the input will be

[(3),(5),(0,1)->2],[(32,2)->6],[(27),(61,11)->1]

And I need to split it as

(3),(5),(0,1)    2
(32,2)           6
(27),(61,11)     1

The basic expressions that I tried, such as "(\\(.*+)->([[:d:]]+)", just split the entire string in two, as follows:

(3),(5),(0,1)->2],[(32,2)->6],[(27),(61,11)      1

Can somebody give me some suggestions on how to achieve this? Appreciate all the help.

Upvotes: 2

Views: 587

Answers (2)

user

Reputation: 675

You need to match everything after the opening '[' up to, but not including, "->". It is like writing a regex for a multiline comment /* ... */: the closing "*/" has to be excluded from the body, or else the regex gets greedy and eats everything up to the last one, which is what is happening in your case with "->". You can't really use .* here, because the dot matches any character and the greedy quantifier consumes as much as it can.

This works for me:

\\[([^-\\]]+)->([0-9]+)\\]

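The '^' at the start of [...] negates the character class, so it accepts every character except '-' (which keeps the match from running past "->") and ']'. The backslashes are doubled because the pattern is written as a C++ string literal.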
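
As a minimal sketch of how that pattern could be dropped into the loop from the question: the helper name split_pairs and the choice to return a vector of pairs are assumptions made for illustration, not part of the original code.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (name and return type are illustrative assumptions):
// collects every "[left->right]" group of the input as a (left, right) pair.
std::vector<std::pair<std::string, std::string>> split_pairs(const std::string& input)
{
    // [^-\]]+ accepts anything except '-' and ']', so the left part
    // stops at "->" and cannot spill into the next bracket group.
    static const std::regex expr("\\[([^-\\]]+)->([0-9]+)\\]");
    const int submatches[] = { 1, 2 };

    std::vector<std::pair<std::string, std::string>> result;
    const std::sregex_token_iterator end;
    for (std::sregex_token_iterator itr(input.begin(), input.end(), expr, submatches); itr != end;)
    {
        // The iterator yields submatch 1, then submatch 2, for each match.
        std::string left  = itr->str(); ++itr;
        std::string right = itr->str(); ++itr;
        result.emplace_back(left, right);
    }
    return result;
}
```

Printing left and right for each returned pair gives the three rows asked for in the question. Because '-' is excluded outright, this sketch would not match a left part that itself contains a hyphen.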

Upvotes: 2

Thomas Ayoub

Reputation: 29441

What you need is to make it a bit more specific:

\[([^]]*)->([^]]*)\]

This avoids capturing too much. See live demo.

You could have used the .*? pattern instead of [^]]*, but it would have been less efficient.
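
To compare the two variants side by side in C++, here is a rough sketch. The helper extract_pairs and the two pattern constants are hypothetical names introduced for illustration. One assumption to flag: the ']' inside the character class is escaped here, because the ECMAScript grammar that std::regex uses by default may treat an unescaped ']' right after "[^" as closing the class.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Pairs = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: runs `pattern` over `input` and returns
// capture groups 1 and 2 of every match.
Pairs extract_pairs(const std::string& input, const std::string& pattern)
{
    const std::regex expr(pattern);
    Pairs result;
    for (std::sregex_iterator it(input.begin(), input.end(), expr), end; it != end; ++it)
    {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}

// Negated class: neither group can cross a ']' (escaped, see the note above).
const std::string negated_class = R"(\[([^\]]*)->([^\]]*)\])";

// Lazy dot: each group grows one character at a time until the rest matches.
const std::string lazy_dot = R"(\[(.*?)->(.*?)\])";
```

On the input from the question both patterns yield the same three pairs. They differ on malformed input: for "[abc],[d->1]" the lazy version has nothing stopping it at the first ']' and captures "abc],[d" as the left part, while the negated class skips the group without "->" and returns only ("d", "1").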

Upvotes: 2
