Reputation: 3585
Basic regex question.
By default, regular expressions are greedy, it seems. For example, take the code below:
#include <iostream>
#include <regex>
#include <string>

int main() {
    const std::string t = "*1 abc";
    std::smatch match;
    // Two lazy capture groups: the digits after '*', then the rest of the line.
    std::regex rgxx("\\*(\\d+?)\\s+(.+?)$");
    bool matched1 = std::regex_search(t.begin(), t.end(), match, rgxx);
    std::cout << "Matched size " << match.size() << std::endl;
    // Print every entry of the match result, including index 0.
    for (std::size_t i = 0; i < match.size(); ++i) {
        std::cout << i << " match " << match[i] << std::endl;
    }
}
This will produce an output of:
Matched size 3
0 match *1 abc
1 match 1
2 match abc
As a general regular expression writer, I would have expected only
1 match 1
2 match abc
to be printed. I think the first entry appears because of regex greediness. How can I avoid it?
Upvotes: 0
Views: 138
Reputation: 63117
You only have one match. That match has 2 "marked subexpressions", because that's what the regex specifies. You don't have multiple matches of that regex.
From std::regex_search:
m.size(): number of marked subexpressions plus 1, that is, 1 + rgxx.mark_count()
If you are looking for multiple matches, use std::regex_iterator.
Upvotes: 0
Reputation: 60493
From std::regex_search: match[0] is not the result of greedy evaluation, but is the range of the entire match. The match elements [1, n) are the capture groups.
Here's an illustration of what the match results mean:
regex "hello ([\\w]+)"
string = "Oh, hello John!"
match[0] = "hello John" // matches the whole regex above
match[1] = "John" // the first capture group
Upvotes: 1