PowerGamer

Reputation: 2136

Is the zeroth sub-match always "matched" when regex_search returns true?

Here are some quotes from the C++11 standard:

28.11.3 regex_search [re.alg.search]

m is an argument of regex_search of type match_results.

2 Effects: Determines whether there is some sub-sequence within [first,last) that matches the regular expression e. The parameter flags is used to control how the expression is matched against the character sequence. Returns true if such a sequence exists, false otherwise.

3 Postconditions: m.ready() == true in all cases. If the function returns false, then the effect on parameter m is unspecified except that m.size() returns 0 and m.empty() returns true. Otherwise the effects on parameter m are given in Table 143.

Table 143 states the following about m[0].matched:

true if a match was found, and false otherwise.

The above seems to imply that it is possible for regex_search to return true and at the same time m[0].matched to be false. Can someone please provide an example (of regex pattern and text to match) that shows when it is possible?

In other words, for what values of text and re will the following program run without either assertion failing?

#include <regex>
#include <cassert>
int main()
{
    char re[] = ""; // what kind of regular expression must it be?
    char text[] = ""; // what kind of input text must it be?
    std::cmatch m;
    assert(std::regex_search(text, m, std::regex(re)) == true);
    assert(m[0].matched == false);
}

Upvotes: 5

Views: 792

Answers (2)

Yakk - Adam Nevraumont

Reputation: 275878

Table 143 leaks extra information.

If a match was not found, then m.size() is zero, and hence m[0] returns the unmatched sub-expression (as 0 >= m.size()), in which case m[0].matched is false.

If a match was found, then m.size() is non-zero, and m[0] is the entire match, so m[0].matched is true. If m.size() is greater than 1, then m[i] for 0 < i < m.size() are the marked sub-expressions (capture groups) of your regular expression.

Had they stated that m[0].matched is always true, the reference to Table 143 would still be correct (the reference only applies when there is a match), but it would be overly confusing.

If you examine re.results (28.10/4) you'll see that, unlike most containers, accessing [] at or beyond .size() is valid on a match_results object.
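To make the out-of-range behaviour concrete, here is a minimal sketch. The helper name matched_at is made up for illustration, and it deliberately throws when the search fails so that operator[] is only exercised after a successful match.

```cpp
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the standard library): searches `text`
// for `pattern` and returns m[index].matched. `index` may be at or beyond
// m.size(). Throws if the search fails, so operator[] is only used on a
// match_results object that describes a successful match.
bool matched_at(const std::string& text, const std::string& pattern, std::size_t index)
{
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re))
        throw std::runtime_error("no match");
    return m[index].matched;
}
```

With the pattern "(b)" on "abc", m.size() is 2, so indices 0 and 1 report true, while index 2 or 42 should report false rather than being an error. With "a(x)?" on "abc", index 1 is in range but still reports false, because the optional group did not take part in the match.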

Upvotes: 1

Praetorian

Reputation: 109279

You're misreading the postconditions because the C++11 standard (N3337) contains redundant wording in that section.

If regex_search returns false, meaning a match was not found anywhere within the input string, then the state of the match_results object is unspecified, except for the member functions match_results::ready, which returns true, match_results::size, which returns 0, and match_results::empty, which returns true.

The result of match_results::operator[] is unspecified in that case, and you should not be calling it.
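As a rough illustration of the failure case, the sketch below records only the three members whose values are guaranteed after a search and never touches operator[]. The struct SearchState and the function state_after_search are hypothetical names introduced for this example.

```cpp
#include <cstddef>
#include <regex>

// Hypothetical snapshot of the members whose values are specified
// regardless of whether the search succeeded.
struct SearchState {
    bool found;        // value returned by regex_search
    bool ready;        // m.ready()
    bool empty;        // m.empty()
    std::size_t size;  // m.size()
};

SearchState state_after_search(const char* text, const char* pattern)
{
    const std::regex re(pattern);
    std::cmatch m;
    SearchState s;
    s.found = std::regex_search(text, m, re);
    s.ready = m.ready();
    s.empty = m.empty();
    s.size = m.size();
    return s;
}
```

Searching "abc" for "z" should give found == false, ready == true, empty == true and size == 0. Searching "abc" for "(b)(c)" should give found == true and size == 3 (the whole match plus two groups).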

On the other hand, if regex_search returns true, that means a match was found, in which case m[0].matched will always be true. There is no case where it can be false in this situation.
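A small sketch of the successful case, assuming the default ECMAScript grammar. SearchOutcome and search_outcome are hypothetical names used only here. It shows that even a zero-length match, such as an empty pattern on empty text as in the question's program, still leaves m[0].matched set to true.

```cpp
#include <cstddef>
#include <regex>

// Hypothetical result type, for illustration only.
struct SearchOutcome {
    bool found;                 // value returned by regex_search
    bool zeroth_matched;        // m[0].matched, read only when found is true
    std::size_t zeroth_length;  // length of m[0], read only when found is true
};

SearchOutcome search_outcome(const char* text, const char* pattern)
{
    const std::regex re(pattern);
    std::cmatch m;
    SearchOutcome out = {false, false, 0};
    out.found = std::regex_search(text, m, re);
    if (out.found) {
        // Only inspect m[0] after a successful search.
        out.zeroth_matched = m[0].matched;
        out.zeroth_length = static_cast<std::size_t>(m[0].length());
    }
    return out;
}
```

For search_outcome("", "") and search_outcome("abc", "x*") the match is empty (length 0), yet zeroth_matched is true in both cases, so the second assert in the question's program fails for these inputs.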

This is clarified in the latest draft N3936, which simply states in Table 143:

m[0].matched | true

The issue report that brought about this wording change can be viewed here. Quoting from it:

There's an analogous problem in Table 143: the condition for m[0].matched is "true if a match was found, false otherwise." But Table 143 gives post-conditions for a successful match, so the condition should be simply "true".

Upvotes: 6
