Reputation: 8042
I'm trying to figure out how regex in C++ works, so I wrote this example where I try different regular expressions and see whether they match:
#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    while (true) {
        string needle;
        cin >> needle;  // reads one whitespace-delimited word
        regex regexp(needle);
        std::smatch smatch;
        string haystack = "caps.caps[0].MainFormat[0].Video.BitRateOptions = 896, 1536";
        bool match = regex_search(haystack, smatch, regexp);
        if (match) {
            cout << "Matched" << endl;
        }
        else {
            cout << "Mismatch" << endl;
        }
    }
}
Here are the results:
caps.caps[0].MainFormat[0].Video.BitRateOptions
Mismatch
(caps.caps[0].MainFormat[0].Video.BitRateOptions)
Mismatch
caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions
Matched
(caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions)
Matched
caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions=
Mismatch
(caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions=)
Mismatch
caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions =
Matched
Matched
(caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions =)
THIS ONE BREAKS THE PROCESS AND IT ENDS
caps.caps\[0]
THIS ONE BREAKS THE PROCESS AND IT ENDS
Why does
caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions =
return two matches, and why does wrapping this regex in a capture group crash the code? Based on this, I assume that when I want to match '[' or ']' I need to escape it, and that there may be other cases where a wrongly constructed regexp crashes the process. Is there any option that will handle an unescaped '[' or ']' and other invalid regexps, so that the code reports a mismatch instead of crashing? I'm using Visual Studio 2017 on Windows 10. Thanks
Upvotes: 1
Views: 299
Reputation: 74028
The first one
caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions =
returns two matches because std::cin >> needle reads only up to the first whitespace character. The part before the space is used as the first pattern (first match). The next loop iteration then reads the next "word", =, as a second pattern, which gives the second match.
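As a rough illustration of that splitting, the sketch below feeds a line through operator>> on a string stream instead of std::cin. The helper name split_like_cin is hypothetical and exists only for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` the same way repeated
// `std::cin >> needle` calls would, i.e. on whitespace.
std::vector<std::string> split_like_cin(const std::string& input) {
    std::istringstream in(input);
    std::vector<std::string> tokens;
    std::string needle;
    while (in >> needle) {  // each extraction stops at the next whitespace
        tokens.push_back(needle);
    }
    return tokens;
}
```

Calling split_like_cin with the line from the question yields two separate needles: the escaped pattern up to BitRateOptions, and a lone =. Each one is compiled and searched on its own, and both occur in the haystack.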
Similar behaviour happens with the next one
(caps\.caps\[0\]\.MainFormat\[0\]\.Video\.BitRateOptions =)
Only the first part, (... up to but excluding the first whitespace, is read. The opening parenthesis is never closed, so the regular expression is incomplete and constructing the std::regex throws an exception.
With g++, the uncaught exception terminates the program like this:
terminate called after throwing an instance of 'std::regex_error'
what(): regex_error
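If an invalid pattern should count as a mismatch rather than end the process, one possible approach is to catch std::regex_error around the construction. This is only a sketch; the function name safe_search is made up for this example, and treating a malformed pattern as "no match" is an assumption about the desired behaviour.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true only if `pattern` is a valid regular
// expression AND it is found somewhere in `haystack`.
bool safe_search(const std::string& haystack, const std::string& pattern) {
    try {
        std::regex re(pattern);  // throws std::regex_error on a malformed pattern
        return std::regex_search(haystack, re);
    } catch (const std::regex_error&) {
        return false;  // assumption: report an invalid pattern as a mismatch
    }
}
```

In the loop from the question, the call to regex_search could be replaced by safe_search(haystack, needle), so that an unbalanced parenthesis prints "Mismatch" instead of aborting. Which patterns are rejected can differ between standard library implementations.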
If you want the complete line, use std::getline instead:
while (std::getline(std::cin, needle)) {
// ...
}
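A fuller sketch of that loop could look like the following. The function name run_matcher and the stream parameters are assumptions added so the loop can be exercised without a console; the haystack is the one from the question.

```cpp
#include <iostream>
#include <regex>
#include <string>

// Hypothetical wrapper around the loop: reads one pattern per line from `in`
// and writes "Matched" or "Mismatch" to `out` for each of them.
void run_matcher(std::istream& in, std::ostream& out) {
    const std::string haystack =
        "caps.caps[0].MainFormat[0].Video.BitRateOptions = 896, 1536";
    std::string needle;
    while (std::getline(in, needle)) {  // keeps spaces inside the pattern
        std::regex regexp(needle);      // still throws on a malformed pattern
        std::smatch match;
        if (std::regex_search(haystack, match, regexp)) {
            out << "Matched\n";
        } else {
            out << "Mismatch\n";
        }
    }
}
```

Calling run_matcher(std::cin, std::cout) from main gives the interactive behaviour. With whole lines as patterns, both the plain and the parenthesised version ending in " =" are compiled in one piece, so each produces a single result.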
I cannot reproduce any abort with the final one
caps.caps\[0]
This returns a match as expected.
Upvotes: 2