Reputation: 133587
I'm trying to find an efficient way to greedily find the first match for a std::regex
without analyzing the whole input.
My specific problem is that I wrote a handmade lexer and I'm trying to provide rules to parse common literal values (e.g. a numeric value).
So suppose a simple regex, let's say:
std::regex integralRegex = std::regex("([+-]?[1-9]*[0-9]+)");
Is there a way to find the longest match starting from the beginning of input without scanning all of it? It looks like std::regex_match
tries to match the whole input while std::regex_search
forcefully finds all matches.
Maybe I'm missing a trivial overload for my purpose but I can't find an efficient solution to the problem.
Just to clarify the question: I'm not interested in stopping after first sub-match and ignore the remainder of input but for an input like "51+12*3"
I'd like something that finds the first match, 51, and then stops, ignoring whatever comes after.
Upvotes: 4
Views: 2000
Reputation: 37882
First of all, consider [+-]?[1-9]?[0-9]+
I think it does the same thing, but it should be a bit faster. Or perhaps you intend to use something like this: [+-]?[1-9][0-9]*|0
(zero without a sign, or a number not starting with zero).
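To see where the two patterns differ, here is a minimal sketch that checks whole strings against a pattern with std::regex_match. The helper name matches_whole is made up for illustration; the patterns are the ones quoted above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`
// (default ECMAScript grammar).
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}
```

With this helper, "007" and "+0" are accepted by [+-]?[1-9]?[0-9]+ but rejected by [+-]?[1-9][0-9]*|0, while "0" and "-12" are accepted by both, so the second pattern is stricter rather than equivalent.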
Secondly, C++ provides a regular expression iterator:
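#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main() {
    const std::string s = "51+12*3";
    std::regex number_regex("[+-]?[1-9]?[0-9]+");
    // Iterator positioned at the first match of number_regex in s
    auto words_begin = std::sregex_iterator(s.begin(), s.end(), number_regex);
    // A default-constructed iterator marks the end of the matches
    auto words_end = std::sregex_iterator();
    std::cout << "Found " << std::distance(words_begin, words_end) << " numbers:\n";
    // Print every match in order of appearance
    for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
        std::smatch match = *i;
        std::cout << match.str() << '\n';
    }
}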
This looks like what you need.
https://wandbox.org/permlink/tkaAfIslkWeY2poo
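If only the token at the very start of the input is wanted, as in a lexer, one possible sketch is to call std::regex_search with the std::regex_constants::match_continuous flag, which only accepts a match that begins at the first position of the range. The helper name leading_integer is an assumption made for illustration, and the pattern is the one used above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the integer literal at the very start of
// `input`, or an empty string when the input does not start with one.
std::string leading_integer(const std::string& input) {
    static const std::regex number_regex("[+-]?[1-9]?[0-9]+");
    std::smatch match;
    // match_continuous anchors the match at input.begin(), so the search
    // does not go looking for a number further into the input.
    if (std::regex_search(input.begin(), input.end(), match, number_regex,
                          std::regex_constants::match_continuous)) {
        return match.str();
    }
    return std::string();
}
```

For "51+12*3" this returns "51", and for an input such as "abc51" it returns an empty string, because the number does not start at the beginning. A lexer could call it again on the remaining range after consuming each token.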
Upvotes: 2