Reputation: 2451
I'm currently trying to make a regex which matches URL parameters and extracts them.
For example, if I got the following parameters string ?param1=someValue¶m2=someOtherValue
, std::regex_match
should extract the following contents:
param1
some_content
param2
some_other_content
After trying different regex patterns, I finally built one corresponding to what I want: std::regex("(?:[\\?&]([^=&]+)=([^=&]+))*")
.
If I take the previous example, std::regex_match
matches as expected. However, it does not extract the expected values, keeping only the last captured values.
For example, the following code:
std::regex paramsRegex("(?:[\\?&]([^=&]+)=([^=&]+))*");
std::string arg = "?param1=someValue¶m2=someOtherValue";
std::smatch sm;
std::regex_match(arg, sm, paramsRegex);
for (const auto &match : sm)
std::cout << match << std::endl;
will give the following output:
param2
someOtherValue
As you can see, param1 and its value are skipped and not captured.
After searching on google, I've found that this is due to greedy capture and I have modified my regex into "(?:[\\?&]([^=&]+)=([^=&]+))\\*?"
in order to enable non-greedy capturing.
This regex works well when I try it on rubular but it does not match when I use it in C++ (std::regex_match
returns false and nothing is captured).
I've tried different std::regex_constants
options (different regex grammar by using std::regex_constants::grep
, std::regex_constants::egrep
, ...) but the result is the same.
Does someone know how to do non-greedy regex capture in C++?
Upvotes: 1
Views: 3440
Reputation: 877
Try to use match_results::prefix/suffix:
string match_expression("your expression");
smatch result;
regex fnd(match_expression, regex_constants::icase);
while (regex_search(in_str, result, fnd, std::regex_constants::match_any))
{
for (size_t i = 1; i < result.size(); i++)
{
std::cout << result[i].str();
}
in_str = result.suffix();
}
Upvotes: 0
Reputation: 2451
As Casimir et Hippolyte explained in his comment, I just need to:
std::regex_iterator
It gives me the following code:
std::regex paramsRegex("[\\?&]([^=]+)=([^&]+)");
std::string url_params = "?key1=val1&key2=val2&key3=val3&key4=val4";
std::smatch sm;
auto params_it = std::sregex_iterator(url_params.cbegin(), url_params.cend(), paramsRegex);
auto params_end = std::sregex_iterator();
while (params_it != params_end) {
auto param = params_it->str();
std::regex_match(param, sm, paramsRegex);
for (const auto &s : sm)
std::cout << s << std::endl;
++params_it;
}
And here is the output:
?key1=val1
key1
val1
&key2=val2
key2
val2
&key3=val3
key3
val3
&key4=val4
key4
val4
The orignal regex (?:[\\?&]([^=&]+)=([^=&]+))*
was just changed into [\\?&]([^=]+)=([^&]+)
.
Then, by using std::sregex_iterator
, I get an iterator on each matching groups (?key1=val1
, &key2=val2
, ...).
Finally, by calling std::regex_match
on each sub-string, I can retrieve parameters values.
Upvotes: 4