Jean-Denis Muys

Reputation: 6842

collecting pattern elements in C++

I need to collect the elements of a string that match some pattern. For example, consider the following URI fragment:

std::string uri = "/api/customer/123/order/456/total";

That is supposed to be matched by the following pattern:

std::string pattern = "/api/customer/:customerNum:/order/:orderNum:/total";

When analyzing that pattern, I want to collect the "variables" in it, i.e. the substrings that start and end with a colon. The following snippet (adapted from "Split a string using C++11") almost does the job:

std::set<std::string> patternVariables(const std::string &uriPattern)
{
    std::regex re(":([^:]+):");            // find a word surrounded by ":"

    std::sregex_token_iterator
        first(uriPattern.begin(), uriPattern.end(), re),
        last;

    std::set<std::string> comp = {first, last};

    return comp;
}

The problem with that snippet is that it collects the variables including the ":" markers. What would be an idiomatic way to collect the variables without the colons (i.e. the \1 capture group of each match, not the whole match)? I can iterate over the regex matches manually and accumulate the captures in a loop, but I suspect there is something more elegant, similar to the {first, last} expression.

Assuming my context is clear, any comment that takes it into account is welcome too.

Upvotes: 1

Views: 164

Answers (1)

Jean-Denis Muys

Reputation: 6842

Maybe I should remove my question altogether: the class std::regex_token_iterator already anticipates this need. The idea is to pass the optional 4th argument to its constructor, which selects the sub-match to iterate over:

std::sregex_token_iterator
first ( uriPattern.begin(), uriPattern.end(), re, 1),
last;

The 1 means "I am interested in the first subexpression of each match". The default value of 0 means "I am interested in the whole matches", and -1 means "I am interested in the text between the matches".
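For completeness, here is a minimal sketch of the whole function with that change, plus a small helper to compare the three index values. The helper name tokensFor is my own invention for illustration, and making the regex a static local is just one way to be sure it outlives the iterators; neither comes from the standard library.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Collect the variable names without their colons,
// e.g. "customerNum" from ":customerNum:".
std::set<std::string> patternVariables(const std::string &uriPattern)
{
    // The regex must stay alive for as long as the iterators are used.
    static const std::regex re(":([^:]+):");

    // 4th argument = 1: iterate over the first capture group of each match.
    std::sregex_token_iterator first(uriPattern.begin(), uriPattern.end(), re, 1);
    std::sregex_token_iterator last;

    return std::set<std::string>(first, last);
}

// Hypothetical helper (illustration only): list the tokens produced for a
// given sub-match index, in the order the iterator yields them.
std::vector<std::string> tokensFor(const std::string &text, int submatch)
{
    static const std::regex re(":([^:]+):");

    std::sregex_token_iterator first(text.begin(), text.end(), re, submatch);
    std::sregex_token_iterator last;

    return std::vector<std::string>(first, last);
}
```

With the pattern from the question, patternVariables returns the set {"customerNum", "orderNum"}. tokensFor(pattern, 0) gives ":customerNum:" and ":orderNum:", tokensFor(pattern, 1) gives "customerNum" and "orderNum", and tokensFor(pattern, -1) gives "/api/customer/", "/order/" and "/total".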

(Other comments are still welcome.)

Upvotes: 1
