AlwaysLearning

Reputation: 8011

Can't get regex_search to find all matches

This is not a duplicate of this or this question, since I am using the newest g++ 6.1.

Here is a simple example I am trying:

#include <iostream>
#include <regex>
#include <string>

int main() {
   std::string data = "a,b,c,d,e,f,g";
   std::smatch m;
   std::regex_search(data, m, std::regex("(\\w)"));
   std::cout << m.size() << std::endl;
   for (auto i = 0U; i != m.size(); i++)
       std::cout << m.position(i) << " " << m[i].str() << std::endl;
   return 0;
}

This example outputs 2 as the number of matches, while I would expect 7, since each letter in data should match \w. How do I fix this?

Also, both matches point to a at the beginning of the string.

Upvotes: 1

Views: 189

Answers (2)

Wiktor Stribiżew

Reputation: 626825

Here is an excerpt from Finding All Regex Matches at regular-expressions.info:

Construct one object by calling the constructor with three parameters: a string iterator indicating the starting position of the search, a string iterator indicating the ending position of the search, and the regex object. If there are any matches to be found, the object will hold the first match when constructed. Construct another iterator object using the default constructor to get an end-of-sequence iterator. You can compare the first object to the second to determine whether there are any further matches. As long as the first object is not equal to the second, you can dereference the first object to get a match_results object.

So, you can use the following to get matches and their positions:

#include <iostream>
#include <string>
#include <regex>
using namespace std;

int main() {
    std::regex r(R"(\w)");
    std::string s("a,b,c,d,e,f,g");
    for(std::sregex_iterator i = std::sregex_iterator(s.begin(), s.end(), r);
                             i != std::sregex_iterator();
                             ++i)
    {
        std::smatch m = *i;
        std::cout << "Match value: " << m.str() << " at Position " << m.position() << '\n';
    }
    return 0;
}

See the IDEONE demo

Results:

Match value: a at Position 0
Match value: b at Position 2
Match value: c at Position 4
Match value: d at Position 6
Match value: e at Position 8
Match value: f at Position 10
Match value: g at Position 12
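
For comparison, the single regex_search call in the question reports 2 because m.size() counts the sub-matches of the first match (the whole match plus one entry per capture group), not the number of occurrences in the string. A minimal sketch of that behaviour, where first_match_submatches is a hypothetical helper name and not part of <regex>:

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the sub-matches of the FIRST match only,
// exactly as a single std::regex_search call sees them.
// Index 0 is the whole match; indices 1.. are the capture groups.
std::vector<std::string> first_match_submatches(const std::string& text,
                                                const std::regex& re) {
    std::vector<std::string> parts;
    std::smatch m;
    if (std::regex_search(text, m, re)) {
        // After a successful search, size() is 1 + the number of capture groups.
        assert(m.size() == re.mark_count() + 1);
        for (std::size_t i = 0; i != m.size(); ++i)
            parts.push_back(m[i].str());
    }
    return parts;
}
```

With the pattern "(\\w)" this yields two entries, both "a": m[0] is the whole match and m[1] is the capture group, which is why both lines of the question's output point at the start of the string. Dropping the parentheses leaves a single entry.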

The regex is better declared with a raw string literal: R"(\w)" is the pattern \w, with no need to double the backslash as in "\\w".
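
To illustrate that the two spellings denote the same pattern, here is a small sketch. The function names escaped_pattern, raw_pattern, check_equivalence and fully_matches are made up for this example:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Both literals produce the same two-character string: a backslash followed by 'w'.
inline std::string escaped_pattern() { return "\\w"; }
inline std::string raw_pattern() { return R"(\w)"; }

// Sanity check: the raw literal and the escaped literal are identical strings.
inline void check_equivalence() {
    assert(escaped_pattern() == raw_pattern());
}

// True when a regex built from `pattern` matches the whole of `text`.
bool fully_matches(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}
```

The benefit of the raw form grows with the pattern: something like R"(\d+\.\d+)" stays readable, while the escaped version needs every backslash doubled.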

Upvotes: 2

Jack

Reputation: 133577

regex_search doesn't provide any facility to scan a whole string; it just stops at the first match. Luckily, the <regex> library provides std::regex_iterator, which does the job:

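#include <iostream>
#include <regex>
#include <string>

int main() {
   std::string data = "a,b,c,d,e,f,g";
   std::regex exp("(\\w)");  // named object: must stay alive while the iterators are in use

   auto mbegin = std::sregex_iterator(data.begin(), data.end(), exp);
   auto mend = std::sregex_iterator();  // default-constructed iterator marks the end of the sequence

   for (auto it = mbegin; it != mend; ++it)
     std::cout << it->str() << std::endl;

   return 0;
}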

The only caveat is that the std::regex object must live at least as long as the iterator, since std::regex_iterator stores a pointer to it internally.
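
One way to keep the lifetimes straight is to wrap the loop in a function that takes the regex by const reference, so the iterators never outlive it. The sketch below assumes a hypothetical helper named find_all (not part of <regex>). As far as I know, current standard libraries also delete the std::regex_iterator constructor that takes a temporary std::regex, so passing std::regex("...") directly to the iterator should be rejected at compile time rather than dangle.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects (position, matched text) for every match in `text`.
std::vector<std::pair<std::size_t, std::string>>
find_all(const std::string& text, const std::regex& re) {
    std::vector<std::pair<std::size_t, std::string>> out;
    // `re` refers to an object that exists for the whole call, so it outlives
    // both iterators, which are local to this loop. `end` is default-constructed
    // and acts as the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        out.emplace_back(static_cast<std::size_t>(it->position()), it->str());
    }
    return out;
}
```

Given std::regex word(R"(\w)"); and the string from the question, find_all(data, word) should return seven pairs, from (0, "a") to (12, "g").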

Upvotes: 3
