osilverstone96

Reputation: 5

Trying to find number of regex matches in C++, returning zero

I have implemented the following code to try and count the number of matches in the given string; in this case it should be 1.

#include <iostream>
#include <string>
#include <regex>

unsigned countMatches(std::string text, std::regex expr);
    
int main()
{
    std::string phrase = "Hello world";

    std::regex pattern = std::regex("world");

    std::cout << countMatches(phrase, pattern) << std::endl;
    
    return 0;
}

unsigned countMatches(std::string text, std::regex expr)
{
    std::smatch matches;
    
    while(std::regex_search(text, matches, expr))
    text = matches.suffix().str();
   
    return matches.size();
}

However it always prints 0 and I can't see why.

Upvotes: 0

Views: 118

Answers (3)

A M

Reputation: 15267

There are dedicated functions to solve this problem with a typical one-liner.

  1. std::sregex_token_iterator. This iterates over the tokens in a std::string that are matched by a std::regex.
  2. std::distance. This takes 2 iterators and counts the hops from the first iterator to the second iterator.

We construct the first iterator from the string's range and the regex, so that it points at the first match. The end-of-sequence iterator comes from the empty default constructor (number 1), written here as the default initializer {}.

std::distance will then count the hops from the first found token to the end-of-sequence iterator. That is the overall number of found tokens. This is your result.

And with the above we get the following one-liner:

#include <iostream>
#include <string>
#include <regex>
#include <iterator>

const std::string s{ "abcabcd abcde xab a b"};
const std::regex re{ "ab" };

int main() {
    std::cout << std::distance(std::sregex_token_iterator(s.begin(), s.end(), re), {});
}
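
As a sketch, the same one-liner can be wrapped in a function shaped like the one in the question. The name countTokens is made up for this illustration and is not part of the standard library.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// countTokens is a hypothetical helper name, chosen for this sketch.
std::size_t countTokens(const std::string& text, const std::regex& expr)
{
    // The first iterator points at the first match in text.
    const std::sregex_token_iterator first(text.begin(), text.end(), expr);
    // A default-constructed iterator is the end-of-sequence marker.
    const std::sregex_token_iterator last;
    // std::distance returns a signed difference type; it is never negative here.
    return static_cast<std::size_t>(std::distance(first, last));
}
```

With the question's input, countTokens("Hello world", std::regex("world")) returns 1, and the string from the example above gives 4 for the pattern ab.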

Upvotes: 0

wohlstad

Reputation: 28094

However it always prints 0 and I can't see why.

You call regex_search in a while loop. The body of the loop (even though wrongly indented) updates text. The first iteration does find 1 match. But then you update text to be an empty string (the suffix of the match), so the search in the next iteration fails and leaves matches with a size of 0. This is the value your function returns.
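
To make this visible, here is a minimal sketch that compares matches.size() after a single successful search with its value once the question's loop has finished. Both helper names are made up for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: size() right after one successful search.
std::size_t sizeAfterOneSearch(const std::string& text, const std::regex& expr)
{
    std::smatch matches;
    std::regex_search(text, matches, expr);
    return matches.size();
}

// Hypothetical helper: runs the question's loop, then reports size().
std::size_t sizeAfterLoop(std::string text, const std::regex& expr)
{
    std::smatch matches;
    while (std::regex_search(text, matches, expr))
        text = matches.suffix().str();
    // The loop only exits after a failed search, and a failed search
    // leaves matches empty, so this is 0 whatever was found earlier.
    return matches.size();
}
```

For "Hello world" and the pattern world, sizeAfterOneSearch returns 1 while sizeAfterLoop returns 0.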

Instead you should accumulate the number of matches:

#include <iostream>
#include <string>
#include <regex>

size_t countMatches(std::string text, std::regex const & expr)
{
    std::smatch matches;
    size_t result{ 0 };

    while (std::regex_search(text, matches, expr))
    {
        result += matches.size();  // <--- accumulate number of matches
        text = matches.suffix().str();
    }

    return result;
}

int main()
{
    std::string phrase = "Hello world";
    std::cout << countMatches(phrase, std::regex("world")) << std::endl;
    std::cout << countMatches(phrase, std::regex("o")) << std::endl;
    return 0;
}

Output:

1
2

Note that I changed the return type to size_t, as this is the type returned by matches.size(). I also added const & to the expr parameter to avoid a copy.
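
One caveat worth checking, assuming the pattern may contain capture groups: matches.size() is the number of sub-matches for a single hit (the full match plus one per group), so adding it up can overcount. The sketch below contrasts that with adding one per successful search. Both helper names are made up for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: adds matches.size() per hit, as in the code above.
std::size_t sumOfSizes(std::string text, const std::regex& expr)
{
    std::smatch matches;
    std::size_t result{ 0 };
    while (std::regex_search(text, matches, expr))
    {
        result += matches.size();  // 1 + number of capture groups
        text = matches.suffix().str();
    }
    return result;
}

// Hypothetical helper: adds exactly one per successful search.
std::size_t countByIncrement(std::string text, const std::regex& expr)
{
    std::smatch matches;
    std::size_t result{ 0 };
    while (std::regex_search(text, matches, expr))
    {
        ++result;
        text = matches.suffix().str();
    }
    return result;
}
```

For "Hello world" and the pattern (o)(r), which matches once, sumOfSizes returns 3 and countByIncrement returns 1. For patterns without groups, such as o, both return the same value.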

Upvotes: 1

rturrado

Reputation: 8064

If you just want to iterate over the input string and count the number of matches, you can use std::sregex_iterator.


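#include <regex>
#include <string>

unsigned countMatches(const std::string& text, std::regex expr) {
    unsigned ret{};
    // Incrementing the iterator advances to the next match; a
    // default-constructed iterator marks the end of the sequence.
    for (auto it = std::sregex_iterator(text.begin(), text.end(), expr);
        it != std::sregex_iterator{};
        ++it, ++ret);
    return ret;
}

// For the question's input ("Hello world" and the pattern "world"), this outputs: 1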

Upvotes: 0
