Generic Name

Reputation: 1270

Match at most n characters

I'm trying to write a regular expression that matches at most 7 groups.

((X:){1,6})((:Y){1,6})

X:X:X:X:X::Y:Y             This should match
X:X:X:X:X:X::Y:Y           This should not match.

https://regex101.com/r/zxfAB7/16

Is there any way to do this? I need the capture groups $1 and $3.
I am using C++17 regex.

Upvotes: 0

Views: 200

Answers (3)

A M

Reputation: 15265

Although there is already an accepted answer, I would like to show a very simple, straightforward solution, tested with C++17, along with complete, runnable source code.

Since we are talking about at most 7 groups, we can simply list all the allowed combinations and 'or' them together. This makes for a long pattern and possibly a complex DFA, but it should work.

After finding a match, we put all the tokens into a vector and show the desired result. Please see:

#include <iostream>
#include <string>
#include <iterator>
#include <vector>
#include <regex>

std::vector<std::string> test{
    "X::Y",
    "X:X::Y",
    "X:X::Y:Y",
    "X:X:X::Y:Y",
    "X::Y:Y:Y:Y:Y",
    "X:X:X:X:X::Y:Y",
    "X:X:X:X:X:X::Y:Y"
};

const std::regex re1{ R"((((X:){1,1}(:Y){1,6})|((X:){1,2}(:Y){1,5})|((X:){1,3}(:Y){1,4})|((X:){1,4}(:Y){1,3})|((X:){1,5}(:Y){1,2})|((X:){1,6}(:Y){1,1})))" };
const std::regex re2{ R"(((X:)|(:Y)))" };

int main() {
    std::smatch sm;
    // Go through all test strings
    for (const std::string& s : test) {
        // Look for a match
        if (std::regex_match(s, sm, re1)) {
            // Show success message
            std::cout << "Match found for  -->  " << s << "\n";
            // Get all data (groups) into a vector
            std::vector<std::string> data{ std::sregex_token_iterator(s.begin(), s.end(),re2,1),  std::sregex_token_iterator() };
            // Show desired groups
            if (data.size() >= 6) {
                std::cout << "Group 1: '" << data[0] << "'   Group 6: '" << data[5] << "'\n";
            }
        }
        else {
            std::cout << "**** NO match found for  -->  " << s << "\n";
        }
    }
    return 0;
}

Upvotes: 0

The fourth bird

Reputation: 163577

If lookaheads are supported, you can use a negative lookahead to assert that there are not 8 repetitions of either X: or :Y.

To prevent an empty match, you can use a positive lookahead to check that there is at least 1 match.

Then use 2 capturing groups: the first matches X: 0+ times and the second matches :Y 0+ times.

^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$
  • ^ Start of string
  • (?= Positive lookahead, assert what is on the right is
    • (?:X:|:Y) Match either X: or :Y
  • ) Close positive lookahead
  • (?! Negative lookahead, assert not 8 times matching either X: or :Y
    • (?:(?:X:|:Y)){8}
  • ) Close negative lookahead
  • ((?:X:)*) Capture group 1 Match 0+ times X:
  • ((?::Y)*) Capture group 2 Match 0+ times :Y
  • $ End of string

Regex demo
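
Since the question targets C++17, here is a minimal sketch of how the pattern above might be used with std::regex, whose default ECMAScript grammar supports both kinds of lookahead. The helper name split_groups and its return type are made up for illustration and are not part of the answer. Note that the two runs end up in groups 1 and 2 here, not $1 and $3 as in the question's pattern.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (name and return type are assumptions, not from the answer).
// Returns the "X:" run and the ":Y" run when the input consists of
// 1 to 7 tokens in total, and std::nullopt otherwise.
std::optional<std::pair<std::string, std::string>>
split_groups(const std::string& s) {
    // Same pattern as in the answer, written as a raw string literal
    static const std::regex re{
        R"(^(?=(?:X:|:Y))(?!(?:(?:X:|:Y)){8})((?:X:)*)((?::Y)*)$)" };

    std::smatch m;
    if (!std::regex_match(s, m, re)) {
        return std::nullopt;  // empty input, 8 or more tokens, or unexpected text
    }
    // Group 1 holds the whole run of "X:", group 2 the whole run of ":Y"
    return std::make_pair(m[1].str(), m[2].str());
}
```

Called from main, split_groups("X:X:X:X:X::Y:Y") should yield the pair "X:X:X:X:X:" and ":Y:Y", while split_groups("X:X:X:X:X:X::Y:Y") should yield no value because that input has 8 tokens.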

Upvotes: 1

Dominique

Reputation: 17565

As mentioned by Ulrich, regular expressions alone might not be the best solution. I'd suggest the following:

Replace all X (occurring 1 to 6 times) with an empty string
Replace all Y (occurring 1 to 6 times) with an empty string
Use a regex to determine whether any X is still present
Use a regex to determine whether any Y is still present

If every run of X or Y is only 1 to 6 long, no X or Y will be found (return match); otherwise, return no match.
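
One possible reading of these steps in C++17 is sketched below. It is an interpretation, not the answer's own code: the helper name runs_within_limits is hypothetical, the tokens are assumed to be "X:" and ":Y" as in the question, and the two "still present" checks are folded into a single test for leftover text. Be aware that this limits each run to six tokens separately, so the combined limit of 7 from the question would still need an extra check.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (an interpretation of the steps, not from the answer).
// Returns true when nothing is left after stripping up to six leading "X:"
// tokens and up to six trailing ":Y" tokens.
bool runs_within_limits(const std::string& s) {
    static const std::regex leading_x{ R"(^(?:X:){1,6})" };
    static const std::regex trailing_y{ R"((?::Y){1,6}$)" };

    // Remove the leading run of "X:" (at most six tokens are consumed)
    std::string rest = std::regex_replace(
        s, leading_x, "", std::regex_constants::format_first_only);

    // Remove the trailing run of ":Y" (at most six tokens are consumed)
    rest = std::regex_replace(
        rest, trailing_y, "", std::regex_constants::format_first_only);

    // Leftover text means a run was longer than six tokens
    // or the input contained something other than "X:" and ":Y"
    return rest.empty();
}
```

With this sketch, an input with seven X: tokens is rejected, but "X:X:X:X:X:X::Y:Y" (6 + 2 tokens) is accepted, which illustrates why the total count has to be verified separately.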

Upvotes: 0
