Zesa Rex

Reputation: 482

Using RegEx to filter wrong Input?

Look at this example:

string str = "January 19934";

The outcome should be:

Jan 1993

I think the regex ([A-z]{3}).*([\d]{4}) is right for this case, but I do not know what to do next.

How can I extract what I am looking for using the regex? Is there a way to receive two variables, the first holding the match of the first group, ([A-z]{3}), and the second holding the match of the second group, ([\d]{4})?

Upvotes: 1

Views: 78

Answers (2)

Drako

Reputation: 768

This could work.

([A-Za-z]{3})([a-z ])+([\d]{4})

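Note that the space after a-z in the second group is important: it lets that group match the space between the month and the year. With this pattern the year ends up in group 3, not group 2.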
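A minimal sketch of how this pattern could be used with std::regex, assuming the goal is the "Jan 1993" output from the question. The helper name month_and_year is hypothetical and not part of the answer above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies the pattern from this answer and returns
// "Mon YYYY", or an empty string when nothing matches.
std::string month_and_year(const std::string& input) {
    // Raw string literal so the backslash in \d needs no extra escaping.
    static const std::regex pattern(R"(([A-Za-z]{3})([a-z ])+([\d]{4}))");
    std::smatch m;
    if (!std::regex_search(input, m, pattern)) {
        return "";
    }
    // m[2] only holds the last character consumed by ([a-z ])+, so the
    // year is read from m[3].
    return m.str(1) + " " + m.str(3);
}
```

Calling month_and_year("January 19934") should give "Jan 1993". Writing the middle part as a non-capturing group, (?:[a-z ])+, would move the year back to group 2.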

Upvotes: 2

Wiktor Stribiżew

Reputation: 626929

Your regex contains a common mistake: [A-z] matches more than just ASCII letters, because the range also covers the characters [, \, ], ^, _ and `. Also, the greedy .* first grabs the rest of the string up to its end, and backtracking then makes \d{4} match the last 4 digits (9934 here). You need a lazy quantifier with the dot: .*?.

Then, use regex_search and concatenate the two group values:
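Both points can be checked with a short sketch. The helper names join_groups and full_match are hypothetical, introduced only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: joins capture groups 1 and 2 of the first match,
// or returns an empty string when the pattern does not match.
std::string join_groups(const std::string& input, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(input, m, re)) {
        return "";
    }
    return m.str(1) + " " + m.str(2);
}

// Hypothetical helper: true when the whole of `text` matches `pattern`.
bool full_match(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}
```

With the input "January 19934", join_groups with the greedy pattern "([A-Za-z]{3}).*([0-9]{4})" should return "Jan 9934", while the lazy "([A-Za-z]{3}).*?([0-9]{4})" should return "Jan 1993". Likewise, full_match("_", "[A-z]") should be true and full_match("_", "[A-Za-z]") false.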

#include <iostream>
#include <regex>
#include <sstream>
#include <string>
using namespace std;

int main() {
    // Three letters, then as few characters as possible, then four digits
    regex r("([A-Za-z]{3}).*?([0-9]{4})");
    string s("January 19934");
    smatch match;
    stringstream res;
    if (regex_search(s, match, r)) {
        // Group 1 is the month abbreviation, group 2 is the year
        res << match.str(1) << " " << match.str(2);
    }
    cout << res.str();  // => Jan 1993
    return 0;
}

See the C++ demo

Pattern explanation:

  • ([A-Za-z]{3}) - Group 1: three ASCII letters
  • .*? - any zero or more characters other than line break characters, as few as possible
  • ([0-9]{4}) - Group 2: 4 digits
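If the two parts are wanted as separate variables, as the question asks, one option is to return them as a pair. This is only a sketch: the name split_month_year is hypothetical, and std::optional assumes a C++17 compiler.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (requires C++17 for std::optional): returns the month
// abbreviation and the year as two separate strings, or std::nullopt when
// the input does not match.
std::optional<std::pair<std::string, std::string>>
split_month_year(const std::string& input) {
    static const std::regex pattern("([A-Za-z]{3}).*?([0-9]{4})");
    std::smatch m;
    if (!std::regex_search(input, m, pattern)) {
        return std::nullopt;
    }
    // Group 1 is the three-letter month, group 2 the four-digit year.
    return std::make_pair(m.str(1), m.str(2));
}
```

After checking that auto parts = split_month_year("January 19934") holds a value, parts->first should be "Jan" and parts->second should be "1993".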

Upvotes: 3
