Reputation: 33
The following regular expression is supposed to match a date of the form YYYY-MM-DD sandwiched between two non-alphanumeric characters. It is supposed to extract only the date and not the two non-alphanumeric characters, but it does the opposite. What am I doing wrong? PS: I already tried wrapping the [^:alnum:] in a non-capturing group (?:), but it didn't work.
regex exp1("[^:alnum:]([1-9][0-9]{3}(?:-[0-9][1-9]){2})[^:alnum:]")
//or
regex exp1("[^a-zA-Z0-9]([1-9][0-9]{3}(?:-[0-9][1-9]){2})[^a-zA-Z0-9]")
You can also go to this website to try my regex without having to write C++ code for it. Copy and paste the non-POSIX bracket expression (without the quotation marks) if you choose to use the site.
#include <regex>
#include <string>
#include <iostream>
#include <vector>
#define isthirty(x) for (int i = 0; i < 3; i++) {if (days[i] == x[1]) {thirty = true;break;}}
using namespace std;
int main() {
    vector<string> words;
    string str;
    getline(cin, str);
    int N = stoi(str);
    int days[] = { 4,6,9,11 };
    regex exp1("[^a-zA-Z0-9]([1-9][0-9]{3}(?:-[0-9][1-9]){2})[^a-zA-Z0-9]");
    for (int i = 0; i < N; i++) {
        getline(cin, str);
        sregex_iterator it(str.cbegin(), str.cend(), exp1);
        sregex_iterator end;
        for (; it != end; it++) {
            words.push_back(it->str(0));
        }
    }
    regex exp2("([0-9])+");
    for (auto &it : words) {
        int dates[3] = {};
        sregex_iterator pos(it.cbegin(), it.cend(), exp2);
        sregex_iterator end;
        str = it.substr(1,10);
        for (int i = 0; pos != end; pos++, i++) {
            dates[i] = stoi(pos->str(0));
        }
        if (dates[0] > 2016 || dates[1] > 12 || dates[2] > 31) {
            continue;
        }
        bool thirty = false;
        isthirty(dates);
        if (thirty && dates[2] <= 30) {
            cout << str << "\n";
        }
        else if(dates[1] == 2) {
            if (dates[0] % 4 == 0 && dates[2] <= 29) {
                cout << str << "\n";
            }
            else if (dates[0] % 4 != 0 && dates[2] <= 28) {
                cout << str << "\n";
            }
        }
        else if (dates[2] <= 31) {
            cout << str << "\n";
        }
    }
    return 0;
}
Upvotes: 0
Views: 62
Reputation:
Try a simpler regexp:
[^0-9]([0-9]{4}-[0-9]{2}-[0-9]{2})[^0-9]
It looks for a non-digit, then the YYYY-MM-DD date, then a non-digit, and captures only the date in group 1. It should work in almost all regexp flavours.
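In case it helps, here is a minimal sketch of how that pattern could be used from C++ with std::regex, reading capture group 1 instead of the whole match. The helper name extract_dates is made up for illustration and is not part of the question's code.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every YYYY-MM-DD date in `line` that has a
// non-digit character on both sides.
std::vector<std::string> extract_dates(const std::string& line) {
    static const std::regex date_re("[^0-9]([0-9]{4}-[0-9]{2}-[0-9]{2})[^0-9]");
    std::vector<std::string> dates;
    for (std::sregex_iterator it(line.cbegin(), line.cend(), date_re), end;
         it != end; ++it) {
        // str(1) is capture group 1 (the date only);
        // str(0) would be the whole match, delimiters included.
        dates.push_back(it->str(1));
    }
    return dates;
}
```

Because the pattern requires one character on each side of the date, a date at the very start or very end of the line is not matched by this sketch.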
Upvotes: 1
Reputation: 754
In the regex you've provided, the overall regex (a.k.a. group 0) will include the two non-alphanum characters, but capture group 1 should only contain the date you're interested in. So, you could just use your regex as-is and then extract the information from group 1.
If you actually want the overall match to exclude the non-alphanumeric characters, look into using a "positive lookbehind assertion" in place of the first bracket expression and a "positive lookahead assertion" in place of the last one. Assertions look a bit like groups, but they don't include what they matched in the result. Note that C++ std::regex with its default ECMAScript grammar supports lookahead, (?=...), but not lookbehind, so in C++ reading group 1 is the practical option.
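As a rough sketch of how the two ideas combine in C++: the trailing delimiter can be a lookahead so it is not consumed, while the leading delimiter stays an ordinary non-capturing group and the date is read from group 1. The function name extract_dates_lookahead and the decision to also accept start and end of line are assumptions for illustration, not something taken from the question.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects YYYY-MM-DD dates delimited by non-alphanumeric
// characters (or by the start/end of the line).
std::vector<std::string> extract_dates_lookahead(const std::string& line) {
    static const std::regex date_re(
        "(?:^|[^a-zA-Z0-9])"            // leading delimiter: consumed, not captured
        "([0-9]{4}-[0-9]{2}-[0-9]{2})"  // group 1: the date itself
        "(?=[^a-zA-Z0-9]|$)");          // trailing delimiter: checked, not consumed
    std::vector<std::string> dates;
    for (std::sregex_iterator it(line.cbegin(), line.cend(), date_re), end;
         it != end; ++it) {
        dates.push_back(it->str(1));    // group 0 would still include the leading delimiter
    }
    return dates;
}
```

Since the lookahead leaves the trailing delimiter unconsumed, two dates separated by a single space are both found, which a pattern that consumes both delimiters would miss.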
Upvotes: 0