I am trying to write a regular expression for a string. Say the string is RBY_YBR, where _ represents an empty slot. We can repeatedly swap a letter with a _, so the result can be RRBBYY_ . Two or more groups of identical letters can be formed, or the whole string can be a single group such as RRR .
Conditions
1) Every letter must have the same letter as its left or right neighbour.
2) If there is no _, the letters must already be grouped, like RRBBYY, not RBRBYY or RBYRBY etc.
3) There can be more than one underscore _ .
With the regular expression I am trying to find out whether a given string can be turned into runs of consecutive identical letters by swapping letters with _ .
The regular expression I wrote is:
String regEx = "[A-ZA-Z_]";
But this regular expression fails for RBRB: there is no empty slot to move the letters into, and RBRB is not already grouped.
How can I write an effective regular expression to solve this?
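For context, here is a minimal sketch of what the character class above actually checks in Java. The class and method names are hypothetical and not part of the original post. A character class describes a single character, so it carries no information about neighbouring letters.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern string comes from the question.
class CharClassDemo {
    // A character class matches exactly one character: an uppercase letter or '_'.
    static final Pattern SINGLE = Pattern.compile("[A-ZA-Z_]");

    // True only if the whole input is a single allowed character.
    static boolean wholeStringMatches(String s) {
        return SINGLE.matcher(s).matches();
    }

    // True if the input contains at least one allowed character anywhere.
    static boolean containsMatch(String s) {
        return SINGLE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeStringMatches("RBRB"));
        System.out.println(containsMatch("RBRB"));
    }
}
```

With `matches()` the pattern rejects RBRB only because the string is longer than one character, and with `find()` it accepts RBRB and RRBB alike, so in neither mode does it test the grouping.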
Upvotes: 2
Views: 113
Reputation: 8347
Please take my answer with a grain of salt, since it's a bit of a "Fastest gun in the West" post.
It follows the same assumptions as Florian Albrecht's answer. (thanks)
I believe that this will solve your problem:
(([A-Za-z])(\2|_)+)+
https://regex101.com/r/7TfSVc/1
It works by capturing a letter in the second group and requiring that it is followed by one or more repetitions of that same letter or underscores.
Known bug: it does not work if the string starts with an underscore.
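A minimal sketch of how the first pattern behaves in Java, assuming the whole string has to match (so `matches()` is used rather than `find()`). The class and method names are hypothetical; only the pattern comes from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the answer.
class RunPatternDemo {
    // Group 2 captures a letter; group 3 repeats that letter or '_' one or more times.
    private static final Pattern RUNS = Pattern.compile("(([A-Za-z])(\\2|_)+)+");

    // Assumption: the entire input must match, hence matches() instead of find().
    static boolean isGrouped(String s) {
        return RUNS.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"RRBB", "RR_BB", "RBRB", "_RR"}) {
            System.out.println(s + " -> " + isGrouped(s));
        }
    }
}
```

Under that assumption RRBB and RR_BB are accepted, RBRB is rejected, and _RR is rejected because of the leading underscore.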
This one is better, though I forgot what I was doing by the end of it.
(([A-Za-z_])(\2|_)+|_+[A-Za-z]_*)+
https://regex101.com/r/7TfSVc/4
Upvotes: 0
Reputation: 2326
OK, as I understand it, a matching string must either consist only of identical characters grouped together, or contain at least one underscore.
So, RRRBBR would be invalid, while RRRRBB, RRRBBR_, and RRRBB_R_ would all be valid.
After a comment from the asker, an additional condition: every character must occur either 0 times or at least 2 times.
As far as I know, this is not possible with regular expressions, because regular expressions are finite-state machines without "storage". You would have to "store" each character found in the string to check that it does not appear again later.
I would suggest a very simple method for verifying such strings:
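public static boolean matchesMyPattern(String s) {
    // With at least one underscore, the letters are assumed to be rearrangeable.
    boolean withUnderscore = s.contains("_");
    // Occurrence count for each letter 'A'..'Z'.
    int[] found = new int[26];
    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        // Only uppercase letters and underscores are allowed.
        if (ch != '_' && (ch < 'A' || ch > 'Z')) {
            return false;
        }
        // Without an underscore, a letter must not reappear after its run has ended.
        if (ch != '_' && i > 0 && s.charAt(i - 1) != ch && found[ch - 'A'] > 0
                && !withUnderscore) {
            return false;
        }
        if (ch != '_') {
            found[ch - 'A']++;
        }
    }
    // A letter that occurs exactly once can never sit next to an equal letter.
    for (int i = 0; i < found.length; i++) {
        if (found[i] == 1) {
            return false;
        }
    }
    return true;
}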
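The same rules can also be sketched as three separate steps, which may be easier to follow than a single loop. This is only a sketch under the assumptions stated in this answer (uppercase letters and `_` only, no letter occurring exactly once, grouping required only when there is no `_`). The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; an alternative formulation, not the answer's own code.
class GroupingCheck {
    static boolean canBeGrouped(String s) {
        // Step 1: validate characters and count each letter.
        Map<Character, Integer> counts = new HashMap<>();
        boolean hasUnderscore = false;
        for (char ch : s.toCharArray()) {
            if (ch == '_') {
                hasUnderscore = true;
            } else if (ch >= 'A' && ch <= 'Z') {
                counts.merge(ch, 1, Integer::sum);
            } else {
                return false;
            }
        }
        // Step 2: a letter that occurs exactly once can never have an equal neighbour.
        if (counts.containsValue(1)) {
            return false;
        }
        // Step 3a: assumption from the answer: one underscore is enough to rearrange.
        if (hasUnderscore) {
            return true;
        }
        // Step 3b: without underscores, each letter must form exactly one run,
        // so the number of runs must equal the number of distinct letters.
        int runs = 0;
        for (int i = 0; i < s.length(); i++) {
            if (i == 0 || s.charAt(i) != s.charAt(i - 1)) {
                runs++;
            }
        }
        return runs == counts.size();
    }
}
```

For example, `GroupingCheck.canBeGrouped("RBY_YBR")` and `GroupingCheck.canBeGrouped("RRRRBB")` return true, while `GroupingCheck.canBeGrouped("RBRB")` and `GroupingCheck.canBeGrouped("RRRBBR")` return false.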
Upvotes: 1