Reputation: 191
I am weak in writing regular expressions so I'm going to need some help on the one. I need a regular expression that can validate that a string is an set of alphabets (the alphabets must be unique) delimited by comma.
Only one character and after that a comma
Examples:
A,E,R
R,A
E,R
Thanks
Upvotes: 0
Views: 72
Reputation: 70732
You can use a repeated group to validate it's a comma separated string.
^[AER](?:,[AER])*$
To not have unique characters, you would do something like:
^([AER])(?:,(?!\1)([AER])(?!.*\2))*$
Upvotes: 3
Reputation: 25297
We've had several suggestions for this regex:
^([AER],)*[AER]$
Which does indeed work. However, to match a String, it first has to back up one character because it will find that there is no ,
at the end. So we switch it for this to increase performance:
^[AER](,[AER])*$
Notice that this will match a correct String the very first time it attempts to. But also note that we don't need to worry about the ( )*
backing up at all; it will either match the first time, or it won't match the String at all. So we can further improve performance by using a possessive quantifier:
^[AER](,[AER])*+$
This will take the whole String and attempt to match it. If it fails, then it stops, saving time by not doing useless backing up.
If I were trying to ensure the String had no repeated elements, I would not use regex; it just complicates things. You end up with less-readable code (sadly, most people don't understand regex) and, oftentimes, slower code. So I would build my own validator:
public static boolean isCommaDelimitedSet(String toValidate, HashSet<Character> toMatch) {
for (int index = 0; index < toValidate.length(); index++) {
if (index % 2 == 0) {
if (!toMatch.contains(toValidate.charAt(index))) return false;
} else {
if (toValidate.charAt(index) != ',') return false;
}
}
return true;
}
This assumes that you want to be able to pass in a set of characters that are allowed. If you don't want that and have explicit chars you want to match, change the contents of the if (index % 2 == 0)
block to:
char c = toValidate.charAt(index);
if (c == 'A' || c == 'E' || c == 'R' || /* and so on */ ) return false;
Upvotes: 1
Reputation: 15141
Something like this "^([AER],)*[AER]$"
@Edit: regarding the uniqueness, if you can drop the "last character cannot be a comma" requirement (which can be checked before the regex anyway in constant time) then this should work:
"^(?:([AER],?)(?!.*\\1))*$"
This will match A,E,R,
hence you need that check before performing the regex. I do not take responsibility for the performance but since it's only 3 letters anyway...
The above is a java regex obviously, if you want a "pure one" ^(?:([AER],?)(?!.*\1))*$
@Edit2: sorry, missed one thing: this actually requires that check and then you need to add a comma at the end since otherwise it will also match A,E,E
. Kind of limited I know.
Upvotes: 2
Reputation: 56809
My own ugly but extensible solution, which will disallow leading and trailing commas, and checks that the characters are unique.
It uses forward-declared backreference: note how the second capturing group is behind the reference made to it (?!.*\2)
. On the first repetition, since the second capturing group hasn't captured anything, Java treats any attempt to reference text match by second capturing group as failure.
^([AER])(?!.*\1)(?:,(?!.*\2)([AER]))*+$
Demo on regex101 (PCRE flavor has the same behavior for this case)
Test cases:
A,E,R
A,R,E
E,R,A
A
R,E
R
E
A,
A,R,
A,A,R
E,A,E
A,E,E
X,R,E
R,A,E,
,A
AA,R,E
Upvotes: 1
Reputation: 31689
If I understand it correctly, a valid string will be a series (possibly zero long) of two-character patterns, where each pattern is a letter followed by a comma; finally followed at the end by one letter.
Thus:
"^([A-Za-z],)*[A-Za-z]$"
EDIT: Since you've clarified that the letters have to be A, E, or R:
"^([AER],)*[AER]$"
Upvotes: 2