Reputation: 67
I'm new to regex in Java and want to know how to build one that only accepts a string consisting of one or two comma-separated lists of uppercase letters, with the lists separated by a single whitespace.
I need to reject strings that start with a comma, end with a comma, or contain multiple consecutive commas.
All these would be invalid:
"D,, D"
"D D,,"
"D, ,D"
"D, ,,D"
"D,, ,D"
"D,,"
",,A"
",A"
"A,"
All these would be valid:
"D,D T,F"
"D,D T"
"A,A"
"A"
I used (\s?("[\w\s]*"|\d*)\s?(,,|$))
for consecutive commas, but it doesn't do the trick when the comma is at the end or beginning of one of the whitespace-separated substrings, like "D, ,D"
Should I aim to split by whitespace and look for a simpler regex for each of the substrings?
Upvotes: 1
Views: 3595
Reputation: 22977
That would be something like this:
^[A-Z](,[A-Z])*( [A-Z](,[A-Z])*)*$
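What happens here is the following:
^ - Start of the string.
[A-Z](,[A-Z])* - One uppercase letter, followed by zero or more repetitions of a comma and an uppercase letter. This is one comma-separated list.
( [A-Z](,[A-Z])*)* - Zero or more further lists, each preceded by a single space.
$ - End of the string.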
Test: https://regex101.com/r/kzLhtw/1
You could, of course, slightly optimize the regex by making all capturing groups non-capturing: just put ?: immediately after the (, that is, (?:.
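For reference, here is a minimal Java sketch of how the non-capturing form of this pattern could be used. The class name ListValidator and the method isValid are made up for illustration; only java.util.regex.Pattern from the standard library is used. Note that, as written, the pattern accepts any number of space-separated lists; if at most two are wanted, the final * could be changed to ?.

```java
import java.util.regex.Pattern;

public class ListValidator {
    // Non-capturing form of the pattern from this answer, compiled once and reused.
    // (ListValidator and isValid are hypothetical names for this sketch.)
    private static final Pattern LISTS =
            Pattern.compile("^[A-Z](?:,[A-Z])*(?: [A-Z](?:,[A-Z])*)*$");

    // Returns true only if the entire input matches the pattern.
    static boolean isValid(String input) {
        return LISTS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // A few of the samples from the question, valid ones first.
        String[] samples = {"D,D T,F", "D,D T", "A,A", "A", "D,, D", "D, ,D", ",A", "A,"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

With the samples from the question, the first four strings should be reported as valid and the rest as invalid.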
Upvotes: 3
Reputation: 75840
"a string that consists of one or two comma-separated lists of uppercase letters, separated by a single whitespace"
Not sure exactly how to interpret the above, but my reading is: one or two comma-separated lists, where each list may only consist of uppercase characters. In the case of two lists, the two lists are separated by a single space.
You could try:
^(?!.* .* )[A-Z](?:[ ,][A-Z])*$
See the online demo
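In Java, this pattern could be tried out roughly as follows. This is only a sketch: the class name TwoListValidator and the method isValid are invented for the example. Because of the lookahead, a string containing two or more spaces (that is, three or more lists) is rejected.

```java
import java.util.regex.Pattern;

public class TwoListValidator {
    // Pattern from this answer: the negative lookahead rejects any input with two spaces.
    // (TwoListValidator and isValid are hypothetical names for this sketch.)
    private static final Pattern LISTS =
            Pattern.compile("^(?!.* .* )[A-Z](?:[ ,][A-Z])*$");

    // Returns true only if the entire input matches the pattern.
    static boolean isValid(String input) {
        return LISTS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Samples from the question plus one string with three lists.
        String[] samples = {"D,D T,F", "A", "A B C", "D, ,D", "D D,,", "A,"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```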
^ - Start string anchor.
(?!.* .* ) - Negative lookahead to prevent two spaces being present.
[A-Z] - A single uppercase alpha-char.
(?: - Open non-capture group:
[ ,] - A comma or space.
[A-Z] - A single uppercase alpha-char.
)* - Close non-capture group and match 0+ times up to;
$ - End string anchor.
Upvotes: 1
Reputation: 163217
You might use
^[A-Z](?: [A-Z])*(?:,[A-Z](?: [A-Z])*){0,2}$
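A quick way to check this pattern against the question's samples in Java might look like the sketch below. The class name BoundedListValidator and the method isValid are placeholders chosen for the example. Keep in mind that the {0,2} quantifier caps the number of commas at two, so an input such as "A,B,C,D" is rejected while "A B C D" is accepted; whether that fits depends on how the requirement is read.

```java
import java.util.regex.Pattern;

public class BoundedListValidator {
    // Pattern from this answer: at most two comma-led groups after the first run of letters.
    // (BoundedListValidator and isValid are hypothetical names for this sketch.)
    private static final Pattern LISTS =
            Pattern.compile("^[A-Z](?: [A-Z])*(?:,[A-Z](?: [A-Z])*){0,2}$");

    // Returns true only if the entire input matches the pattern.
    static boolean isValid(String input) {
        return LISTS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Samples from the question plus two inputs that show the comma limit.
        String[] samples = {"D,D T,F", "D,D T", "A", "D,, D", "A,", "A,B,C,D", "A B C D"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```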
^ - Start of string
[A-Z] - Match a single char A-Z
(?: [A-Z])* - Optionally repeat a space and a single char A-Z
(?: - Non capture group
,[A-Z](?: [A-Z])* - Match a comma and a char A-Z, then optionally repeat a space and a char A-Z
){0,2} - Close the group and repeat 0-2 times
$ - End of string
Upvotes: 2