Reputation: 79
I can't get a simple regex to work. Right now I have the following Java code:
String regex = "^([^A-Za-z]*?[A-Z][A-Za-z]*?)+.?";
String string = "AQUA, CETEARYL ALCOHOL, CETYL ESTERS, BEHENTRIMONIUM CHLORIDE, CETRIMONIUM CHLORIDE, AMODIMETHICONE, TRIDECETH-12, PARFUM, METHYLPARABEN, HEXYL CINNAMAL, LINALOOL, BENZYL SALICYLATE, LIMONENE, LAMINARIA DIGITATA, CHAMOMILLA RECUTITA , ANICOZANTHOS FLAVIDUS, SODIUM BENZ0ATE, PHENOXYETHANOL, ETHYLPARABEN, BUTYLPARABEN, PROPYLPARABEN, P0LYS0RBATE 20, CI 19140, CI 14700.";
System.out.println(string.matches(regex));
The problem is that the execution never ends. Please use my regex only to see where I fail. What I need sounds simple to me:
- There can be any text.
- All words in this text should start with an uppercase letter.
- If there are single characters, they should be uppercase too.
- Anything in between (numbers, commas, ...) should always be matched.
See the complex sample above. Simple examples are:
Test, Test, Test = true
Test, test, Test = false
Test, 7-Test Test, Test = true
Test, 7-Test test, Test = false
na = false
NA = true
N/A = true
PHENOXYETHANOL, P0LYS0RBATE 20, CI 19140, CI 14700. = true
Thanks a lot!!!
Upvotes: 1
Views: 3941
Reputation: 33019
This seems to work on all the inputs you provided:
"^((^|[^A-Za-z]+)[A-Z][A-Za-z]*)*[^A-Za-z]*$"
I'm not sure how your validator works, but it doesn't hurt to force matching the full string by adding the ^ and $ symbols on either end.
Your regular expression effectively never terminates because you used too many * (match zero or more) quantifiers inside a repeated group, so the number of ways the engine can split the input explodes as it backtracks. Notice how I use a + on the [^A-Za-z] group, which forces at least one non-letter between consecutive word groups. This keeps the number of possible matches reasonable. And since my pattern matches the full string (it starts with ^ and ends with $), it can only find a single match anyway.
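As a quick check, the pattern can be run against the sample inputs from the question. This is only a minimal sketch: the class name UppercaseWordsCheck and the helper isValid are made up for the example. Note that Matcher.matches() (like String.matches) already requires the whole input to match, so the ^ and $ anchors are redundant here but harmless.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class UppercaseWordsCheck {
    // Pattern from the answer: a word is a run of letters that starts with an
    // uppercase letter, and words are separated by at least one non-letter.
    private static final Pattern WORDS = Pattern.compile(
            "^((^|[^A-Za-z]+)[A-Z][A-Za-z]*)*[^A-Za-z]*$");

    // Hypothetical helper: true if every word in the text starts uppercase.
    static boolean isValid(String text) {
        return WORDS.matcher(text).matches();
    }

    public static void main(String[] args) {
        // Sample inputs taken from the question.
        String[] samples = {
            "Test, Test, Test",
            "Test, test, Test",
            "Test, 7-Test Test, Test",
            "Test, 7-Test test, Test",
            "na",
            "NA",
            "N/A",
            "PHENOXYETHANOL, P0LYS0RBATE 20, CI 19140, CI 14700."
        };
        for (String sample : samples) {
            System.out.println(sample + " = " + isValid(sample));
        }
    }
}
```

For these samples the printed values should agree with the expected results listed in the question.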
Edit:
If you don't want the empty string to match, then change the second-to-last * to a +:
"^((^|[^A-Za-z]+)[A-Z][A-Za-z]*)+[^A-Za-z]*$"
Upvotes: 1
Reputation: 151
Maybe this regex works for you:
\p{Upper}*[^\p{Lower}]*\p{Upper}*
it means:
\p{Upper} any uppercase character
[^\p{Lower}] any character except lowercase ones
Note: an empty text will match too.
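To see how this pattern behaves on the question's samples, here is a small sketch. The class name UpperPropertyCheck and the helper matchesWholeText are hypothetical. One thing to keep in mind: when the whole input has to match, this pattern rejects any text that contains a lowercase letter, so it only fits if the text is expected to be fully uppercase (it would not accept "Test, Test, Test").

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class UpperPropertyCheck {
    // Pattern from the answer; backslashes are doubled for the Java string literal.
    private static final Pattern NO_LOWERCASE = Pattern.compile(
            "\\p{Upper}*[^\\p{Lower}]*\\p{Upper}*");

    // Hypothetical helper: true if the whole text matches the pattern.
    static boolean matchesWholeText(String text) {
        return NO_LOWERCASE.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"NA", "N/A", "na", "Test, Test, Test", ""};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" = " + matchesWholeText(sample));
        }
    }
}
```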
Upvotes: 0
Reputation: 361
This might work for you:
String regex = "^([A-Z0-9]+[A-Za-z0-9,./\\-]\\s)+$";
You may need to add some more separators (the example allows , . / and -). Note that the backslashes have to be doubled inside a Java string literal.
Upvotes: 0
Reputation: 11025
You are better off splitting on a delimiter, e.g. with a StringTokenizer, and then checking each piece; it will be a lot easier. Use ',' as the delimiter, trim each token, and check it with a regex.
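A minimal sketch of that idea follows. It makes a few assumptions that are not in the answer: it uses String.split instead of StringTokenizer, it checks each token with the pattern from the first answer in this thread, and the names TokenizedCheck and allTokensValid are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class TokenizedCheck {
    // Assumption: each token is validated with the pattern from the first answer.
    private static final Pattern TOKEN = Pattern.compile(
            "^((^|[^A-Za-z]+)[A-Z][A-Za-z]*)*[^A-Za-z]*$");

    // Splits on ',' (String.split swapped in for StringTokenizer),
    // trims each token, and checks the tokens one by one.
    static boolean allTokensValid(String text) {
        for (String token : text.split(",")) {
            if (!TOKEN.matcher(token.trim()).matches()) {
                return false; // stop at the first token with a lowercase word
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allTokensValid("Test, Test, Test"));
        System.out.println(allTokensValid("Test, test, Test"));
        System.out.println(allTokensValid("PHENOXYETHANOL, P0LYS0RBATE 20, CI 19140, CI 14700."));
    }
}
```

Checking short tokens one at a time also keeps each regex match small, which limits how much backtracking any single check can do.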
Upvotes: 0