Reputation: 13556
I have a requirement that says a name must not start with 3 identical letters ignoring their case. A name starts with an upper case letter followed by lower case letters.
Basically, I could convert the whole name to upper case and then match it against a regex like (\p{Lu})\1{2,}.* (the captured letter followed by at least two repeats, i.e. three identical letters).
But I was wondering whether there is a regex that matches the above requirements without any preprocessing of the string to be matched. So what regex can I use to match strings like Aa, Dd or Uu without explicitly specifying every possible combination?
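For comparison, the preprocessing approach described above could look roughly like the following sketch. The class and method names are hypothetical, and it assumes that "3 identical letters" means the captured letter plus two repeats, hence \1{2}.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class UpperCaseFirstCheck {
    // One upper-case letter followed by two more copies of the same letter.
    private static final Pattern TRIPLE = Pattern.compile("(\\p{Lu})\\1{2}");

    // Returns true when the name starts with three identical letters, ignoring case.
    static boolean startsWithTriple(String name) {
        // Preprocessing step: normalise the case before matching.
        String upper = name.toUpperCase(Locale.ROOT);
        // lookingAt() anchors the match at the start of the input only.
        return TRIPLE.matcher(upper).lookingAt();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Aaa", "aAAb", "Aab", "Abbb" }) {
            System.out.println(s + ": " + startsWithTriple(s));
        }
    }
}
```

Locale.ROOT is used so that the upper-casing does not depend on the default locale of the machine.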
EDIT:
I accepted Marko's answer. I just needed to fix it to work with names of length one and two and to anchor it at the beginning. So the actual regex for my use case is ^(\p{Lu})(\p{Ll}?$|(?=\p{Ll}{2})(?i)(?!(\1){2})).
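A small harness for trying the final regex might look like this. It is only a sketch: the class name, method name and sample names are made up for illustration, and the regex is the one quoted in the edit, escaped for a Java string literal.

```java
import java.util.regex.Pattern;

class NamePrefixCheck {
    // Anchored at the start: an upper-case letter, then either at most one
    // lower-case letter up to the end of the input, or two lower-case letters
    // that are not both case-insensitive copies of the first letter.
    static final Pattern VALID_START = Pattern.compile(
        "^(\\p{Lu})(\\p{Ll}?$|(?=\\p{Ll}{2})(?i)(?!(\\1){2}))");

    // Only the beginning of the name is inspected, so find() is used instead of matches().
    static boolean hasValidStart(String name) {
        return VALID_START.matcher(name).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "A", "Ab", "Aab", "Aaa", "aba", "ABa" }) {
            System.out.println(s + ": " + hasValidStart(s));
        }
    }
}
```

With these inputs, "A", "Ab" and "Aab" should be accepted and "Aaa", "aba" and "ABa" rejected. Note that the pattern checks only the first three characters; it does not verify that the rest of the name is lower case.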
I also upvoted the answers of Evgeniy and sp00m for helping me to learn a lesson in regexes.
Thanks for your efforts.
Upvotes: 7
Views: 3247
Reputation: 1580
It might make sense here to use separate checks for the different requirements, especially since requirement lists tend to grow over time.
Your requirements as described are:
A name must not start with 3 identical letters ignoring their case
and
A name starts with an upper case letter followed by lower case letters.
Performing a separate check for each (as described in the other posts) also allows you to give the user proper error messages describing what is actually wrong. And it's certainly more readable.
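A minimal sketch of such separate checks, under a few assumptions: the class name, method name and error messages are invented for illustration, and "followed by lower case letters" is read as zero or more lower-case letters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class NameValidator {
    // Rule 1: one upper-case letter followed by zero or more lower-case letters.
    private static final Pattern SHAPE = Pattern.compile("\\p{Lu}\\p{Ll}*");

    // Rule 2: three identical letters at the start, ignoring case.
    // CASE_INSENSITIVE alone covers ASCII only; UNICODE_CASE extends it to other letters.
    private static final Pattern TRIPLE_START = Pattern.compile(
        "(\\p{L})\\1{2}", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

    // Returns one message per violated rule; an empty list means the name is valid.
    static List<String> validate(String name) {
        List<String> errors = new ArrayList<>();
        if (!SHAPE.matcher(name).matches()) {
            errors.add("Name must be one upper-case letter followed by lower-case letters.");
        }
        if (TRIPLE_START.matcher(name).lookingAt()) {
            errors.add("Name must not start with three identical letters.");
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Anna", "Aaa", "anna", "AAA" }) {
            System.out.println(s + ": " + validate(s));
        }
    }
}
```

Each rule stays a simple pattern, and adding a further requirement later means adding one more pattern and one more message.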
Upvotes: 0
Reputation: 200148
I admit to standing on the shoulders of giants (the other posters here), but this solution actually works for your use case:
final String[] strings = { "Aba", "ABa", "aba", "aBa", "Aaa", "Aab" };
final Pattern p = Pattern.compile("(\\p{Lu})(?=\\p{Ll}{2})(?i)(?!(\\1){2})");
for (String s : strings) System.out.println(s + ": " + p.matcher(s).find());
Output:
Aba: true
ABa: false
aba: false
aBa: false
Aaa: false
Aab: true
Upvotes: 3
Reputation: 92976
Evgeniy Dorofeev's solution works (+1), but it can be done more simply, using only a lookahead:
(\\p{Lu})(?=\\p{Ll})(?i)\\1
(\\p{Lu}) matches an uppercase character and stores it in \\1.
(?=\\p{Ll}) is a positive lookahead assertion ensuring that the next character is a lowercase letter.
(?i) is an inline modifier that enables case-insensitive matching from that point on.
\\1 matches the uppercase letter from the first part (but now case-insensitively, because of the modifier in front of it).
Test it:
String[] TestInput = { "foobar", "Aal", "TTest" };
Pattern p = Pattern.compile("(\\p{Lu})(?=\\p{Ll})(?i)\\1");
for (String t : TestInput) {
    Matcher m = p.matcher(t);
    if (m.find()) {
        System.out.println(t + " ==> " + true);
    } else {
        System.out.println(t + " ==> " + false);
    }
}
Output:
foobar ==> false
Aal ==> true
TTest ==> false
Upvotes: 2
Reputation: 469
I have a requirement that says a name must not start with 3 identical letters ignoring their case.
You should use the case-insensitive option: (?i)
and the "catch-all" \w, e.g.: (?i)(\w)\1{2,}.*
or just [a-z], e.g.: (?i)([a-z])\1{2,}.*
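To see how the two variants differ, here is a short Java sketch. The class and method names are hypothetical; the patterns are the two given above. One thing to keep in mind is that \w also matches digits and the underscore, not only letters.

```java
import java.util.regex.Pattern;

class TripleLetterDemo {
    // ASCII letters only, compared case-insensitively.
    static final Pattern LETTERS = Pattern.compile("(?i)([a-z])\\1{2,}.*");
    // Any word character: letters, digits and underscore.
    static final Pattern WORD_CHARS = Pattern.compile("(?i)(\\w)\\1{2,}.*");

    // matches() must consume the whole input, so the triple has to be at the start.
    static boolean startsWithTriple(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "aAabc", "Aab", "Baaa", "111x" }) {
            System.out.println(s + ": letters=" + startsWithTriple(LETTERS, s)
                + ", wordChars=" + startsWithTriple(WORD_CHARS, s));
        }
    }
}
```

Unlike the other answers, these patterns do not require the first letter to be upper case, so that rule would need a separate check.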
Upvotes: 1
Reputation: 48807
This matches any uppercase letter followed by the same letter in either case:
([A-Z])(?i)\1
This matches any uppercase letter followed by the same letter, which must be lowercase:
([A-Z])(?!\1)(?i)\1
For example, in Java:
String pattern = "([A-Z])(?!\\1)(?i)\\1";
System.out.println("AA".matches(pattern));
System.out.println("aa".matches(pattern));
System.out.println("aA".matches(pattern));
System.out.println("Aa".matches(pattern));
Prints
false
false
false
true
Upvotes: 2
Reputation: 135992
Try:
String regex = "(?i)(.)(?=\\p{javaLowerCase})(?<=\\p{javaUpperCase})\\1";
System.out.println("dD".matches(regex));
System.out.println("dd".matches(regex));
System.out.println("DD".matches(regex));
System.out.println("Dd".matches(regex));
Output:
false
false
false
true
Upvotes: 2