SpaceTrucker

Reputation: 13556

How to match any uppercase letter followed by the corresponding lower case letter?

I have a requirement that says a name must not start with 3 identical letters ignoring their case. A name starts with an upper case letter followed by lower case letters.

Basically I could convert the whole name to upper case and then match with a regex like (\p{Lu})\1{2,}.* (the captured letter plus at least two repeats makes three identical letters).
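For comparison, the preprocessing approach might look roughly like the sketch below. The class and method names are made up for illustration, and the pattern uses `\1{2}` because the captured letter plus two repeats already gives three identical letters.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class UppercasePrecheck {
    // Assumes the input has already been upper-cased, so a plain
    // (case-sensitive) back-reference is enough.
    private static final Pattern TRIPLE = Pattern.compile("^(\\p{Lu})\\1{2}");

    // Hypothetical helper: true if the name starts with three identical
    // letters, ignoring case.
    static boolean startsWithTriple(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        return TRIPLE.matcher(upper).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Aaa", "Aab", "Bbbb", "Ab" }) {
            System.out.println(s + ": " + startsWithTriple(s));
        }
    }
}
```

This only checks the "three identical letters" rule; the "upper case letter followed by lower case letters" rule would still need its own check on the original string.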

But I was wondering whether there is a regex that covers the above requirements without any preprocessing of the string to be matched. So what regex can I use to match strings like Aa, Dd or Uu without explicitly specifying every possible combination?

EDIT:
I accepted Marko's answer. I just needed to fix it to work with names of length 1 and 2 and anchor it at the beginning. So the actual regex for my use case is ^(\p{Lu})(\p{Ll}?$|(?=\p{Ll}{2})(?i)(?!(\1){2})).
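A small sketch of how that final regex behaves on short and long names. The class and method names are hypothetical, and `find()` is used because the pattern only constrains the start of the name.

```java
import java.util.regex.Pattern;

class AcceptedNameRegex {
    // The regex from the edit above, with backslashes doubled for a Java string.
    static final Pattern NAME_START = Pattern.compile(
            "^(\\p{Lu})(\\p{Ll}?$|(?=\\p{Ll}{2})(?i)(?!(\\1){2}))");

    // Hypothetical helper: true if the start of the name passes the check.
    static boolean isAcceptable(String name) {
        return NAME_START.matcher(name).find();
    }

    public static void main(String[] args) {
        String[] names = { "A", "Ab", "Aa", "Aaa", "Aab", "Abc", "aaa", "AAa" };
        for (String s : names) {
            System.out.println(s + ": " + isAcceptable(s));
        }
    }
}
```

The first alternative handles names of length 1 and 2; the second is Marko's lookahead pair for names with at least two lowercase letters after the initial.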

I also upvoted the answers of Evgeniy and sp00m for helping me to learn a lesson in regexes.

Thanks for your efforts.

Upvotes: 7

Views: 3247

Answers (6)

creinig

Reputation: 1580

It might make sense here to use separate checks for the different requirements, especially since requirement lists tend to grow over time.

Your requirements as described are:

A name must not start with 3 identical letters ignoring their case

and

A name starts with an upper case letter followed by lower case letters.

Performing a separate check for each (as described in the other posts) also allows you to give the user proper error messages describing what is actually wrong. And it's certainly more readable.
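As a rough illustration of what separate checks could look like, here is a minimal sketch. The class name, method name and error messages are invented for this example, and it assumes single-letter names are allowed.

```java
import java.util.Optional;
import java.util.regex.Pattern;

class NameValidator {
    // Requirement: an uppercase letter followed by lowercase letters.
    private static final Pattern SHAPE = Pattern.compile("\\p{Lu}\\p{Ll}*");
    // Requirement: no three identical leading letters, ignoring case.
    private static final Pattern TRIPLE_START =
            Pattern.compile("^(\\p{L})(?i)\\1{2}");

    // Returns an error message for the first failed check, or empty if valid.
    static Optional<String> validate(String name) {
        if (!SHAPE.matcher(name).matches()) {
            return Optional.of(
                "Name must be an uppercase letter followed by lowercase letters.");
        }
        if (TRIPLE_START.matcher(name).find()) {
            return Optional.of(
                "Name must not start with three identical letters.");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Anna", "Aaa", "aab", "A" }) {
            System.out.println(s + ": " + validate(s).orElse("ok"));
        }
    }
}
```

Each rule keeps its own small pattern, so a new requirement becomes one more check rather than a change to a single large regex.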

Upvotes: 0

Marko Topolnik

Reputation: 200148

I admit to standing on the shoulders of giants (the other posters here), but this solution actually works for your use case:

final String[] strings = { "Aba", "ABa", "aba", "aBa", "Aaa", "Aab" }; 
final Pattern p = Pattern.compile("(\\p{Lu})(?=\\p{Ll}{2})(?i)(?!(\\1){2})");
for (String s : strings) System.out.println(s + ": " + p.matcher(s).find());

Now we have:

  1. a match for one uppercase char at the front;
  2. a lookahead asserting that two lowercase chars follow;
  3. a negative lookahead asserting that these two chars are not both the same (ignoring case) as the first one.

Output:

Aba: true
ABa: false
aba: false
aBa: false
Aaa: false
Aab: true

Upvotes: 3

stema

Reputation: 92976

Evgeniy Dorofeev's solution works (+1), but it can be done more simply, using only a lookahead:

(\\p{Lu})(?=\\p{Ll})(?i)\\1

(\\p{Lu}) matches an uppercase character and stores it in group 1, which \\1 refers to.

(?=\\p{Ll}) is a positive lookahead assertion ensuring that the next character is a lowercase letter.

(?i) is an inline modifier that enables case-insensitive matching from that point on.

\\1 matches the uppercase letter from the first part (now case-insensitively, because of the modifier in front of it).

Test it:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] testInput = { "foobar", "Aal", "TTest" };
Pattern p = Pattern.compile("(\\p{Lu})(?=\\p{Ll})(?i)\\1");
for (String t : testInput) {
    Matcher m = p.matcher(t);
    // find() looks for the pattern anywhere in the input
    System.out.println(t + " ==> " + m.find());
}

Output:

foobar ==> false
Aal ==> true
TTest ==> false

Upvotes: 2

Jim

Reputation: 469

I have a requirement that says a name must not start with 3 identical letters ignoring their case.

You can use the case-insensitive option, (?i),

together with the "catch-all" \w (which also matches digits and underscores), e.g.: (?i)(\w)\1{2,}.*

or just [a-z], e.g.: (?i)([a-z])\1{2,}.*
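A quick way to try both patterns in Java is sketched below. The class name is made up, and `matches()` is assumed so that the pattern has to start at the first character of the name.

```java
import java.util.regex.Pattern;

class CaseInsensitiveTriple {
    // Both patterns from above, with backslashes doubled for Java strings.
    static final Pattern WORD_CHARS = Pattern.compile("(?i)(\\w)\\1{2,}.*");
    static final Pattern LETTERS = Pattern.compile("(?i)([a-z])\\1{2,}.*");

    public static void main(String[] args) {
        for (String s : new String[] { "Aaa", "Aab", "Aaab", "111x" }) {
            // matches() requires the whole input to match, so the run of
            // identical characters must begin at index 0.
            System.out.println(s
                    + " \\w: " + WORD_CHARS.matcher(s).matches()
                    + " [a-z]: " + LETTERS.matcher(s).matches());
        }
    }
}
```

Note that a match here means the name violates the requirement, and that the \w variant also flags "111x" because \w accepts digits.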

Upvotes: 1

sp00m

Reputation: 48807

This matches any uppercase letter followed by the same letter in either case:

([A-Z])(?i)\1

This matches any uppercase letter followed by the same letter, but only in lowercase (the negative lookahead rules out the uppercase repeat before case-insensitivity is switched on):

([A-Z])(?!\1)(?i)\1

For example, in Java:

String pattern = "([A-Z])(?!\\1)(?i)\\1";
System.out.println("AA".matches(pattern));
System.out.println("aa".matches(pattern));
System.out.println("aA".matches(pattern));
System.out.println("Aa".matches(pattern));

This prints:

false
false
false
true

Upvotes: 2

Evgeniy Dorofeev

Reputation: 135992

Try:

    String regex = "(?i)(.)(?=\\p{javaLowerCase})(?<=\\p{javaUpperCase})\\1";
    System.out.println("dD".matches(regex));
    System.out.println("dd".matches(regex));
    System.out.println("DD".matches(regex));
    System.out.println("Dd".matches(regex));

Output:

false
false
false
true

Upvotes: 2
