Gayan Weerakutti

Reputation: 13715

Regex to match single letter from A to Z

I want a regex to match a single letter in a string (from A to Z in order):

It should find the letter 'A'; if there are no 'A's, it should find the letter 'B', then 'C', and so on.

Examples ->


Extra Information:

I'll provide an example, as some users don't seem to like questions without code.

Given a lower-case string, remove k characters from it: first remove all occurrences of the letter 'a', then 'b', then 'c', and so on.

This was my solution:

public static String remove(String s, int k) {
  for (int c : s.chars().sorted().limit(k).toArray())
    s = s.replaceFirst(Character.toString((char) c), "");
  return s;
}

But I'd like to try this with a regex like:

public static String remove(String s, int k) {
  while (k-- > 0)
    s = s.replaceFirst(MY_MAGIC_REGEX_STR, "");
  return s;
}

Upvotes: 1

Views: 268

Answers (2)

Topaco

Reputation: 49251

The following regex works as desired:

(?i)A|B(?!.*[A-A])|C(?!.*[A-B])|D(?!.*[A-C])|E(?!.*[A-D])|F(?!.*[A-E])|G(?!.*[A-F])|H(?!.*[A-G])|I(?!.*[A-H])|J(?!.*[A-I])|K(?!.*[A-J])|L(?!.*[A-K])|M(?!.*[A-L])|N(?!.*[A-M])|O(?!.*[A-N])|P(?!.*[A-O])|Q(?!.*[A-P])|R(?!.*[A-Q])|S(?!.*[A-R])|T(?!.*[A-S])|U(?!.*[A-T])|V(?!.*[A-U])|W(?!.*[A-V])|X(?!.*[A-W])|Y(?!.*[A-X])|Z(?!.*[A-Y])

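The regex consists of 26 terms (one per letter) joined by the alternation operator (|). The construct A(?!B) is a negative lookahead: it matches A only if A is not followed by B. Here, for example, C(?!.*[A-B]) matches a C only if no A or B occurs later in the string. The (?i) prefix enables case-insensitive matching.

Because the engine scans from left to right, any lower letter located before a candidate would already have matched. Applied repeatedly with replaceFirst, the regex therefore first finds all A's from left to right, then all B's from left to right, and so on.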
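
To make the lookahead behaviour easier to see, here is a minimal sketch that uses a shortened, three-letter version of the pattern (A to C only). The class name LookaheadDemo and the helper firstMatchIndex are made up for this illustration and are not part of the answer's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Three-letter version of the pattern, for illustration only.
    static final Pattern ABC = Pattern.compile("(?i)A|B(?!.*[A-A])|C(?!.*[A-B])");

    // Index of the first match, or -1 if nothing matches.
    static int firstMatchIndex(String input) {
        Matcher m = ABC.matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // C and B are rejected because a lower letter follows them.
        System.out.println(firstMatchIndex("CBA")); // 2
        // No A present, so the B is the lowest letter.
        System.out.println(firstMatchIndex("CBC")); // 1
        // (?i) makes lower-case input work as well.
        System.out.println(firstMatchIndex("ccb")); // 2
    }
}
```

In each case the first match is the leftmost occurrence of the lowest letter present, which is what replaceFirst would remove.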

Because the regex is long, it is more convenient to generate it programmatically:

// Generate regEx
String regEx = "(?i)" + "A" + "|";  
for (char i = 'B'; i <= 'Z'; i++ ) {
    regEx += i + "(?!.*[A-" + (char)(i-1) + "])" + "|";
}
regEx = regEx.substring(0, regEx.length() - 1);
System.out.println(regEx);

For the following example:

String example = "AAAZZZHHAAAZZHHHAAZZZHH"; 

// Output
while(example.length() != 0) {
    System.out.println(example);
    example = example.replaceFirst(regEx, "");
}

the output is:

AAAZZZHHAAAZZHHHAAZZZHH
AAZZZHHAAAZZHHHAAZZZHH
AZZZHHAAAZZHHHAAZZZHH
ZZZHHAAAZZHHHAAZZZHH
ZZZHHAAZZHHHAAZZZHH
ZZZHHAZZHHHAAZZZHH
ZZZHHZZHHHAAZZZHH
ZZZHHZZHHHAZZZHH
ZZZHHZZHHHZZZHH
ZZZHZZHHHZZZHH
ZZZZZHHHZZZHH
ZZZZZHHZZZHH
ZZZZZHZZZHH
ZZZZZZZZHH
ZZZZZZZZH
ZZZZZZZZ
ZZZZZZZ
ZZZZZZ
ZZZZZ
ZZZZ
ZZZ
ZZ
Z
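
For completeness, this is roughly how the generated regex could be plugged into the remove(s, k) method from the question. It is only a sketch: the class name RegexRemove and the constant LOWEST are my own naming, and the pattern is compiled once instead of on every replaceFirst call.

```java
import java.util.regex.Pattern;

public class RegexRemove {
    // Same 26-term pattern as above, built with a StringBuilder and compiled once.
    private static final Pattern LOWEST = Pattern.compile(buildRegex());

    static String buildRegex() {
        StringBuilder sb = new StringBuilder("(?i)A");
        for (char c = 'B'; c <= 'Z'; c++) {
            sb.append('|').append(c).append("(?!.*[A-").append((char) (c - 1)).append("])");
        }
        return sb.toString();
    }

    // Removes up to k characters: lowest letter first, leftmost occurrence first.
    static String remove(String s, int k) {
        while (k-- > 0) {
            s = LOWEST.matcher(s).replaceFirst("");
        }
        return s;
    }

    public static void main(String[] args) {
        // Five a's are removed first, then the leftmost b.
        System.out.println(remove("abracadabra", 6)); // rcdbr
    }
}
```

If k is larger than the number of letters, the extra iterations simply find no match and leave the string unchanged.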

Upvotes: 1

Tim Biegeleisen

Reputation: 521457

Regex might not be the best tool for this problem. I think the easiest thing to do here is to convert your input string to an array of characters and then walk down that array, keeping track of the minimum (smallest) character seen so far:

public char findLowestChar(String input) {
    char[] array = input.toCharArray();
    char chr = Character.MAX_VALUE;     // sentinel above every letter, so lower-case input works too; assumes non-empty input
    for (int i=0; i < array.length; ++i) {
        if (array[i] < chr) {
            chr = array[i];
        }
    }
    return chr;
}

I am assuming here that the input string would always have at least one letter A-Z in it. If not, and you also wanted to implement this inside a method, then you should also handle the empty input case.
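
One possible way to handle the empty-input case is to return an OptionalInt instead of a char. This is just a sketch using the streams API. The class name LowestChar is hypothetical, and the method compares all characters, so it still assumes the input contains only letters.

```java
import java.util.OptionalInt;

public class LowestChar {
    // Returns the smallest char in the input, or an empty OptionalInt for empty input.
    static OptionalInt findLowestChar(String input) {
        return input.chars().min();
    }

    public static void main(String[] args) {
        OptionalInt lowest = findLowestChar("BCDAE");
        System.out.println(lowest.isPresent() ? (char) lowest.getAsInt() : '?'); // A
        System.out.println(findLowestChar("").isPresent());                      // false
    }
}
```

The caller then has to decide explicitly what to do when no character is found, rather than receiving a default value that looks like a real result.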

Edit:

You just substantially changed your question. But it turns out the above code can still be part of the updated answer. You can now iterate k times, and at each step run the above code to find the lowest letter. Then, do a String#replaceAll to remove all occurrences of that letter.

String input = "BCDAE";
// remove k=4 characters, starting with (maybe) A, from the input string
for (int k=0; k < 4 && input.length() > 0; ++k) {
    char lowest = findLowestChar(input);
    input = input.replaceAll(String.valueOf(lowest), "");
}
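
Note that the loop above removes every occurrence of each of the k lowest letters, whereas the solution in the question removes k individual characters. If the latter is what you want, a variant along these lines might do. It is a sketch, not tuned for performance, and the class name RemoveKChars is made up for this example.

```java
public class RemoveKChars {
    // Removes up to k characters: the lowest character first, leftmost occurrence first.
    static String remove(String input, int k) {
        StringBuilder sb = new StringBuilder(input);
        for (int i = 0; i < k && sb.length() > 0; i++) {
            // Find the index of the leftmost occurrence of the lowest character.
            int lowestIndex = 0;
            for (int j = 1; j < sb.length(); j++) {
                if (sb.charAt(j) < sb.charAt(lowestIndex)) {
                    lowestIndex = j;
                }
            }
            sb.deleteCharAt(lowestIndex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(remove("BCDAE", 4));       // E
        System.out.println(remove("abracadabra", 6)); // rcdbr
    }
}
```

The strict comparison (<) keeps the leftmost index when a letter occurs more than once, which matches the behaviour of replaceFirst in the question's solution.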

Upvotes: 3
