Ashley Swatton

Reputation: 2125

Java Regex To Uppercase

So I have a string like

Refurbished Engine for 2000cc Vehicles

I would like to turn this into

Refurbished Engine for 2000CC Vehicles

That is, with the cc in 2000cc capitalized. I obviously can't use text.replaceAll("cc","CC"); because it would replace every occurrence of cc, so the word accelerator would become aCCelerator. In my scenario the cc always follows exactly four digits, so I figure this can be done with a regex.

My question is how in Java can I turn the cc into CC when it follows 4 digits and obtain the result I am expecting above?

text = text.replaceAll("[0-9]{4}[c]{2}", "?");

Upvotes: 4

Views: 8374

Answers (4)

Pshemo

Reputation: 124225

You can try with

text = text.replaceAll("(\\d{4})cc", "$1CC");
//                          ↓          ↑
//                          +→→→→→→→→→→+

The trick is to place the number in a capturing group (via parentheses) and then refer to that group's match in the replacement (via $x, where x is the group number).

You can surround that regex with word boundaries "\\b" if you want to make sure that the matched text is not part of some other word. You can also use look-around mechanisms to ensure that there are no alphanumeric characters before and/or after the matched text.
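As a minimal sketch of the word-boundary variant (the class name WordBoundaryDemo and the helper upperCc are made up for illustration), the following shows how "\\b" keeps longer numbers and longer suffixes from matching:

```java
public class WordBoundaryDemo {
    // Hypothetical helper: uppercases "cc" only when it directly follows
    // a standalone 4-digit number and ends the word.
    static String upperCc(String text) {
        return text.replaceAll("\\b(\\d{4})cc\\b", "$1CC");
    }

    public static void main(String[] args) {
        // prints "Refurbished Engine for 2000CC Vehicles"
        System.out.println(upperCc("Refurbished Engine for 2000cc Vehicles"));
        // Nothing here qualifies: 5 digits, a trailing letter, and no digits at all.
        // prints "12000cc 2000ccm accelerator"
        System.out.println(upperCc("12000cc 2000ccm accelerator"));
    }
}
```

Without the boundaries, the same input would turn 12000cc into 12000CC, because the last four digits alone satisfy \\d{4}.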

Upvotes: 7

Gene

Reputation: 46960

One way is to capture the numeric part as a group with () and then use a backreference to that group in the substitution.

This is tested:

public static void main(String [] args) {
    String s = "1000cc abc 9999cc";
    String t = s.replaceAll("(\\d{4})cc", "$1CC");
    System.err.println(t);
}

Upvotes: 2

Rohit Jain

Reputation: 213253

If you only have to convert a fixed cc to uppercase, you can simply replace the match with a literal CC.

For the generic case, where whatever text matched is itself uppercased, there is no one-liner with String#replaceAll() in Java. You can do it with Matcher#appendReplacement() and Matcher#appendTail():

String str = "Refurbished Engine for 2000cc Vehicles";
Pattern pattern = Pattern.compile("\\d{4}cc");
Matcher matcher = pattern.matcher(str);

StringBuffer result = new StringBuffer();
while (matcher.find()) {
    matcher.appendReplacement(result, matcher.group().toUpperCase());
}

matcher.appendTail(result);

System.out.println(result.toString());
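On Java 9 or later, the same loop can probably be shortened with the Matcher#replaceAll overload that takes a function. This is a sketch under that version assumption; the class name UppercaseMatches and the helper upperMatches are made up for illustration:

```java
import java.util.regex.Pattern;

public class UppercaseMatches {
    private static final Pattern FOUR_DIGITS_CC = Pattern.compile("\\d{4}cc");

    // Hypothetical helper: uppercases every match of the pattern.
    // Requires Java 9+ for Matcher.replaceAll(Function<MatchResult, String>).
    static String upperMatches(String text) {
        // The returned string is still treated as a replacement string, so if a
        // match could contain '$' or '\', wrap it in Matcher.quoteReplacement().
        return FOUR_DIGITS_CC.matcher(text)
                .replaceAll(m -> m.group().toUpperCase());
    }

    public static void main(String[] args) {
        // prints "Refurbished Engine for 2000CC Vehicles"
        System.out.println(upperMatches("Refurbished Engine for 2000cc Vehicles"));
    }
}
```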

Upvotes: 4

Jerry

Reputation: 71538

You could perhaps do:

text = text.replaceAll("(?<=\\b[0-9]{4})cc\\b", "CC");

(?<=\\b[0-9]{4}) is a positive lookbehind that ensures a match only if cc is preceded by exactly 4 digits. The "no more than 4" rule is enforced by the word boundary \\b, which matches only at the edges of a word, where a word is a run of characters matching \\w+. Also, since lookbehinds are zero-width assertions, the digits are not part of the match, so only cc gets replaced.

If the number of digits before cc can vary, then it might be easiest to check for only one digit:

text = text.replaceAll("(?<=[0-9])cc\\b", "CC");
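To compare the two lookbehind variants side by side, here is a small sketch; the class name LookbehindDemo and the helpers fourDigits and anyDigits are made up for illustration:

```java
public class LookbehindDemo {
    // Hypothetical helper: cc must follow exactly four digits.
    static String fourDigits(String text) {
        return text.replaceAll("(?<=\\b[0-9]{4})cc\\b", "CC");
    }

    // Hypothetical helper: cc must follow at least one digit.
    static String anyDigits(String text) {
        return text.replaceAll("(?<=[0-9])cc\\b", "CC");
    }

    public static void main(String[] args) {
        String s = "2000cc 750cc accelerator";
        // prints "2000CC 750cc accelerator"
        System.out.println(fourDigits(s));
        // prints "2000CC 750CC accelerator"
        System.out.println(anyDigits(s));
    }
}
```

In both cases the replacement is just "CC", since the digits sit inside the lookbehind and are never consumed.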

Upvotes: 2
