Pritom

Reputation: 1333

Replace non-ASCII characters with their character codes using Java regex

I have a string like this: T 8.ESTÜTESTतुम मेरी. Using a Java regex, I want to replace each non-ASCII character (Ü and the characters in तुम मेरी) with its numeric character code.

How can I achieve this?

I can already replace them with a fixed string, for example an empty one:

    String str = "T 8.ESTÜTESTतुम मेरी";
    String resultString = str.replaceAll("[^\\p{ASCII}]", "");
    System.out.println(resultString);

It prints T 8.ESTTEST

Upvotes: 0

Views: 2243

Answers (1)

Leo

Reputation: 6570

Sorry, I don't know how to do this with a single regex. Please check whether this works for you:

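    String str = "T 8.ESTÜTESTतुम मेरी";

    // StringBuilder is enough here; no synchronization is needed.
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        // Test each UTF-16 code unit against the non-ASCII character class.
        if (String.valueOf(c).matches("[^\\p{ASCII}]")) {
            // Append the numeric value of the character in place of the character itself.
            sb.append("[CODE #").append((int) c).append("]");
        } else {
            sb.append(c);
        }
    }
    System.out.println(sb.toString());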

This prints:

T 8.EST[CODE #220]TEST[CODE #2340][CODE #2369][CODE #2350] [CODE #2350][CODE #2375][CODE #2352][CODE #2368]

The hard part seems to be telling the regex how to convert each match into its code.
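One way to let the regex drive the conversion is to compute the replacement for each match yourself with Matcher.find() and Matcher.appendReplacement(). The following is only a sketch under a few assumptions: the class name NonAsciiEscaper and the method name escape are made up for illustration, and the [CODE #n] format is simply borrowed from the loop above. It reads the code point of each match rather than a single char, so a character outside the Basic Multilingual Plane should come out as one code instead of two surrogate values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this example.
final class NonAsciiEscaper {

    // Matches any single character outside the 7-bit ASCII range.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\p{ASCII}]");

    // Replaces every non-ASCII character with "[CODE #<decimal code point>]".
    static String escape(String input) {
        Matcher m = NON_ASCII.matcher(input);
        // appendReplacement only accepts a StringBuffer before Java 9.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Code point of the matched character.
            int codePoint = m.group().codePointAt(0);
            String replacement = "[CODE #" + codePoint + "]";
            // quoteReplacement keeps '$' and '\' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("T 8.ESTÜTESTतुम मेरी"));
    }
}
```

For the string from the question this should give the same output as the loop above. On Java 9 and later, Matcher.replaceAll also accepts a Function<MatchResult, String>, which would let you write the same idea without the explicit while loop.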

Upvotes: 1
