user1142317

Reputation: 583

String to Unicode in Java

I have a large string in which I need to replace every non-alphanumeric character with its Unicode code unit, written as a colon followed by four hex digits.

For example

Input string : abc12/dad-das/das_sdj

Output string : abc12:002Fdad:002Ddas:002Fdas:005Fsdj

Currently I am using this loop:

for (char c : str.toCharArray()) {
    System.out.printf(":%04X \n", (int) c);
}

Is there a better way to do it?

Upvotes: 0

Views: 328

Answers (1)

Andreas

Reputation: 159096

Here are two ways to do it:

// Looping over string characters
private static String convert(String input) {
    StringBuilder buf = new StringBuilder(input.length() + 16);
    for (int i = 0; i < input.length(); i++) {
        char c = input.charAt(i);
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
            buf.append(c);
        else
            buf.append(String.format(":%04X", (int) c));
    }
    return buf.toString();
}
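
// Using a regular expression
// (needs: import java.util.regex.Matcher; import java.util.regex.Pattern;)
// Named convertRegex so that both versions can live in the same class.
private static String convertRegex(String input) {
    StringBuffer buf = new StringBuffer(input.length() + 16);
    // Matches any single char that is not an ASCII letter or digit
    Matcher m = Pattern.compile("[^a-zA-Z0-9]").matcher(input);
    while (m.find()) {
        // The replacement is ':' plus hex digits, so it contains no '$' or '\' to escape
        m.appendReplacement(buf, String.format(":%04X", (int) m.group().charAt(0)));
    }
    return m.appendTail(buf).toString();
}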

Test

System.out.println(convert("abc12/dad-das/das_sdj"));

Output

abc12:002Fdad:002Ddas:002Fdas:005Fsdj
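
If you are on Java 9 or later, the regular-expression approach can probably be shortened with Matcher.replaceAll(Function). The following is only a sketch under that assumption; the class name UnicodeEscaper and the method name convertCompact are made up for illustration and are not part of the code above.

```java
import java.util.regex.Pattern;

final class UnicodeEscaper {
    // Compiled once and reused; matches any single char outside a-z, A-Z, 0-9.
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]");

    // Hypothetical helper; assumes Java 9+ for Matcher.replaceAll(Function).
    static String convertCompact(String input) {
        // Each match is exactly one char; format it as ':' plus four uppercase hex digits.
        // The result contains no '$' or '\', so it is safe as a replacement string.
        return NON_ALNUM.matcher(input)
                .replaceAll(m -> String.format(":%04X", (int) m.group().charAt(0)));
    }
}
```

Calling UnicodeEscaper.convertCompact("abc12/dad-das/das_sdj") should give the same result as the two versions above. Like them, it works on UTF-16 chars, so a character outside the Basic Multilingual Plane would come out as two :XXXX groups.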

Upvotes: 3
