Computernerd

Reputation: 7766

Convert String to its Unicode code point

Assuming I have a string foo = "This is an apple"

Its Unicode code point representation would be

"\\x54\\x68\\x69\\x73 .......... \\x61\\x70\\x70\\x6c\\x65"

    T     h     i     s  ..........  a     p     p     l     e

How do I convert the String foo

to the String

"\\x54\\x68\\x69\\x73 .......... \\x61\\x70\\x70\\x6c\\x65"

Upvotes: 6

Views: 1771

Answers (2)

Paizo

Reputation: 4194

Here is a working code snippet that performs the conversion:

public class HexTest {

    public static void main(String[] args) {

        String testStr = "hello日本語 ";

        System.out.println(stringToUnicode3Representation(testStr));
    }

    private static String stringToUnicode3Representation(String str) {
        StringBuilder result = new StringBuilder();
        char[] charArr = str.toCharArray();
        for (int i = 0; i < charArr.length; i++) {
            result.append("\\u").append(Integer.toHexString(charArr[i] | 0x10000).substring(1));
        }
        return result.toString();
    }   
}

This prints:

\u0068\u0065\u006c\u006c\u006f\u65e5\u672c\u8a9e\u0020

If you want to get rid of the leading zeros, you can post-process the output as described here.

Here is another version of the conversion. Passing "This is an apple" gives

\u54\u68\u69\u73\u20\u69\u73\u20\u61\u6e\u20\u61\u70\u70\u6c\u65

using this method:

private static String str2UnicodeRepresentation(String str) {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        int cp = Character.codePointAt(str, i);
        int charCount = Character.charCount(cp);
        // A supplementary code point occupies two chars (a surrogate pair), so skip the second one
        if (charCount == 2) {
            i++;
        }
        result.append(String.format("\\u%x", cp));
    }
    return result.toString();
}
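
The same code-point loop can also be written with String.codePoints() (Java 8 and later), which removes the manual index adjustment. This is only a minimal sketch: the class name CodePointDemo and the method name codePointsToEscapes are hypothetical, picked for illustration.

```java
import java.util.stream.Collectors;

class CodePointDemo {

    // Hypothetical helper: writes each code point as \\u followed by unpadded lowercase hex.
    static String codePointsToEscapes(String str) {
        return str.codePoints()
                  .mapToObj(cp -> String.format("\\u%x", cp))
                  .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane,
        // so it is stored as two chars (a surrogate pair).
        String s = "a" + new String(Character.toChars(0x1F600));
        System.out.println(s.length());             // number of chars, not code points
        System.out.println(codePointsToEscapes(s)); // one escape per code point
    }
}
```

For the two-code-point string built in main, the method returns \u61\u1f600, while s.length() reports 3. Note that escapes with fewer or more than four hex digits are a display format only; they are not valid Java source escapes.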

Upvotes: 0

Hiren

Reputation: 1435

Try this:

public static String generateUnicode(String input) {
    StringBuilder b = new StringBuilder(input.length());
    for (char c : input.toCharArray()) {
        b.append(String.format("\\u%04x", (int) c));
    }
    return b.toString();
}
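
If the \x prefix from the question is what is actually needed, the same loop should work with a different format string. A rough sketch, assuming every character fits in two hex digits (ASCII or Latin-1); the class name HexEscapeDemo and the method name toHexEscapes are hypothetical.

```java
class HexEscapeDemo {

    // Hypothetical variant of generateUnicode that emits \x followed by at least two hex digits.
    // Chars above 0xFF produce three or four digits, so the output is only
    // unambiguous for ASCII / Latin-1 input.
    static String toHexEscapes(String input) {
        StringBuilder b = new StringBuilder(input.length() * 4);
        for (char c : input.toCharArray()) {
            b.append(String.format("\\x%02x", (int) c));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHexEscapes("This is an apple"));
    }
}
```

For "This is an apple" this prints \x54\x68\x69\x73\x20\x69\x73\x20\x61\x6e\x20\x61\x70\x70\x6c\x65.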

Upvotes: 1
