Shekhar

Reputation: 11788

How to replace ASCII characters with Unicode characters?

I read a String from a properties file in which a Unicode escape is stored as -uni-000A, which stands for \u000A. When I write this value to another file, I want the corresponding character to be written, i.e. \n, but my program writes the literal text \u000A instead.

Can anyone please tell me how to replace -uni-000A with \u000A and make the program output the corresponding character?

Upvotes: 0

Views: 1642

Answers (2)

Shekhar

Reputation: 11788

I solved the problem by using methods of StringEscapeUtils class in commons-lang.

It's a two-step process:

  1. First, escape the \u sequence with StringEscapeUtils.escapeJava("\\u").
  2. Whenever you need the actual Unicode character, call StringEscapeUtils.unescapeJava().

Here is my sample code:

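    // Requires: import org.apache.commons.lang.StringEscapeUtils; (commons-lang 2.x)
    String unic = "__UNICODE__000A";

    // replaceAll() treats a backslash in the replacement as an escape character,
    // so the escaped form returned by escapeJava() ends up as a single literal backslash plus 'u'.
    String replaced = unic.replaceAll("__UNICODE__", StringEscapeUtils.escapeJava("\\u"));

    // Prints the escape sequence as literal text (backslash, 'u', 000A), not a newline
    System.out.println("replaced = " + replaced);

    // unescapeJava() turns the escape sequence into the character it denotes
    String finalVal = StringEscapeUtils.unescapeJava(replaced);

    // Prints an actual newline character
    System.out.println("final = " + finalVal);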
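
If pulling in commons-lang is not an option, the same conversion can probably be done with the JDK's regex classes alone. The following is only a minimal sketch, assuming the marker is always -uni- followed by exactly four hex digits (the form used in the question). The class name UniMarkerDecoder and the method decode are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UniMarkerDecoder {
    // Assumed marker format: "-uni-" followed by exactly four hex digits.
    private static final Pattern MARKER = Pattern.compile("-uni-([0-9A-Fa-f]{4})");

    /** Replaces every marker in the input with the character its hex digits denote. */
    static String decode(String input) {
        Matcher m = MARKER.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the four hex digits into a UTF-16 code unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement() keeps a decoded backslash or dollar sign literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("line1-uni-000Aline2");
        System.out.println(decoded.equals("line1\nline2")); // true
    }
}
```

Since each marker carries only four hex digits, this handles one UTF-16 code unit at a time. Characters outside the Basic Multilingual Plane would need two consecutive markers (a surrogate pair).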

Hope it helps. Thanks everyone for your valuable answers and comments.

Upvotes: 0

Jorge_B

Reputation: 9872

First of all, try to forget the encoding of your source file: once you have read a String, every character in Java is treated the same.

Now your problem is to write the characters in your String as bytes in a specific encoding. For that you can use one of the different Writer implementations. Say you need to write your characters as UTF-8:

    String myString = ...; /* Wherever it comes from */
    Writer writer = new OutputStreamWriter(
            new FileOutputStream("/home/shekhar/myFile"), Charset.forName("UTF-8"));
    writer.write(myString);
    writer.close();

This should make sure the UTF-8 encoded bytes for your characters are written to the file.
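
To check what actually lands on disk, something like the following could be used. It is a sketch rather than a drop-in solution: it swaps in Files.newBufferedWriter with try-with-resources (Java 7+) so the writer is closed even if the write fails. The class Utf8WriteSketch and both method names are hypothetical, and the target is a temporary file rather than your real path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class Utf8WriteSketch {
    /** Writes text to target as UTF-8; the writer is closed even if write() fails. */
    static void writeUtf8(Path target, String text) {
        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text to a temporary file and returns the raw bytes found on disk. */
    static byte[] writtenBytes(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-sketch", ".txt");
            try {
                writeUtf8(tmp, text);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The newline character is stored as the single byte 10, not as escape text.
        System.out.println(Arrays.toString(writtenBytes("a\nb"))); // [97, 10, 98]
    }
}
```

If the file still contains the six characters of the escape text after this, the String itself holds the escape text and has to be decoded before writing. The Writer only encodes the characters it is given.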

Upvotes: 1
