Reputation: 10768
This is a basic question, yet I could not find an exact duplicate on SO:
I have this string:
String s = "surname\":\"B\\u00f6rner\"},{\"forename\"";
What I'd like to get is:
String s = "surname\":\"Börner\"},{\"forename\"";
Any way to do this in Java? Thx!
Upvotes: 3
Views: 1487
Reputation: 200138
This shouldn't be very difficult as long as you don't need the characters outside the Unicode base plane:
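import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Match a backslash, 'u', and exactly four hex digits.
final Matcher m = Pattern.compile("\\\\u([0-9a-fA-F]{4})").matcher(
        "surname\":\"B\\u00f6rner\"},{\"forename\"");
final StringBuffer b = new StringBuffer();
while (m.find()) {
    // quoteReplacement keeps a decoded '\' or '$' from being read as replacement syntax.
    m.appendReplacement(b, Matcher.quoteReplacement(
            String.valueOf((char) Integer.parseInt(m.group(1), 16))));
}
m.appendTail(b);
System.out.println(b);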
Upvotes: 1
Reputation: 159754
Removing the extra backslash from the literal by hand makes the Java compiler interpret \u00f6 as a Unicode escape. If you are unable to modify the string that you receive from the API call, you can replace this particular escape with:
s = s.replaceAll("\\\\u00f6", "\u00f6");
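The call above only covers the single escape \u00f6. Here is a minimal sketch of the same replaceAll idea extended to any four-hex-digit escape. It assumes Java 9 or later (for the Matcher.replaceAll overload that takes a function), and the class and method names are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; requires Java 9+.
final class ReplaceAllUnescape {

    // Backslash, 'u', then exactly four hex digits (captured as group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String s) {
        return ESCAPE.matcher(s).replaceAll(m -> {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // Quote the result so a decoded '\' or '$' is not read as replacement syntax.
            return Matcher.quoteReplacement(String.valueOf(decoded));
        });
    }

    public static void main(String[] args) {
        String s = "surname\":\"B\\u00f6rner\"},{\"forename\"";
        System.out.println(unescape(s));
    }
}
```

Whether the ö shows up correctly on the console depends on the terminal's encoding, so compare the returned string rather than the printed one if in doubt.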
Upvotes: 1
Reputation: 718698
If that is Java source code, then the two string literals mean EXACTLY the same thing ... provided that (in the latter case) you tell the Java compiler what character set the source file is encoded in. Alternatively, the native2ascii command (with the -reverse option) can be used to convert \uxxxx Unicode escapes in a file to native characters.
If those string values are actually String values, not String literals, then you will need to do some kind of runtime conversion. (I'm sure that there is a 3rd party library method to do this ...)
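For the runtime case, a library is not strictly needed. A rough sketch of a hand-written conversion, with no regex, might look like the following. The class and method names are hypothetical, and it assumes every escape is exactly a backslash, a 'u', and four hex digits; anything else is copied through unchanged:

```java
// Hypothetical helper class for illustration; not part of any library.
final class UnicodeUnescaper {

    // Decodes each escape (backslash, 'u', four hex digits) into one UTF-16 char.
    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (isEscapeAt(input, i)) {
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6; // skip the whole six-character escape
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True if a complete six-character escape starts at index i.
    private static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String s = "surname\":\"B\\u00f6rner\"},{\"forename\"";
        System.out.println(unescape(s));
    }
}
```

Characters outside the base plane come out right only if the input encodes them as two consecutive surrogate escapes, since each escape is decoded to a single char.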
Upvotes: 0
Reputation: 9417
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

String s = "surname\":\"B\u00f6rner\"},{\"forename\"";
try {
    String t = URLDecoder.decode(s, "UTF-8");
    System.out.println(t);
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}
Output: surname":"Börner"},{"forename"
As others say, though, you still have to find a way to remove the extra \ first.
Upvotes: 1