Reputation: 59
I have an application that gets some strings via JSON.
The problem is that I think they are being sent as ASCII, when the text really should be in Unicode.
For example, parts of the string contain "\u00f6", which is the Swedish letter "ö".
The Swedish word for "buy" is "köpa", and the string I get is "k\u00f6pa".
Is there an easy way, after I have received this string in Java, to convert it to the correct representation?
That is, I want to convert strings like "k\u00f6pa" to "köpa".
Thanks for all help!
Upvotes: 2
Views: 654
Reputation: 43391
The hex code is just a 2-byte integer, which an int can handle just fine, so you can use Integer.parseInt(s, 16), where s is the string without the "\u" prefix. Then you narrow that int to a char, which is guaranteed to fit.
Throw in some regex (to validate the string and also extract the hex code), and you're all done.
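import java.util.regex.Matcher;
import java.util.regex.Pattern;

// arg holds a single escape such as "\\u00f6" (backslash, 'u', four hex digits)
Pattern p = Pattern.compile("\\\\u([0-9a-fA-F]{4})");
Matcher m = p.matcher(arg);
// matches() succeeds only when the whole input is one escape
if (m.matches()) {
    String code = m.group(1);            // the four hex digits
    int i = Integer.parseInt(code, 16);  // e.g. "00f6" becomes 246
    char c = (char) i;                   // always fits: at most 0xFFFF
    System.out.println(c);
}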
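To convert every escape inside a longer string such as "k\u00f6pa", the same pattern can be applied with find() rather than matches(). What follows is only a minimal sketch, assuming the input contains nothing but simple backslash-u escapes; the class and method names (UnicodeUnescape, unescape) are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescape {
    // Backslash, 'u', then exactly four hex digits (captured as group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each escape with the char it names.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        // StringBuffer keeps this compatible with Java 8's appendReplacement.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and backslash characters literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal "k\\u00f6pa" is the 9-character text received over the wire.
        System.out.println(unescape("k\\u00f6pa"));
    }
}
```

This sketch does not handle an escaped backslash in front of the "u", nor the other JSON escapes (such as a newline escape), so a real JSON parser is probably the safer choice for anything beyond a quick fix.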
Upvotes: 0
Reputation: 121692
Well, that is easy enough: just use a JSON library. With Jackson, for instance, you would write:
final ObjectMapper mapper = new ObjectMapper();
final JsonNode node = mapper.readTree(source); // source: your JSON input (a String, InputStream, Reader, ...)
The JsonNode will in fact be a TextNode; you can just retrieve the text as:
node.textValue()
Note that this IS NOT an "ASCII representation" of a String; it just happens that JSON strings can contain UTF-16 code unit character escapes like this one.
(You will lose the quotes around the value, but that is probably what you expect anyway.)
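The point about UTF-16 code units can be illustrated in plain Java, whose string literals happen to use the same four-hex-digit escape notation as JSON. This is only a small sketch; the class and constant names are invented for the example. A character in the Basic Multilingual Plane such as "ö" needs one escape, while a character outside it needs two (a surrogate pair):

```java
public class CodeUnitDemo {
    // U+00F6 fits in one UTF-16 code unit, so a single escape is enough.
    static final String KOPA = "k\u00f6pa";
    // U+1F600 lies outside the Basic Multilingual Plane and needs a surrogate pair.
    static final String GRIN = "\ud83d\ude00";

    public static void main(String[] args) {
        System.out.println(KOPA.length());                            // 4
        System.out.println(GRIN.length());                            // 2 (code units)
        System.out.println(GRIN.codePointCount(0, GRIN.length()));    // 1 (code point)
        System.out.println(Integer.toHexString(GRIN.codePointAt(0))); // 1f600
    }
}
```

A JSON parser performs this decoding for you, which is why the escaped and unescaped spellings yield the same String once parsed.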
Upvotes: 1