Reputation: 40318
As far as I know, Unicode assigns every character its own unique code.
My database is set to UTF-8.
When I save the string ఉత్తరప్రదేశ్ directly into the database from Java, it is stored as:
&#3081;&#3108;&#3149;&#3108;&#3120;&#3114;&#3149;&#3120;&#3110;&#3143;&#3126;&#3149;
But when I save the same string using
    escapeUnicode(StringEscapeUtils.unescapeHtml("here string"));

    // requires: import java.util.Formatter;
    public String escapeUnicode(String input) {
        StringBuilder b = new StringBuilder(input.length());
        Formatter f = new Formatter(b);
        for (char c : input.toCharArray()) {
            if (c < 128) {
                // ASCII characters are copied unchanged
                b.append(c);
            } else {
                // anything else is written as an escape with four hex digits
                f.format("\\u%04x", (int) c);
            }
        }
        return b.toString();
    }
it produces this escaped form:
\u0c09\u0c24\u0c4d\u0c24\u0c30\u0c2a\u0c4d\u0c30\u0c26\u0c47\u0c36\u0c4d
Both display correctly in the browser. Why do the two approaches produce different Unicode values? Thanks in advance.
Upvotes: 0
Views: 80
Reputation: 338594
Those are not different numbers…
… and so on.
Two different ways to represent the same Unicode code point.
The first are decimal numbers (base 10). The second are hexadecimal numbers (base 16).
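To check this yourself, here is a minimal sketch that parses one number from each form and compares them. The class and method names (`CodePointBases`, `fromDecimal`, `fromHex`) are made up for illustration. The values 3081 and 0c09 are the decimal and hex numbers for the first letter of the string in the question.

```java
class CodePointBases {
    // Parses a base-10 string such as "3081" into a code point number.
    static int fromDecimal(String digits) {
        return Integer.parseInt(digits, 10);
    }

    // Parses a base-16 string such as "0c09" into a code point number.
    static int fromHex(String digits) {
        return Integer.parseInt(digits, 16);
    }

    public static void main(String[] args) {
        int a = fromDecimal("3081");
        int b = fromHex("0c09");
        // Same number, written in two different bases.
        System.out.println(a == b); // prints "true"
    }
}
```

The same comparison holds for every other pair in the two outputs.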
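When using a class such as Formatter, it helps to read the documentation. Then you might understand why you pasted f.format("\\u%04x" into your code: the %04x conversion writes the number in hexadecimal, zero-padded to four digits, whereas %d would write it in decimal.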
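As a rough sketch of how the format string alone decides the base, the two hypothetical helpers below (`toDecimalEntities` and `toHexEscapes`, names invented here) differ only in the conversion they pass to `String.format`. Like the code in the question, they loop over `char` values, so this assumes text in the Basic Multilingual Plane, which Telugu is.

```java
import java.util.Locale;

class EscapeFormats {
    // Writes each non-ASCII char as a decimal HTML numeric reference (%d).
    static String toDecimalEntities(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format(Locale.ROOT, "&#%d;", (int) c));
            }
        }
        return out.toString();
    }

    // Writes each non-ASCII char as a four-digit hexadecimal escape (%04x).
    static String toHexEscapes(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format(Locale.ROOT, "\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "\u0c09\u0c24"; // first two letters of the string in the question
        System.out.println(toDecimalEntities(s)); // prints &#3081;&#3108;
        System.out.println(toHexEscapes(s));      // prints \u0c09\u0c24
    }
}
```

Both outputs describe the same two characters. Only the notation differs.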
Tip: If you have a Mac, download the UnicodeChecker app to see both decimal and hex numbers for each character defined in Unicode.
Upvotes: 3