Reputation: 681
I have a string in UTF-8 format and want to convert it to clean ANSI format. How can I do that?
Upvotes: 1
Views: 30086
Reputation: 14591
You could use a Java function like this one to convert from UTF-8 to ISO_8859_1 (Latin-1, whose printable characters are a subset of the Windows "ANSI" code page Windows-1252):
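import java.nio.charset.StandardCharsets;

private static String convertFromUtf8ToIso(String s1) {
    if (s1 == null) {
        return null;
    }
    // A Java String is already decoded (UTF-16 internally), so no UTF-8 round trip is needed.
    // The original new String(bytes) call without a charset used the platform default and could garble the text.
    // Encoding to ISO_8859_1 replaces every character it cannot represent with '?'.
    byte[] b = s1.getBytes(StandardCharsets.ISO_8859_1);
    return new String(b, StandardCharsets.ISO_8859_1);
}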
Here is a simple test:
String s1 = "your utf8 stringáçﬠ";
String res = convertFromUtf8ToIso(s1);
System.out.println(res);
This prints out:
your utf8 stringáç?
The ﬠ character is lost (replaced by ?) because ISO_8859_1 cannot represent it: it lies outside the 256 characters that ISO_8859_1 covers, and it takes 3 bytes when encoded in UTF-8. ISO_8859_1 can represent á and ç.
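If silently turning unmappable characters into ? is not acceptable, a stricter variant can report them instead. The following is only a sketch: the class name StrictLatin1 and both helper methods are made up for illustration, and it assumes ISO_8859_1 is the target charset as in the function above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictLatin1 {
    // Hypothetical helper: encodes to ISO-8859-1 and throws instead of substituting '?'.
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Hypothetical helper: true when every character of s exists in ISO-8859-1.
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }
}
```

With this, fitsLatin1 can be called first to decide whether a lossy conversion is acceptable, and encodeStrict fails loudly when it is not.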
Upvotes: 3
Reputation: 343
You can do something like this:
new String("your utf8 string".getBytes(Charset.forName("utf-8")));
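Note that this encodes the string as UTF-8 and then decodes the bytes with the platform default charset. If that default is a single-byte ANSI code page, every UTF-8 byte becomes one character, so a 4-byte UTF-8 sequence turns into 4 ANSI characters instead of the original character.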
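To see what decoding UTF-8 bytes with a single-byte charset does, here is a small sketch. It uses ISO_8859_1 explicitly as a stand-in for an ANSI default charset so the result does not depend on the machine; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class Mojibake {
    // Hypothetical helper: decodes the UTF-8 bytes of s with the wrong (single-byte) charset.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }
}
```

Pure ASCII text passes through unchanged, but "\u00E1" (á, two bytes in UTF-8) comes back as the two characters "\u00C3\u00A1" (Ã¡), so this is not a real conversion.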
Upvotes: 2
Reputation: 3124
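Converting UTF-8 to ANSI is not possible in general, because an ANSI code page is a single-byte encoding with at most 256 characters (plain ASCII has only 128, using 7 bits), whereas UTF-8 uses up to 4 bytes per character and covers all of Unicode. It is like converting a long to an int: you lose information whenever the value does not fit.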
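One way to check whether a particular string would lose information is a round trip: encode it in the target charset, decode it again, and compare. This is a minimal sketch with a made-up helper name; it relies on String.getBytes(Charset) replacing unmappable characters, so a lossy conversion changes the text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTrip {
    // Hypothetical helper: true when s survives encode + decode in cs unchanged.
    static boolean isLossless(String s, Charset cs) {
        return new String(s.getBytes(cs), cs).equals(s);
    }
}
```

For example, isLossless("abc", StandardCharsets.US_ASCII) is true, while isLossless("\u00E1", StandardCharsets.US_ASCII) is false because á is not among the 128 ASCII characters.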
Upvotes: 0