Paweł Krupiński

Reputation: 1426

UTF-8 to ISO-8859-1 mapping / lossless conversion libraries in Java

I need to convert text from UTF-8 to ISO-8859-1 in Java without losing the characters that have no direct ISO-8859-1 counterpart, for example Unicode-only punctuation.
Ideally, these would be converted to their closest ISO-8859-1 equivalents (e.g. Unicode has several different single-quote characters, and I would like them all converted to the ISO-8859-1 single quote).

String.getBytes("ISO-8859-1") won't do the trick here, because it loses every character that has no ISO-8859-1 encoding (each one is replaced with '?').
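To make the loss concrete, here is a minimal sketch using only the JDK. The class and method names (Latin1LossDemo, roundTrip, unmappable) are made up for illustration; the sample text uses Unicode escapes so the source file encoding does not matter.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1LossDemo {

    // Encodes to ISO-8859-1 and decodes again, so the result shows what survived.
    static String roundTrip(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Collects the characters that ISO-8859-1 cannot represent.
    static String unmappable(String s) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder lost = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!encoder.canEncode(c)) {
                lost.append(c);
            }
        }
        return lost.toString();
    }

    public static void main(String[] args) {
        // Curly single quotes around "quoted", followed by "cafe" with e-acute.
        String text = "\u2018quoted\u2019 caf\u00e9";
        // The curly quotes come back as '?', the e-acute survives.
        System.out.println(roundTrip(text));
        System.out.println("characters lost: " + unmappable(text).length());
    }
}
```

Checking with CharsetEncoder.canEncode first is one way to find out which characters need a replacement before any data is thrown away.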

Do you know of any ready-made mappings or Java libraries that map such characters to their ISO-8859-1 equivalents?

Upvotes: 2

Views: 2993

Answers (3)

Have you considered using an OutputStreamWriter with an explicit character set of ISO-8859-1?

Then just write your Unicode characters and see what you get.
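A quick way to try this out, as a sketch: write into a ByteArrayOutputStream through an OutputStreamWriter and dump the bytes. WriterDemo and writeLatin1 are illustrative names, not part of any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class WriterDemo {

    // Writes the string through an ISO-8859-1 writer and returns the raw bytes.
    static byte[] writeLatin1(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.ISO_8859_1)) {
            writer.write(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown to keep the sketch simple.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Curly double quotes around "hi", then a space and u-umlaut.
        byte[] bytes = writeLatin1("\u201chi\u201d \u00fc");
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "3F 68 69 3F 20 FC"
    }
}
```

The two 3F bytes are '?' characters: the writer substitutes them for the curly quotes, so on its own this does not give the look-alike mapping the question asks for.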

Upvotes: 1

richj

Reputation: 7529

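The Java Development Kit has a tool called native2ascii that can help here. It writes characters it cannot represent directly as \uXXXX Unicode escapes, so nothing is lost, although the characters are escaped rather than mapped to look-alikes. Use: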

native2ascii -encoding UTF-8 [ inputfile [ outputfile ] ]

You can also go back the other way using the -reverse option.

Also see the list of supported encodings for JDK 1.6.

Upvotes: 0

beny23

Reputation: 35008

IBM's ICU project might be what you're looking for. It has support for fallback conversions.
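To illustrate the idea of a fallback conversion, here is a small sketch that uses only the JDK. It is not ICU's API: the class Latin1Fallback, the method toLatin1 and the hand-picked FALLBACKS table are all assumptions made for this example, and the table is far from complete.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

public class Latin1Fallback {

    // Hypothetical fallback table keyed by code point; extend it as needed.
    private static final Map<Integer, String> FALLBACKS = new HashMap<>();
    static {
        FALLBACKS.put(0x2018, "'");   // left single quotation mark
        FALLBACKS.put(0x2019, "'");   // right single quotation mark
        FALLBACKS.put(0x201A, "'");   // single low-9 quotation mark
        FALLBACKS.put(0x2032, "'");   // prime
        FALLBACKS.put(0x201C, "\"");  // left double quotation mark
        FALLBACKS.put(0x201D, "\"");  // right double quotation mark
        FALLBACKS.put(0x201E, "\"");  // double low-9 quotation mark
        FALLBACKS.put(0x2013, "-");   // en dash
        FALLBACKS.put(0x2014, "-");   // em dash
        FALLBACKS.put(0x2026, "..."); // horizontal ellipsis
    }

    // Returns a string in which every character is encodable in ISO-8859-1.
    static String toLatin1(String input) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            String original = new String(Character.toChars(cp));

            // 1. Keep anything ISO-8859-1 can already represent.
            if (encoder.canEncode(original)) {
                out.append(original);
                continue;
            }
            // 2. Use the explicit fallback table.
            String mapped = FALLBACKS.get(cp);
            if (mapped != null) {
                out.append(mapped);
                continue;
            }
            // 3. Try compatibility decomposition and keep the encodable parts,
            //    e.g. a ligature becomes its letters, an accented letter its base.
            String decomposed = Normalizer.normalize(original, Normalizer.Form.NFKD);
            int before = out.length();
            for (int j = 0; j < decomposed.length(); j++) {
                char d = decomposed.charAt(j);
                if (encoder.canEncode(d)) {
                    out.append(d);
                }
            }
            // 4. Nothing usable: fall back to '?' like the JDK does.
            if (out.length() == before) {
                out.append('?');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "\u2018a\u2019 \u2014 \uFB01n\u0151";
        System.out.println(toLatin1(text)); // prints "'a' - fino"
        byte[] latin1 = toLatin1(text).getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length + " bytes");
    }
}
```

The result can then be passed to getBytes with ISO-8859-1 without any further loss. A library with curated fallback tables is likely to cover many more characters than a hand-written map like this one.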

Upvotes: 2
