SNL

Reputation: 21

CharsetICU java example for char set conversion

I need to convert a file from EBCDIC (IBM 937) to UTF-8. Any idea how I can use the CharsetICU (icu4j API) for charset conversion?

Upvotes: 0

Views: 2614

Answers (3)

Abel ANEIROS

Reputation: 6464

This is NOT a charset conversion; it is a "transliteration" example using the ICU library.

Version: ICU4J 53.1

Package: com.ibm.icu.text.Transliterator

Transliterator.getInstance("Latin-ASCII").transliterate("Your text");

Here "Latin-ASCII" is the transliterator ID, i.e. the "set of characters" you want to end up with (IMPORTANT: this is NOT an encoding). You can list the available IDs with Transliterator.getAvailableIDs();

For "Latin-ASCII":

 Given "123" returns "123"
 Given "abc" returns "abc"
 Given "Š Œ ñ" returns "S OE n" 

Upvotes: 0

Steven R. Loomis

Reputation: 4350

I think you should be able to use CharsetICU.forNameICU("ibm-937"), then pass the resulting Charset into a reader/writer.
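A minimal sketch of the "pass the Charset into a reader/writer" part, assuming nothing beyond the JDK. The class name CharsetConvert and the method convert are hypothetical names used only for illustration. So that it runs without icu4j on the classpath, the demo uses ISO-8859-1 as a stand-in source charset; with icu4j available, the Charset returned by CharsetICU.forNameICU("ibm-937") would be passed as the `from` argument instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetConvert {
    // Hypothetical helper: decodes `in` with `from` and re-encodes it to `out` with `to`.
    public static void convert(InputStream in, Charset from,
                               OutputStream out, Charset to) throws IOException {
        try (Reader r = new InputStreamReader(in, from);
             Writer w = new OutputStreamWriter(out, to)) {
            char[] buf = new char[8192];
            int n;
            // Copy decoded characters until end of stream.
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in source charset (assumption, so the demo runs on a bare JDK).
        // With icu4j: Charset from = CharsetICU.forNameICU("ibm-937");
        Charset from = StandardCharsets.ISO_8859_1;

        byte[] input = {(byte) 0xE9}; // 'é' encoded in ISO-8859-1
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        convert(new ByteArrayInputStream(input), from, out, StandardCharsets.UTF_8);

        System.out.println("UTF-8 bytes written: " + out.size());
    }
}
```

The helper takes Charset objects rather than names, so it does not care whether the charset came from the JDK or from ICU.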

Upvotes: 1

axtavt

Reputation: 242686

There is no need to use external libraries to do this conversion (exception handling omitted):

// Needs java.io.*; decode the input as IBM937 and encode the output as UTF-8.
Reader r = new InputStreamReader(new FileInputStream(...), "IBM937");
Writer w = new OutputStreamWriter(new FileOutputStream(...), "UTF-8");

char[] buf = new char[65536];
int size = 0;

while ((size = r.read(buf)) != -1)
    w.write(buf, 0, size);

r.close();
w.close();
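Since exception handling is omitted above, here is one possible file-to-file variant using try-with-resources, so both files are closed even if the copy fails. It is a sketch, not a definitive implementation: the class name FileRecode and the method toUtf8 are hypothetical, and whether "IBM937" is available depends on the JVM, which is why the charset name is checked first. Note that Files.newBufferedReader throws on malformed input, whereas InputStreamReader substitutes a replacement character.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileRecode {
    // Hypothetical helper: re-encodes `source` (in `sourceCharset`) into `target` as UTF-8.
    public static void toUtf8(Path source, String sourceCharset, Path target)
            throws IOException {
        // Not every JVM ships every charset, so fail with a clear message.
        if (!Charset.isSupported(sourceCharset)) {
            throw new IOException("Charset not available in this JVM: " + sourceCharset);
        }
        Charset from = Charset.forName(sourceCharset);

        // try-with-resources closes the writer and reader, in that order.
        try (Reader r = Files.newBufferedReader(source, from);
             Writer w = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buf = new char[65536];
            int size;
            while ((size = r.read(buf)) != -1) {
                w.write(buf, 0, size);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FileRecode <input> <output>");
            return;
        }
        toUtf8(Paths.get(args[0]), "IBM937", Paths.get(args[1]));
    }
}
```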

Upvotes: 1
