luca76

Reputation: 833

Conversion from EBCDIC to UTF8 in Linux

I used Perl to import a table from our AS/400 DB2 database.

The problem is that the strings are encoded in EBCDIC Latin-1 (Italian language).

How can I convert the resulting file to plain utf-8 in Linux bash?

Upvotes: 3

Views: 18778

Answers (3)

JayBee

Reputation: 49

I had good luck with the following line:

iconv -f IBM037 -t utf-8 input_ebcdic.txt -o output.txt
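Before converting a whole file, it can help to check that the chosen code page decodes a known sample correctly. The sketch below assumes a glibc-style iconv that knows the name IBM037; the temporary sample file and the variable names are only for illustration. The five bytes C8 85 93 93 96 spell "Hello" in EBCDIC code page 037.

```shell
# Write "Hello" as EBCDIC (code page 037) bytes: C8 85 93 93 96.
# Octal escapes are used because they work in any POSIX printf.
sample=$(mktemp)
printf '\310\205\223\223\226' > "$sample"

# Decode the sample; with the right source encoding this prints: Hello
converted=$(iconv -f IBM037 -t UTF-8 "$sample")
echo "$converted"

rm -f "$sample"
```

If the output is garbage or iconv reports an unknown encoding, the source code page name is probably wrong for your system.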

Upvotes: 2

luca76

Reputation: 833

It's simple with iconv.

iconv -f ISO8859-1 -t UTF-8 result.csv -o new_result.csv

ISO8859-1 is the Latin-1 encoding. For a list of encodings, refer to this table from the official IBM documentation: https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.nls/doc/nlsgdrf/iconv.htm%23d722e3a267mela

Note that the conversion may leave unwanted characters from the EBCDIC data in the UTF-8 output. One example is NUL characters inside the strings. To avoid this, use a hex editor and replace every hex value 00 with 20 (the space character).
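As an alternative to a hex editor, the same replacement can probably be done on the command line with tr. This is a minimal sketch; strip_nuls is a hypothetical helper name, and clean_result.csv is an assumed output file name.

```shell
# strip_nuls INPUT OUTPUT
# Copy INPUT to OUTPUT, replacing every NUL byte (hex 00) with a space (hex 20).
# strip_nuls is a hypothetical helper name, not a standard command.
strip_nuls() {
    tr '\000' ' ' < "$1" > "$2"
}
```

Usage would look like: strip_nuls new_result.csv clean_result.csv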

Upvotes: 1

DevSolar

Reputation: 70343

Start with

iconv -f EBCDIC-IT -t utf-8 <filename>

then check the output, and if it isn't exactly correct, check man iconv and the available encodings listed by iconv -l.

(Note that "EBCDIC Latin-1" is somewhat strange. "Latin-1" indicates ISO-8859-1, while "EBCDIC" is something else entirely. Try file <filename> to get an educated guess by the computer as to what encoding you are actually looking at.)
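If the first guess is not exactly right, one way to compare candidates is to preview the file under several encodings and pick the correct one by eye. The sketch below is only an illustration: preview_encodings is a hypothetical helper, and the candidate list (EBCDIC-IT, IBM280, IBM037, IBM500) is an assumption about which names your iconv -l output contains. Names that your iconv does not know are skipped.

```shell
# preview_encodings FILE
# Show the first lines of FILE decoded with several candidate EBCDIC encodings.
# The candidate list is an assumption; adjust it to what `iconv -l` reports.
preview_encodings() {
    for enc in EBCDIC-IT IBM280 IBM037 IBM500; do
        # Skip encodings this iconv does not know or cannot decode the file with.
        if iconv -f "$enc" -t UTF-8 "$1" > /dev/null 2>&1; then
            printf '== %s ==\n' "$enc"
            iconv -f "$enc" -t UTF-8 "$1" | head -n 3
            printf '\n'
        fi
    done
}
```

Usage would look like: preview_encodings input_ebcdic.txt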

Upvotes: 5
