Reputation: 833
I used Perl to import a table from our AS/400 DB2 database.
The problem is that the strings are encoded in EBCDIC Latin-1 (Italian language).
How can I convert the resulting file to plain UTF-8 in Linux bash?
Upvotes: 3
Views: 18778
Reputation: 49
I had good luck with the following line:
iconv -f IBM037 -t utf-8 input_ebcdic.txt -o output.txt
Upvotes: 2
Reputation: 833
It's simple with iconv.
iconv -f ISO8859-1 -t "UTF-8" result.csv -o new_result.csv
ISO8859-1 is the Latin-1 encoding. For a list of encodings, refer to this table from the official IBM documentation: https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.nls/doc/nlsgdrf/iconv.htm%23d722e3a267mela
Note that the conversion may leave unwanted characters carried over from the EBCDIC data. One example is NUL characters in the strings. To get rid of them, use a hex editor and replace each hex value 00 with 20 (the space character).
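If opening a hex editor is inconvenient, the same 00-to-20 substitution can probably be scripted with tr. This is only a sketch: the scratch directory and the sample content are made up for illustration, and new_result_clean.csv is a hypothetical output name.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)

# Stand-in for the converted file: two fields separated by a NUL byte.
printf 'Rossi\000Mario\n' > "$workdir/new_result.csv"

# Replace every NUL byte (hex 00) with a space (hex 20).
tr '\000' ' ' < "$workdir/new_result.csv" > "$workdir/new_result_clean.csv"

cat "$workdir/new_result_clean.csv"
```

On the real file, only the tr line is needed, with the input and output paths adjusted.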
Upvotes: 1
Reputation: 70343
Start with
iconv -f EBCDIC-IT -t utf-8 <filename>
then check the output, and if it isn't exactly correct, check man iconv and the available encodings listed by iconv -l.
(Note that "EBCDIC Latin-1" is somewhat strange. "Latin-1" indicates ISO-8859-1, while "EBCDIC" is something else entirely. Try file <filename> to get an educated guess by the computer as to what encoding you are actually looking at.)
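One way to do the "check the output" step is to decode the file with a few candidate code pages and compare the results by eye. The sketch below assumes a glibc iconv; try_encodings is a hypothetical helper, the sample file is generated on the spot as a stand-in for the real export, and the candidate names (IBM037, IBM280, EBCDIC-IT) should be confirmed against iconv -l on your system.

```shell
# Hypothetical helper: decode a file with each candidate encoding and
# print the first decoded line so the results can be compared by eye.
try_encodings() {
  file_in=$1
  shift
  for enc in "$@"; do
    if out=$(iconv -f "$enc" -t UTF-8 "$file_in" 2>/dev/null); then
      printf '%s: %s\n' "$enc" "$(printf '%s\n' "$out" | head -n 1)"
    else
      printf '%s: conversion failed\n' "$enc"
    fi
  done
}

# Stand-in for the real export: a known string encoded as IBM037.
sample_file=$(mktemp)
printf 'Città\n' | iconv -f UTF-8 -t IBM037 > "$sample_file"

# The candidate whose line shows the accented letters correctly is the likely match.
try_encodings "$sample_file" IBM037 IBM280 EBCDIC-IT
```

With real data, pass the exported file instead of the sample and pick the encoding whose accented letters (à, è, ì, ò, ù) come out right.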
Upvotes: 5