Reputation: 46
I have a Japanese client that provides a data feed file in Shift-JIS encoding (containing both kana and kanji characters).
I have to upload the data from that Shift-JIS feed file into my web application, whose JVM is started with UTF-8 as the default encoding (-Dfile.encoding=UTF-8).
The application parses and identifies the various data fields in feed file by character length.
For example, FirstName [Length=30 Characters][Starting Position=11][Ending Position=40].
The application parses UTF-8 feed files (which contain only English characters) without any issues.
However, when trying to upload the Shift-JIS Japanese feed file, the fields are not identified correctly.
If I change the web application JVM startup option to Shift-JIS (-Dfile.encoding=SJIS), then the Japanese Shift-JIS feed file is parsed successfully.
The problem is that changing the JVM encoding in the live environment is not possible.
I assume it's the difference in multi-byte representation between UTF-8 and Shift-JIS that causes the web application to fail when parsing the Shift-JIS feed file in a UTF-8 JVM.
Is there any way I can convert the characters in the Shift-JIS feed file to their equivalent UTF-8 encoding? Basically, the Japanese characters in Shift-JIS must be converted to the same Japanese characters in UTF-8.
The web application's back end is a PostgreSQL database with UTF8 encoding.
Upvotes: 0
Views: 751
Reputation: 46
Found the answer in a similar post.
Once the Shift-JIS data feed file was converted to an equivalent UTF-8 encoded file, the UTF-8 JVM application parsed the file and its data fields correctly.
Convert Shift_JIS format to UTF-8 format
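For anyone landing here, a minimal sketch of that kind of conversion in plain Java is below. It is not the code from the linked post; the class name SjisToUtf8 and the convert method are made up for illustration. It assumes the feed really is Shift_JIS (if it was produced on Windows, "windows-31j" may be the closer match) and that the JVM ships the extended charsets. Because the charsets are passed explicitly, the result should not depend on -Dfile.encoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: re-encodes a Shift_JIS text file as UTF-8.
public class SjisToUtf8 {

    // Assumption: the feed is plain Shift_JIS. Swap in "windows-31j"
    // if the file contains Windows-specific extension characters.
    static final Charset SOURCE = Charset.forName("Shift_JIS");

    static void convert(Path in, Path out) throws IOException {
        // The reader decodes bytes with the explicit source charset and the
        // writer encodes with UTF-8, so the JVM default encoding is not used.
        try (BufferedReader reader = Files.newBufferedReader(in, SOURCE);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SjisToUtf8 <input-sjis-file> <output-utf8-file>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Since the characters are decoded first and then re-encoded, the character positions of each field (for example positions 11 to 40 for FirstName) stay the same in the converted file; only the underlying byte lengths change.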
Upvotes: 1