Fetching wrongly encoded data via native JDBC Informix driver

Question

I have an Informix database configured with:

DB_LOCALE=pl_pl.CP1250

(Polish locale with Windows CP1250 character encoding).

In this database there is a table with a VARCHAR column in which most of the data is encoded in CP1250, but some records are encoded in UTF-8. I suspect they were inserted through ODBC and a wrongly encoded .csv import.

When I use ODBC, this wrongly encoded data can be fetched. It does not look pretty:

nazw:┼?UKASIK

but it can be displayed, and the end user can edit such data. Those "strange" characters are the UTF-8 bytes of the letter 'Ł'.

When I use the native JDBC driver, I cannot fetch such data. Instead of a String I get an exception:
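To make the byte-level picture concrete, here is a small Python sketch of what happens to 'Ł' when it is stored as UTF-8 but read as CP1250. It uses only the standard codecs. The CP852 line is an assumption about how the ODBC client rendered the first byte; the question does not say which code page the display used.

```python
# Byte-level view of the letter 'Ł' (U+0141) in the two encodings involved.
utf8_bytes = "Ł".encode("utf-8")      # two bytes: 0xC5 0x81
cp1250_bytes = "Ł".encode("cp1250")   # one byte: 0xA3

# Reading the UTF-8 bytes as CP1250: 0xC5 maps to 'Ĺ', 0x81 is unassigned,
# so a lenient decoder substitutes U+FFFD for it.
misread = utf8_bytes.decode("cp1250", errors="replace")

# Assumption: the ODBC client showed the first byte in a DOS code page
# such as CP852, where 0xC5 is the box-drawing character '┼'.
first_byte_cp852 = utf8_bytes[:1].decode("cp852")

print(utf8_bytes.hex(), cp1250_bytes.hex())
print(repr(misread), first_byte_cp852)
```

The second byte, 0x81, has no character assigned in CP1250, which is consistent with a strict converter rejecting the value while a lenient one shows a placeholder.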

JDBC Error: -79783
IX000
Kodowanie lub zestaw kodów znaków nie są obsługiwane.

Explanation in English:

-79783 Encoding or code set not supported
Explanation: The encoding or code set entered in the DB_LOCALE or CLIENT_LOCALE variable is not valid.

I created a test program in Jython that connects to the database using both the native JDBC driver and the JDBC-ODBC bridge. I got the exception only with the native driver. I also tried to read this data with other JDBC getXXX() methods, to get a byte[] or a stream, but they raised exceptions too. I use this JDBC URL:

jdbc:informix-sqli://test-informix:9088/test:informixserver=ol_testifx;DB_LOCALE=pl_PL.CP1250;CLIENT_LOCALE=pl_PL.CP1250;charSet=CP1250

Server version: IBM Informix Dynamic Server Version 11.50.FC4

Native driver: 3.70.JC5DE; major: 3; minor: 70

ODBC driver used by JDBC-ODBC bridge: 2.0001 (3.70.TC5DE); major: 2; minor: 1

My question is:

Is there any way to fetch such wrongly encoded data? I would like to see '?' characters in place of the wrongly encoded characters. I don't want exceptions, because they prevent end users from seeing and correcting the wrongly encoded data.
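The behaviour asked for can be illustrated outside the driver with a minimal Python sketch: strict decoding fails on the stray UTF-8 bytes, while lenient decoding substitutes '?'. The helper name `decode_lenient` and the simulated value are hypothetical; this shows the desired outcome only and says nothing about how the Informix driver works internally.

```python
def decode_lenient(raw: bytes, encoding: str = "cp1250") -> str:
    """Decode raw bytes, showing '?' for bytes that are invalid in `encoding`."""
    text = raw.decode(encoding, errors="replace")
    return text.replace("\ufffd", "?")


# Simulated column value: 'ŁUKASIK' stored as UTF-8 in a CP1250 database.
raw = "ŁUKASIK".encode("utf-8")

try:
    raw.decode("cp1250")          # strict decoding rejects byte 0x81
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(strict_failed)
print(decode_lenient(raw))
```

Correctly encoded values pass through unchanged, so only the damaged records show the placeholder.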

Michał Niklas · Accepted Answer

I got help from IBM Polska, and they found that the JDBC connection string can be extended with IFX_USE_STRENC=true: http://www-01.ibm.com/support/docview.wss?uid=swg21502902

This allowed JDBC to fetch the wrongly encoded data. Now I can read:

nazw:Przemysław
nazw:Ĺ?UKASIK

(The second record contains the Polish letter Ł in the wrong encoding.)
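As a sketch of the change, the property can be appended to the URL from the question like any other semicolon-separated setting. The helper `with_property` is hypothetical and written here in Python for illustration; only the `IFX_USE_STRENC=true` setting itself comes from the IBM note.

```python
def with_property(jdbc_url: str, name: str, value: str) -> str:
    """Append ';name=value' to an Informix-style JDBC URL unless already set."""
    # Properties after the first ';' are plain name=value pairs.
    props = jdbc_url.split(";")[1:]
    if any(p.split("=", 1)[0] == name for p in props):
        return jdbc_url
    return f"{jdbc_url.rstrip(';')};{name}={value}"


base_url = (
    "jdbc:informix-sqli://test-informix:9088/test:"
    "informixserver=ol_testifx;DB_LOCALE=pl_PL.CP1250;"
    "CLIENT_LOCALE=pl_PL.CP1250;charSet=CP1250"
)
url = with_property(base_url, "IFX_USE_STRENC", "true")
print(url)
```

The resulting string is the original URL followed by `;IFX_USE_STRENC=true`, and calling the helper a second time leaves it unchanged.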

Thank you IBM Polska!
