Reputation: 2876
I am trying to insert Chinese text in a DB2 Database but not working.
The database is configured by default as ANSI (en_US 819) (and it's a requirement for other applications that use the dame databse) ALT_COLLATE IDENTITY_16BIT is defined and UNICODE tables are created using CCSID UNICODE but unicode characters for Chinese or Korean are not inserted.
Example table:
CREATE TABLE LANGS (
IDIOMA char(2) NOT NULL,
PAIS char(2) NOT NULL,
TRADUC long varchar NOT NULL,
) CCSID UNICODE;
Example insert:
INSERT INTO LANGS (IDIOMA,PAIS,TRADUC) VALUES ('zh','TW','其他');
System Information:
Example Java extract:
Class.forName("com.ibm.db2.jcc.DB2Driver");
...
Properties props = new Properties();
props.setProperty("user", user);
props.setProperty("password", pass);
props.setProperty("DB2CODEPAGE", "1208");
props.setProperty("retrieveMessagesFromServerOnGetMessage", "true");
con = DriverManager.getConnection(url, props);
...
Statement statement = con.createStatement();
statement.execute(sql);
...
statement.close();
con.close();
DB cfg get
DB2 Database locale configuration
Territorio de base de datos = en_US; Página de códigos de base de datos = 819 Conjunto de códigos de base de datos = iso8859-1 Código de país/región de base de datos = 1 Secuencia de clasificación de base de datos = UNIQUE Orden de clasificación alternativo (ALT_COLLATE) = IDENTITY_16BIT Tamaño de página de base de datos = 4096
Statements are executed correctly and rows appears correctly in the database for:
but not for:
Insert from command line with db2cmd also does not work for this languages (Inserts but with only 1 byte.
Insert from command line in a Linux environment localized as zh_TW works. Insert from command line in a Linux environment localized as en_US.utf-8 works.
Never work on Java on these environments.
Using "X" as prefix form the VARCHAR field is not an option due some restrictions and the SQL works on two environments.
I think it may be some encoding problem on Client, or server due to configuration, file or sql encoding.
Update:
I tried also to load a UTF-8 file with the SQLs. the file loads correctly and debugging the SQL with UTF-8 characters is correctly passed to the Statement but the result is the same.
new InputStreamReader(new FileInputStream(file),"UTF-8")
...
private void executeLineByLine(Reader reader) throws SQLException {
StringBuffer command = new StringBuffer();
try {
BufferedReader lineReader = new BufferedReader(reader);
String line;
while ((line = lineReader.readLine()) != null) {
command = handleLine(command, line);
}
checkForMissingLineTerminator(command);
} catch (Exception e) {
String message = "Error executing: " + command + ". Cause: " + e;
printlnError(message);
throw new SQLException(message, e);
}
}
private StringBuffer handleLine(StringBuffer command, String line) throws SQLException, UnsupportedEncodingException {
String trimmedLine = line.trim();
if (lineIsComment(trimmedLine)) {
println(trimmedLine);
} else if (commandReadyToExecute(trimmedLine)) {
command.append(line.substring(0, line.lastIndexOf(delimiter)));
command.append(LINE_SEPARATOR);
println(command);
executeStatement(command.toString());
command.setLength(0);
} else if (trimmedLine.length() > 0) {
command.append(line);
command.append(LINE_SEPARATOR);
}
return command;
}
private void executeStatement(String command) throws SQLException, UnsupportedEncodingException {
boolean hasResults = false;
Statement statement = connection.createStatement();
hasResults = statement.execute(command);
printResults(statement, hasResults);
statement.close();
}
Update2:
It's not possible to change the data types. The database is part of other systems and already with data.
The database is installed on 7 different servers on three of it that the data is inserted using Linux in a UTF-8 shell the data was inserted correctly from db2 command line.
From windows db2 command line or using Java it's not possible to insert the characters correctly.
Changing the Java sources to UTF-8 source makes the System.out prints the SQL correctly like i see debugging the sql variable.
When i insert this test SQL. It is shown correctly with chines characters in the System.out and in the Statement internal variable
INSERT INTO LANGS (IDIOMA,PAIS,TRADUC) VALUES ('zh','TW','TEST1 其他 FIN TEST1');
But in the database the test appears as:
TEST3 FIN TEST3
HEX reprentation:
54 45 53 54 33 20 1A 1A 1A 1A 1A 1A 1A 1A 20 46 49 4E 20 54 45 53 54 33
T E S T 3 _ ? ? ? ? ? ? ? ? _ F I N _ T E S T 3
I think that probably DB2 Java client is using allways Windows codepage (in this case is ISO-8859-1 or cp1252) instead of UTF-8 or the server is converting the data using the main collate instead the alternative collate of the table.
Update3:
I installed a Java SQL tool called DbVisualizer and using this tool on windows when a paste in the SQL panel the SQL and run it is inserted correctly in the databse.
This makes me to suspect that is not a problem of installation or data types. Probably are one of this three factors.
Upvotes: 1
Views: 14668
Reputation: 2876
Problem is solved using these steps:
Use always db2jcc4.jar not db2jcc.jar (JDBC 4)
Set the environment variable DISABLEUNICODE=0
There is a complete information in this page Understanding DB2 Universal Database character conversion about unicode on DB2
Upvotes: 2