Pokuri
Pokuri

Reputation: 3082

utf-8 to utf-16 conversion in oracle

I actually stored resources in DB in UTF-8 format. But when I want to convert them all into UTF-16 now. As the german language is having some characters like 1/4. Now I want to avoid those. I have tried with following the statement, but got some boxes in result string....

> select convert('Inhalt hinzufügen','AL16UTF16LE','AL32UTF8') from dual
  result : it is not allowing me to copy paste it :(. But result is coming properly except boxes in middle of each character

is there any alternative approach?

SELECT *
  FROM v$nls_parameters
 WHERE parameter LIKE '%CHARACTERSET';

indicates that my database character set is WE8MSWIN1252 while my national character set is AL32UTF16.

When I use the DUMP function to view the data that is actually stored in my table, this is the output:

SELECT dump( your_column, 1016 ), your_column
  FROM your_table
 WHERE some_key_column = <<value that gives you the row you're interested in>>

Typ=1 Len=54 CharacterSet=WE8MSWIN1252: 4d,c3,b6,63,68,74,65,6e,20,53,69,65,20,64,69,65,73,65,20,5a,65,69,6c,65,20,77,69‌​,72,6b,6c,69,63,68,20,65,6e,64,67,c3,bc,6c,74,69,67,20,6c,c3,b6,73,63,68,65,6e,3f​, Möchten Sie diese Zeile wirklich endgültig löschen?

Upvotes: 0

Views: 11753

Answers (1)

Justin Cave
Justin Cave

Reputation: 231651

Since your database character set is WE8MSWIN1252, your data is hopefully not actually stored as UTF-8. If the actual data is being stored in a CHAR, VARCHAR2, or CLOB column, the data is either stored using Windows-1252 character set or the data has been stored incorrectly. It is possible that you have configured your NLS environment incorrectly so that you're actually storing UTF-8 data in the database but hopefully not the case here.

Based on the output of the DUMP function, what character do you expect to be stored in the third position of the data? 0xB6 is the data that is actually stored in the database which maps to the paragraph symbol ¶ in the Windows-1252 character set. Assuming that is not the character that you expect, it would appear that the data that is stored in the database has been corrupted.

What language(s) is your data written in? Are all the characters that you want to store present in the Windows-1252 character set?

Are you trying to change how the data is stored? Or are you trying to retrieve the data in a different character set?

If the database character set is AL32UTF8, the national character set is AL32UTF16, and you want to store the data in the database using UTF-16, you'd need to move the data into a NVARCHAR2 or NCLOB column.

If you are trying to store the data in a UTF-8 format in the database but then send it to the client in UTF-16, that can be done automatically by configuring the client's NLS settings. Exactly how you do that will depend on how the client accesses the database (JDBC, ODBC, etc.).

Upvotes: 1

Related Questions