Hemant
Hemant

Reputation: 19826

How to save unicode data to oracle?

I am trying to save unicode data (greek) in oracle database (10 g). I have created a simple table:

alt text

I understand that NVARCHAR2 always uses UTF-16 encoding so it must be fine for all (human) languages.

Then I am trying to insert a string in database. I have hardcoded the string ("How are you?" in Greek) in code. Then I try to get it back from database and show it.

class Program
{
    static string connectionString = "<my connection string>";

    static void Main (string[] args) {
        string textBefore = "Τι κάνεις;";

        DeleteAll ();
        SaveToDatabase (textBefore);
        string textAfter = GetFromDatabase ();

        string beforeData = String.Format ("Before: {0}, ({1})", textBefore, ToHex (textBefore));
        string afterData = String.Format ("After: {0}, ({1})", textAfter, ToHex (textAfter));

        Console.WriteLine (beforeData);
        Console.WriteLine (afterData);

        MessageBox.Show (beforeData);
        MessageBox.Show (afterData);

        Console.ReadLine ();
    }

    static void DeleteAll () {
        using (var oraConnection = new OracleConnection (connectionString)) {
            oraConnection.Open ();
            var command = oraConnection.CreateCommand ();

            command.CommandText = "delete from UNICODEDATA";
            command.ExecuteNonQuery ();
        }            
    }

    static void SaveToDatabase (string stringToSave) {
        using (var oraConnection = new OracleConnection (connectionString)) {
            oraConnection.Open ();
            var command = oraConnection.CreateCommand ();

            command.CommandText = "INSERT into UNICODEDATA (ID, UNICODESTRING) Values (11, :UnicodeString)";
            command.Parameters.Add (":UnicodeString", stringToSave);
            command.ExecuteNonQuery ();
        }
    }

    static string GetFromDatabase () {
        using (var oraConnection = new OracleConnection (connectionString)) {
            oraConnection.Open ();

            var command = oraConnection.CreateCommand ();
            command.CommandText = "Select * from UNICODEDATA";
            var erpReader = command.ExecuteReader ();

            string s = String.Empty;
            while (erpReader.Read ()) {
                string text = erpReader.GetString (1);
                s += text + ", ";
            }

            return s;
        }
    }

    static string ToHex (string input) {
        string bytes = String.Empty;
        foreach (var c in input)
            bytes += ((int)c).ToString ("X4") + " ";

        return bytes;
    }
}

Here are different outputs:

Text before sending to database in a message box: alt text

Text after getting from database in a message box: alt text

Console Output: alt text

Please can you suggest what I might be doing wrong here?

Upvotes: 8

Views: 18878

Answers (6)

Veljac
Veljac

Reputation: 1190

Solution: set NLS_LANG!

Details: I just had the same problem, and actually had exact the same situation as described in Sergey Bazarnik's investigation. Using bind variables it works, and without them it doesn't.

The SOLUTION is to set NLS_LANG in proper place. Since I have Windows server I set it in windows registry under HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\ORACLE\KEY_OraClient11g_home1

Please note that regitry location may difer so the easiest way is to search registry for "ORACLE_HOME" string. Also other systems like Linux, Unix can set this on different way (export NLS_LANG ...)

In my case I put "NLS_LANG"="CROATIAN_CROATIA.UTF8". Since I had no that variable set it went to default value. After changing registry you should restart process. In my case I restarted IIS.

Regarding reason why it works with bind variables may be because it actually happens on server side, while without it actually happens on client side. So even that DB can insert proper values - before that happens, client does the unwanted corrections, since it thinks that is should do that. That is because NLS_LANG defaults to simpler code page. But instead of doing useful task, that creates a problem, which (as shown in investigation looks hard to understand).

In case you have multiple oracle versions, be sure to correct all versions in registry (in my case Oracle 10 had valid setting, but Oracle 11 had no NLS_LANG set at all).

Upvotes: 0

Sergey Bazarnik
Sergey Bazarnik

Reputation: 56

After some investigations here we go:

string input = "•"; char s = input[0];

       //table kuuku with column kuku(nvarchar2(100))
        string connString = "your connection";

        //CLEAN TABLE
        using (System.Data.OracleClient.OracleConnection cn = new System.Data.OracleClient.OracleConnection(connString))
        {
            cn.Open();
            System.Data.OracleClient.OracleCommand cmd = new System.Data.OracleClient.OracleCommand("delete from  kuku ", cn);
            cmd.ExecuteNonQuery();
            cn.Close();
        }


        //INSERT WITH PARAMETER BINDING - UNICODE SAVED
        using (System.Data.OracleClient.OracleConnection cn = new System.Data.OracleClient.OracleConnection(connString))
        {
            cn.Open();
            System.Data.OracleClient.OracleCommand cmd = new System.Data.OracleClient.OracleCommand("insert into  kuku (kuku) values(:UnicodeString)", cn);
            cmd.Parameters.Add(":UnicodeString", System.Data.OracleClient.OracleType.NVarChar).Value = input + " OK" ;
            cmd.ExecuteNonQuery();
            cn.Close();
        }

        //INSERT WITHOUT PARAMETER BINDING - UNICODE NOT SAVED
        using (System.Data.OracleClient.OracleConnection cn = new System.Data.OracleClient.OracleConnection(connString))
        {
            cn.Open();
            System.Data.OracleClient.OracleCommand cmd = new System.Data.OracleClient.OracleCommand("insert into  kuku (kuku) values('" +input+" WRONG')", cn);
            cmd.ExecuteNonQuery();
            cn.Close();
        }
        //FETCH RESULT
        using (System.Data.OracleClient.OracleConnection cn = new System.Data.OracleClient.OracleConnection(connString))
        {
            cn.Open();
            System.Data.OracleClient.OracleCommand cmd = new System.Data.OracleClient.OracleCommand("select kuku from kuku", cn);
            System.Data.OracleClient.OracleDataReader dr = cmd.ExecuteReader();
            if(dr.Read())
            {
                string output = (string) dr[0];
                char sa = output[0];
            }
            cn.Close();
        }
    }

PL SQL look

Upvotes: 3

TenB
TenB

Reputation: 21

On reading records, try

Encoding utf = Encoding.Default;   
var utfBytes = odatareader.GetOracleString(0).GetNonUnicodeBytes();//OracleDataReader
Console.WriteLine(utf.GetString(utfBytes));

Upvotes: 0

Noam
Noam

Reputation: 5266

One more thing worth noting.

If you are using oracle client, and would like to include unicode characters in the CommandText, you should add the folloing line to the start of your application:

System.Environment.SetEnvironmentVariable("ORA_NCHAR_LITERAL_REPLACE", "TRUE");

This will allow you, in case you need it, to use the following syntax:

command.CommandText = "INSERT into UNICODEDATA (ID, UNICODESTRING) Values (11, N'Τι κάνεις;')";

Upvotes: 1

Jon Skeet
Jon Skeet

Reputation: 1499790

I can see five potential areas for problems:

  1. How are you actually getting the text into your .NET application? If it's hardcoded in a string literal, are you sure that the compiler is assuming the right encoding for your source file?

  2. There could be a problem in how you're sending it to the database.

  3. There could be a problem with how it's being stored in the database.

  4. There could be a problem with how you're fetching it in the database.

  5. There could be a problem with how you're displaying it again afterwards.

Now areas 2-4 sound like they're less likely to be an issue than 1 and 5. How are you displaying the text afterwards? Are you actually fetching it out of the database in .NET, or are you using Toad or something similar to try to see it?

If you're writing it out again from .NET, I suggest you skip the database entirely - if you just display the string itself, what do you see?

I have an article you might find useful on debugging Unicode problems. In particular, concentrate on every place where the encoding could be going wrong, and make sure that whenever you "display" a string you dump out the exact Unicode characters (as integers) so you can check those rather than just whatever your current font wants to display.

EDIT: Okay, so the database is involved somewhere in the problem.

I strongly suggest that you remove anything like ASP and HTML out of the equation. Write a simple console app that does nothing but insert the string and fetch it again. Make it dump the individual Unicode characters (as integers) before and after. Then try to see what's in the database (e.g. using Toad). I don't know the Oracle functions to convert strings into sequences of individual Unicode characters and then convert those characters into integers, but that would quite possibly be the next thing I'd try.

EDIT: Two more suggestions (good to see the console app, btw).

  1. Specify the data type for the parameter, instead of just giving it an object. For instance:

    command.Parameters.Add (":UnicodeString",
                            OracleType.NVarChar).Value = stringToSave;
    
  2. Consider using Oracle's own driver instead of the one built into .NET. You may wish to do this anyway, as it's generally reckoned to be faster and more reliable, I believe.

Upvotes: 6

Vincent Malgrat
Vincent Malgrat

Reputation: 67722

You can determine what characterset your database uses for NCHAR with the query:

SQL> SELECT VALUE
  2    FROM nls_database_parameters
  3   WHERE parameter = 'NLS_NCHAR_CHARACTERSET';

VALUE
------------
AL16UTF16

to check if your database configuration is correct, you could run the following in SQL*Plus:

SQL> CREATE TABLE unicodedata (ID NUMBER, unicodestring NVARCHAR2(100)); 

Table created
SQL> INSERT INTO unicodedata VALUES (11, 'Τι κάνεις;');

1 row inserted
SQL> SELECT * FROM unicodedata;

        ID UNICODESTRING
---------- ---------------------------------------------------
        11 Τι κάνεις;

Upvotes: 2

Related Questions