Oracle : Insert czech characters from external source

Question

I'm trying to insert czech characters into my database from an external .csv.

If I print the unicode from the word "Sedláček" directly in a worksheet, I obtain "Sedl\00E1\010Dek". But if I print this same word imported from the .csv, I have "Sedl\00C3\00A1\00C4\008Dek" which gives in my database "SedlÃ¡Äek".

Why is this different ? How can I have the right word from the .csv?

This is my code for the import :

set serveroutput on;

CREATE OR REPLACE DIRECTORY MYREP as '/foo';

DECLARE
    f utl_file.file_type;
    v_line NVARCHAR2(1000);

BEGIN
f := utl_file.fopen ('MYREP', 'toto.csv', 'R');
    IF utl_file.is_open(f) THEN
      LOOP
          utl_file.get_line(f, v_line, 1000);
          dbms_output.put_line(ASCIISTR(v_line));
      END LOOP;
    END IF;
utl_file.fclose(f);
END;

and in my worksheet :

set serveroutput on

BEGIN
   dbms_output.put_line(ASCIISTR('Sedláček')); 
END;

Many thanks for your attention!

Alex Poole · Accepted Answer

From the documentation for UTL_FILE:

UTL_FILE expects that files opened by UTL_FILE.FOPEN in text mode are encoded in the database character set. It expects that files opened by UTL_FILE.FOPEN_NCHAR in text mode are encoded in the UTF8 character set. If an opened file is not encoded in the expected character set, the result of an attempt to read the file is indeterminate.

Your use of an nvarchar2 variable indicates you need to be using the national character set, so you need to use the FOPEN_NCHAR function to open the file, and the FOPEN.GET_LINE_NCHAR procedure to read from it.

BEGIN
    f := utl_file.fopen_nchar('MYREP', 'toto.csv', 'R');
    IF utl_file.is_open(f) THEN
      LOOP
          utl_file.get_line_nchar(f, v_line, 1000);
          dbms_output.put_line(ASCIISTR(v_line));
      END LOOP;
    END IF;
    utl_file.fclose(f);
END;

Oracle : Insert czech characters from external source

Answers (1)

Related Questions