Reputation: 93
I'm trying to insert czech characters into my database from an external .csv.
If I print the unicode from the word "Sedláček" directly in a worksheet, I obtain "Sedl\00E1\010Dek". But if I print this same word imported from the .csv, I have "Sedl\00C3\00A1\00C4\008Dek" which gives in my database "SedláÄek".
Why is this different ? How can I have the right word from the .csv?
This is my code for the import :
set serveroutput on;
CREATE OR REPLACE DIRECTORY MYREP as '/foo';
DECLARE
f utl_file.file_type;
v_line NVARCHAR2(1000);
BEGIN
f := utl_file.fopen ('MYREP', 'toto.csv', 'R');
IF utl_file.is_open(f) THEN
LOOP
utl_file.get_line(f, v_line, 1000);
dbms_output.put_line(ASCIISTR(v_line));
END LOOP;
END IF;
utl_file.fclose(f);
END;
and in my worksheet :
set serveroutput on
BEGIN
dbms_output.put_line(ASCIISTR('Sedláček'));
END;
Many thanks for your attention!
Upvotes: 1
Views: 518
Reputation: 191275
From the documentation for UTL_FILE
:
UTL_FILE
expects that files opened byUTL_FILE.FOPEN
in text mode are encoded in the database character set. It expects that files opened byUTL_FILE.FOPEN_NCHAR
in text mode are encoded in the UTF8 character set. If an opened file is not encoded in the expected character set, the result of an attempt to read the file is indeterminate.
Your use of an nvarchar2
variable indicates you need to be using the national character set, so you need to use the FOPEN_NCHAR
function to open the file, and the FOPEN.GET_LINE_NCHAR
procedure to read from it.
BEGIN
f := utl_file.fopen_nchar('MYREP', 'toto.csv', 'R');
IF utl_file.is_open(f) THEN
LOOP
utl_file.get_line_nchar(f, v_line, 1000);
dbms_output.put_line(ASCIISTR(v_line));
END LOOP;
END IF;
utl_file.fclose(f);
END;
Upvotes: 3