user1728464

Reputation: 11

Issue using csvread in H2 database with a Unicode control character as the delimiter

I am trying to use csvread to read the attached test file. The file uses CTRL-A (\u0001) as the delimiter between fields.

The statement I used is

select * from csvread('test.csv','id, name','charset=UTF-8 fieldDelimiter=\u0001');

Expected Output:

ID | Name
12 | sandeep

Actual output:

ID | Name
12\u0001sandeep | null

which means \u0001 is not being picked up as the delimiter.

How can I use CTRL-A (\u0001) as the delimiter when reading a file with csvread?

Upvotes: 1

Views: 1476

Answers (1)

Thomas Mueller

Reputation: 50087

You could use:

select * from csvread('~/Downloads/test.csv',
stringdecode('id\u0001name'),
stringdecode('charset=UTF-8 fieldSeparator=\u0001'));

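You need stringdecode because ANSI SQL text literals do not support Java escape sequences such as \u0001; without it, the literal is passed through as the six characters backslash, u, 0, 0, 0, 1. The column list is split using the same field separator, which is why it is written as id\u0001name. Also, you used fieldDelimiter where fieldSeparator is needed: fieldSeparator is the character between fields, while fieldDelimiter is the character that encloses field values (see the docs).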
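
As a quick check of the escape-sequence point, a minimal sketch (assuming an H2 console or any H2 connection; the column aliases are only illustrative) compares the length of the plain literal with the length of the decoded one:

```sql
-- A plain SQL literal keeps the backslash sequence as ordinary text,
-- while stringdecode turns it into the single CTRL-A character.
select length('\u0001') as plain_length,                  -- 6 characters
       length(stringdecode('\u0001')) as decoded_length;  -- 1 character
```

If the two lengths differ as described, the option string passed to csvread without stringdecode never contained a real CTRL-A in the first place.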

Upvotes: 1
