danD
danD

Reputation: 736

Snowflake csv delimitter issue due to special character

There is a file which has special character as delimiter (§) utfCode -> 0xA7

Snapshot of file is below

"Diablo"§"tRaider"§"2019-08-12"
"GOT"§"BeltMorham"§"2019-01-02"
"Tomb Raider"§"RealMason"§"2019-04-02"

Now The fileformat is below

Create FILE FORMAT GamerFF
SET COMPRESSION = 'AUTO' 
FIELD_DELIMITER = '§'
RECORD_DELIMITER = '\n' 
SKIP_HEADER = 0 
FIELD_OPTIONALLY_ENCLOSED_BY = '\042' 
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE 
ESCAPE = 'NONE' 
ESCAPE_UNENCLOSED_FIELD = '\134' 
DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO' 
NULL_IF = ('\\N');

However when I try to read from the file.

select $1 from stage (file_format=>gamerFF).
Found character '\u00c2' instead of record delimiter '\n' File 'gngamer.txt', line 1, character 4 Row 1, column "TRANSIENT_STAGE_TABLE"["$1":1]

It complains for another character Â, utfcode -> \u00c2 which when I update the format with new delimiter works for first column.

But as soon as I try to read next column it throw the error. about error code (§)

select $1,$2 from stage (file_format=>gamerFF).
Invalid UTF8 detected in string '0xA7"1"' File 'gngamer.txt', line 1, character 5 Row 1, column "TRANSIENT_STAGE_TABLE"["$2":2]

using validate_utf=false does not help as it is bringing the special character in the field value.

Its look now I need two delimiter. I cannot change the input file.

Can any body please suggest

Upvotes: 1

Views: 1323

Answers (1)

Gokhan Atil
Gokhan Atil

Reputation: 10144

You need to set FIELD_DELIMITER to '\xc2\xa7':

create or replace FILE FORMAT GamerFF COMPRESSION = 'AUTO' 
FIELD_DELIMITER = '\xc2\xa7'
RECORD_DELIMITER = '\n' 
SKIP_HEADER = 0 
FIELD_OPTIONALLY_ENCLOSED_BY = '\042' 
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE 
ESCAPE = 'NONE' 
ESCAPE_UNENCLOSED_FIELD = '\134' 
DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO' 
NULL_IF = ('\\N');

Upvotes: 2

Related Questions