Reputation: 736
There is a file which has special character as delimiter (§) utfCode -> 0xA7
Snapshot of file is below
"Diablo"§"tRaider"§"2019-08-12"
"GOT"§"BeltMorham"§"2019-01-02"
"Tomb Raider"§"RealMason"§"2019-04-02"
Now The fileformat is below
Create FILE FORMAT GamerFF
SET COMPRESSION = 'AUTO'
FIELD_DELIMITER = '§'
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE
ESCAPE = 'NONE'
ESCAPE_UNENCLOSED_FIELD = '\134'
DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N');
However when I try to read from the file.
select $1 from stage (file_format=>gamerFF).
Found character '\u00c2' instead of record delimiter '\n' File 'gngamer.txt', line 1, character 4 Row 1, column "TRANSIENT_STAGE_TABLE"["$1":1]
It complains for another character Â, utfcode -> \u00c2 which when I update the format with new delimiter works for first column.
But as soon as I try to read next column it throw the error. about error code (§)
select $1,$2 from stage (file_format=>gamerFF).
Invalid UTF8 detected in string '0xA7"1"' File 'gngamer.txt', line 1, character 5 Row 1, column "TRANSIENT_STAGE_TABLE"["$2":2]
using validate_utf=false does not help as it is bringing the special character in the field value.
Its look now I need two delimiter. I cannot change the input file.
Can any body please suggest
Upvotes: 1
Views: 1323
Reputation: 10144
You need to set FIELD_DELIMITER to '\xc2\xa7':
create or replace FILE FORMAT GamerFF COMPRESSION = 'AUTO'
FIELD_DELIMITER = '\xc2\xa7'
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE
ESCAPE = 'NONE'
ESCAPE_UNENCLOSED_FIELD = '\134'
DATE_FORMAT = 'AUTO' TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N');
Upvotes: 2