Reputation: 37
I'm trying to import data into Snowflake using the copy command. I have a file format defined as follows:
CREATE FILE FORMAT mydb.schema1.myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\042'
TRIM_SPACE = FALSE
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\241'
ESCAPE_UNENCLOSED_FIELD = NONE
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
COMMENT = '¡ used as escape character';
There's nothing special about the file format, except it's using ¡ as an escape character.
When importing data with this file format, it seems Snowflake is not recognizing the escape character, and it's throwing an error saying "Found character 'XYZ' instead of field delimiter ','".
I tried creating a file with 1 line, like the following:
"ABC123","584382","2","01","01/22/2019","02/08/2019","02/08/2019","04/03/2019","04/03/2019","TEST","Unknown","Unknown","01-884400","Unknown","DACRON CONNECTIONS 15¡"1/2 DIA. X 11¡" LONG FOR EXHAUST DAMPER","","0.0","0.0","0.0","0.0","192.0","USD","2.0","2.0","0","0","96.00000","1","","","","","07882-0047","ASDF","ASDF","02/27/2019","04/06/2021","01/01/1970","0"
This file fails on line 1, char 167, which is right after the first escape character (before the 1 in the following text: CONNECTIONS 15¡"1/2)
Any idea why this is happening?
This is the code I'm running to do the copy
copy into mydb.schema1.mytable from @mydb.schema1.mystage/file-path/2021-05-26/test.txt
file_format = mydb.schema1.myFileFormat
validation_mode = 'return_all_errors';
Upvotes: 1
Views: 742
Reputation: 6229
Short Answer Looks like Snowflake only allows single-byte characters to be used as an escape character for a file format. The character you're using as the escape character uses two bytes and therefore isn't allowed as an escape character by the file format.
You can however use multi-byte characters for field and row delimiters so not sure why Snowflake hasn't allowed it as the escape character as well.
Longer Answer
The character you're trying to use as the escape character (¡) is two bytes long with a hex value of \xC2\xA1
. This isn't allowed as you can see by the following error:
CREATE OR REPLACE FILE FORMAT myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\x22' -- Double quotes (")
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\xC2\xA1' -- Inverted exclamation point (¡)
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
invalid value ['\xC2\xA1'] for parameter 'ESCAPE'
On the other hand, if I use the last single-byte character I could possibly use (and is visible), the tilde (~), with a hex value of \x7E
(you'd think it should be \xFF
but utf-8 uses 7 bits before it goes into 2 bytes. Long story.) then it works fine. I tested this with a file and copy command and it works without issue.
CREATE OR REPLACE FILE FORMAT myFileFormat
TYPE = CSV
COMPRESSION = 'AUTO'
FIELD_DELIMITER = ','
RECORD_DELIMITER = '\n'
SKIP_HEADER = 0
FIELD_OPTIONALLY_ENCLOSED_BY = '\x22' -- Double quotes (")
ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE
ESCAPE = '\x7E' -- Tilde (~)
DATE_FORMAT = 'AUTO'
TIMESTAMP_FORMAT = 'AUTO'
NULL_IF = ('\\N')
[2021-05-26 23:49:21] completed in 149 ms
Upvotes: 2