Reputation: 11788
I have pipe delimited data which I am trying to copy from S3 to RedShift but COPY operation is failing with error code 1202 : Extra column(s) found
error.
When I looked into stl_load_errors
, copy operation failed for rows which have | delimiter in them.
Sample data:
1|hello world|how|are you|
2|"hope|you|are|doing|good"|thank you|
3|I am fine|thank you|
For above mentioned data, 2|"hope|you|are|doing|good"|thank you|
fails to get copied because it has | delimiter in itself even though it is inside double quotes.
My copy command looks like below:
COPY <DATABASE.TABLE NAME>
FROM 's3://path/to/file'
iam_role 'arn:aws:iam:my_role'
delimiter '|'
dateformat 'auto'
IGNOREHEADER 1
MAXERROR 5;
AWS RedShift documentation has one example to load this type of data but they have used CSV option, not DELIMITER option.
How can I solve this issue?
Upvotes: 1
Views: 1917
Reputation: 269101
You should add the REMOVEQUOTES
parameter.
From Data Conversion Parameters - Amazon Redshift:
Removes surrounding quotation marks from strings in the incoming data. All characters within the quotation marks, including delimiters, are retained.
Upvotes: 5