Shekhar
Shekhar

Reputation: 11788

How to copy data in Redshift using DELIMITER option which has delimiter in the data itself?

I have pipe delimited data which I am trying to copy from S3 to RedShift but COPY operation is failing with error code 1202 : Extra column(s) found error.

When I looked into stl_load_errors, copy operation failed for rows which have | delimiter in them.

Sample data:

1|hello world|how|are you|
2|"hope|you|are|doing|good"|thank you|
3|I am fine|thank you|

For above mentioned data, 2|"hope|you|are|doing|good"|thank you| fails to get copied because it has | delimiter in itself even though it is inside double quotes.

My copy command looks like below:

COPY <DATABASE.TABLE NAME>
FROM 's3://path/to/file'
iam_role 'arn:aws:iam:my_role'
delimiter '|'
dateformat 'auto'
IGNOREHEADER 1
MAXERROR 5;

AWS RedShift documentation has one example to load this type of data but they have used CSV option, not DELIMITER option.

How can I solve this issue?

Upvotes: 1

Views: 1917

Answers (1)

John Rotenstein
John Rotenstein

Reputation: 269101

You should add the REMOVEQUOTES parameter.

From Data Conversion Parameters - Amazon Redshift:

Removes surrounding quotation marks from strings in the incoming data. All characters within the quotation marks, including delimiters, are retained.

Upvotes: 5

Related Questions