itwentviral
itwentviral

Reputation: 31

Building temporary table from S3 file to Redshift

I'm running into consistent errors trying to import a single column of data (all numerical) from S3 into Redshift in the form of a temporary table. Here is my query:

CREATE TEMP TABLE tmp_userid(userid bigint);
COPY tmp_userid (userid) FROM 's3://name/recent_prem_idsonly.txt'
CREDENTIALS 'aws_access_key_id=XXX;aws_secret_access_key=XXX';

which results in:

ERROR: Load into table 'tmp_userid' failed.  Check 'stl_load_errors' 
system table for details.  SQL state: XX000`

which then traced shows

Error Code 1207: Invalid digit, Value '8', Pos 7, Type: Long

I noticed that futzing with the data type for userid will drive different errors, so perhaps there's something off there? I've experimented with saving the file as csv and txt.

Upvotes: 3

Views: 8834

Answers (4)

rpavi
rpavi

Reputation: 1

You can also CREATE a TEMP table with the same schema as the S3 file and then use the COPY command to push the data to that TEMP table

CREATE TEMP TABLE test_table 
(
  userid          VARCHAR(10)
);
    

COPY test_table (userid) FROM 's3://name/recent_prem_idsonly.txt'
CREDENTIALS 'aws_access_key_id=XXX;aws_secret_access_key=XXX';

Upvotes: 0

itwentviral
itwentviral

Reputation: 31

I realized that the issue was driven by the csv featuring more [empty] columns than Redshift liked. Removing them caused the temp table to be created properly.

Upvotes: 0

Dolfa
Dolfa

Reputation: 792

Do you have by any chance quotes around values in txt/csv file? If yes, remove them, or add 'removequotes' to copy command. (I received same error with quotes in csv file.)

This should work even with temp table, I am doing this all the time (load data to temp table with copy, join them with existing table in Redshift, update/insert to permanent table in Redshift)

Upvotes: 0

Joe Harris
Joe Harris

Reputation: 14035

Temp tables only exist within a single connection. Redshift COPY loads data using an 'out of band' connection that cannot see your temp table. Just use a real table with a "_tmp" suffix and drop it when you're done.

Upvotes: -1

Related Questions