Reputation: 332
I have many files to load in S3. And I have created manifest file at each prefix of the files.
for instance, at s3://my-bucket/unit_1 I have files like below.
chunk1.csv.gz
chunk2.csv.gz
chunk3.csv.gz
cunkk4.csv.gz
unit.manifest
so with copy command, I can load the unit_1
files to redshift
However, I got more than 1000 units so I want to do it with loop. So I want to make loop that iterate from 1 to 1000 to change just prefix of the manifest file.
So I did like below,
create or replace procedure copy_loop()
language plpgsql
as $$
BEGIN
FOR i in 1..1000 LOOP
COPY mytable
FROM 's3://my-bucket/unit_%/unit.manifest', i
credentials 'aws_iam_role=arn:aws:iam::myrolearn'
MANIFEST
REGION 'ap-northeast-2'
REMOVEQUOTES
IGNOREHEADER 1
ESCAPE
DATEFORMAT 'auto'
TIMEFORMAT 'auto'
GZIP
DELIMITER '|'
ACCEPTINVCHARS '?'
COMPUPDATE FALSE
STATUPDATE FALSE
MAXERROR 0
BLANKSASNULL
EMPTYASNULL
NULL AS '\N'
EXPLICIT_IDS;
END LOOP;
END;
$$;
But I got this message
SQL Error [500310] [42601]: Amazon Invalid operation: syntax error at or near ",";
How can I handle this?
Upvotes: 1
Views: 780
Reputation: 332
This is my solution.
create or replace procedure copy_loop(i1 int, i2 int)
language plpgsql
as $$
DECLARE
prefix TEXT := 's3://mybucket/unit_';
manifest TEXT := '/unit.manifest' ;
manifest_location TEXT ;
copy_commands VARCHAR(2000) ;
copy_options VARCHAR(2000) := 'credentials '|| quote_literal('aws_iam_role=myrolearn')
|| ' MANIFEST '
|| ' REGION ' || quote_literal('ap-northeast-2')
|| ' REMOVEQUOTES '
|| ' IGNOREHEADER 1 '
|| ' ESCAPE '
|| ' DATEFORMAT ' || quote_literal('auto')
|| ' TIMEFORMAT ' || quote_literal('auto')
|| ' GZIP '
|| ' DELIMITER ' || quote_literal('|')
|| ' ACCEPTINVCHARS ' || quote_literal('?')
|| ' COMPUPDATE FALSE '
|| ' STATUPDATE FALSE '
|| ' MAXERROR 0 '
|| ' BLANKSASNULL '
|| ' EMPTYASNULL '
|| ' NULL AS ' || quote_literal('\N')
|| ' EXPLICIT_IDS ';
BEGIN
FOR i in i1..i2 LOOP
manifest_location := prefix || i || manifest;
copy_commands := 'COPY mytable FROM' || quote_literal(manifest_location) || copy_options;
execute copy_commands;
END LOOP;
END;
$$;
using this procedure, I could copy files from more than 1000 units. also set starting number and end number of the loop helped to divide the loading jobs. Since large amount loading takes few hours, I think it is better to do load job with some chunks.
Upvotes: 1