Reputation: 111
Has anybody worked on creating a Python script to load data from S3 into Redshift tables for multiple files? How can we achieve it with the AWS CLI? Your learnings and inputs on this are appreciated.
Upvotes: 1
Views: 2497
Reputation: 269081
The COPY command is the best way to load data from Amazon S3 to Amazon Redshift. It can load multiple files in parallel into a single table.
Use any Python library (e.g. Psycopg, the PostgreSQL adapter for Python) to connect to Amazon Redshift, then issue the COPY command.
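As a rough illustration of that approach, here is a minimal sketch using psycopg2. The table names, S3 prefixes, IAM role ARN, connection parameters, and the CSV options are all hypothetical placeholders, and the helper names build_copy_sql and load_tables are made up for this example. It relies on the fact that COPY treats the S3 path as a key prefix, so every object under that prefix is loaded into the table.

```python
def build_copy_sql(table, s3_prefix, iam_role,
                   options="FORMAT AS CSV IGNOREHEADER 1"):
    """Build a COPY statement for one table.

    The FROM value is an S3 key prefix, so all objects whose keys start
    with it are loaded. The default options are an assumption (CSV files
    with one header row); adjust them to your data.
    Values are interpolated directly, so pass only trusted names here.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"{options};"
    )


def load_tables(conn_params, table_to_prefix, iam_role):
    """Run one COPY per table over a single Redshift connection.

    conn_params is a dict of psycopg2.connect() keyword arguments
    (host, port, dbname, user, password); table_to_prefix maps each
    target table to the S3 prefix holding its files.
    """
    import psycopg2  # third-party: pip install psycopg2-binary

    conn = psycopg2.connect(**conn_params)
    try:
        with conn.cursor() as cur:
            for table, prefix in table_to_prefix.items():
                cur.execute(build_copy_sql(table, prefix, iam_role))
        conn.commit()  # commit only after every COPY has succeeded
    finally:
        conn.close()  # closing without commit discards a failed load
```

You would call something like load_tables({...connection details...}, {"sales": "s3://my-bucket/sales/", "customers": "s3://my-bucket/customers/"}, "arn:aws:iam::123456789012:role/MyRedshiftRole"), where the bucket, tables, and role are stand-ins for your own. The role must be attached to the cluster and allowed to read the bucket.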
The AWS Command-Line Interface (CLI) does not have the ability to run the COPY command on Redshift because it needs to be issued to the database, while the AWS CLI issues commands to AWS. (The AWS CLI can be used to launch/terminate a Redshift cluster, but not to connect to the cluster itself.)
Upvotes: 2