Frollo

Reputation: 29

How to upload data via SQL to Amazon Redshift?

I created a cluster and connected to the database via SQL Workbench, but how can I upload data via SQL to Amazon Redshift?

I guess I have to use Amazon S3 but I could not find a sample video or text that describes it well.

Upvotes: 0

Views: 854

Answers (1)

John Rotenstein

Reputation: 270184

There are two ways to insert information into Amazon Redshift:

  • Via the COPY command
  • Via INSERT statements

INSERT statements are not recommended for loading data because they are inefficient for large data volumes. They are fine for ETL-type processes such as copying data between tables, but as a general rule, data should be loaded via COPY.
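As a rough illustration of the "copying data between tables" case, here is a minimal Python sketch that builds an INSERT ... SELECT statement you could paste into SQL Workbench. The function name and the table and column names (`sales`, `sales_staging`, `sale_id`, `amount`) are made up for the example, and the identifiers are interpolated as-is, so only use it with names you trust.

```python
from typing import Sequence


def insert_select_sql(target: str, source: str, columns: Sequence[str]) -> str:
    """Build an INSERT ... SELECT statement that copies rows between two tables.

    Table and column names here are hypothetical and are not escaped.
    """
    cols = ", ".join(columns)
    return f"INSERT INTO {target} ({cols})\nSELECT {cols}\nFROM {source};"


# Example: move rows from a staging table into the final table.
print(insert_select_sql("sales", "sales_staging", ["sale_id", "amount"]))
```

This runs entirely inside the cluster, which is why INSERT is acceptable here even though it is a poor choice for bringing data in from outside.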

As per "Using a COPY Command to Load Data" in the Redshift documentation, the COPY command can load data from:

  • Amazon S3 (recommended, highly parallel)
  • Amazon EMR (Hadoop)
  • Amazon DynamoDB
  • Remote hosts via SSH

The load from Amazon S3 is performed in parallel across all nodes and is the most efficient way to load data.
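To benefit from the parallel load, the AWS documentation suggests splitting the input into several files under a common S3 prefix rather than uploading one big file. Below is a minimal sketch of that preparation step using only the Python standard library. The function name and file naming scheme are my own, and it assumes a headerless delimited file with one record per line (no embedded newlines inside fields).

```python
import gzip
import os
import tempfile


def split_into_gzip_parts(lines, num_parts, out_dir, prefix="part"):
    """Distribute lines round-robin over num_parts gzip files and return their paths."""
    paths = [
        os.path.join(out_dir, f"{prefix}_{i:04d}.csv.gz") for i in range(num_parts)
    ]
    # newline="" keeps the "\n" we write from being translated on any platform.
    handles = [gzip.open(p, "wt", encoding="utf-8", newline="") for p in paths]
    try:
        for i, line in enumerate(lines):
            handles[i % num_parts].write(line.rstrip("\n") + "\n")
    finally:
        for handle in handles:
            handle.close()
    return paths


# Example: ten rows spread over four compressed part files.
with tempfile.TemporaryDirectory() as tmp_dir:
    rows = [f"{i},item{i}" for i in range(10)]
    for path in split_into_gzip_parts(rows, 4, tmp_dir):
        print(os.path.basename(path))
```

The resulting files would then be uploaded to S3 under one prefix so that a single COPY can pick them all up.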

The Amazon Redshift COPY command can read several file formats:

  • Delimited (e.g. CSV)
  • Fixed-width
  • Avro
  • JSON

Files in these formats can also be compressed (e.g. gzip).
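Each of these formats corresponds to a data format parameter on the COPY command. The sketch below maps a format name to the clauses I would expect to use. The helper name and its arguments are hypothetical, the `'auto'` option for Avro and JSON is just one of the choices COPY accepts, and you should check the COPY reference for the exact options your data needs.

```python
def copy_format_options(file_format, gzip_compressed=False, fixed_width_spec=None):
    """Return the COPY clauses for a file format as a list of strings.

    file_format is one of "csv", "fixedwidth", "avro", "json" (names chosen
    for this sketch). fixed_width_spec looks like "id:5,name:20".
    """
    fmt = file_format.lower()
    if fmt == "csv":
        options = ["FORMAT AS CSV"]
    elif fmt == "fixedwidth":
        if not fixed_width_spec:
            raise ValueError("fixed-width files need a column:width specification")
        options = [f"FIXEDWIDTH '{fixed_width_spec}'"]
    elif fmt == "avro":
        options = ["FORMAT AS AVRO 'auto'"]
    elif fmt == "json":
        options = ["FORMAT AS JSON 'auto'"]
    else:
        raise ValueError(f"unsupported format: {file_format}")
    if gzip_compressed:
        # GZIP tells COPY that the input files are gzip-compressed.
        options.append("GZIP")
    return options


print(copy_format_options("csv", gzip_compressed=True))
print(copy_format_options("fixedwidth", fixed_width_spec="id:5,name:20"))
```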

Bottom line: Get your data into Amazon S3 in a compatible format, then use COPY to load it.
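To make that concrete, here is a minimal sketch that assembles a COPY statement for files already sitting in S3. The bucket, prefix, table name, and IAM role ARN are placeholders I invented. The role must be one that is attached to your cluster and allowed to read the bucket. The generated SQL can be pasted straight into SQL Workbench.

```python
def build_copy_sql(table, s3_uri, iam_role_arn, options=()):
    """Build a Redshift COPY statement that loads from an S3 object or prefix.

    All values are interpolated as-is, so only pass trusted strings.
    """
    lines = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role_arn}'",
    ]
    lines.extend(options)
    return "\n".join(lines) + ";"


# Hypothetical values: replace with your own table, bucket, and role.
sql = build_copy_sql(
    table="sales",
    s3_uri="s3://my-example-bucket/sales/part_",
    iam_role_arn="arn:aws:iam::123456789012:role/MyRedshiftCopyRole",
    options=["FORMAT AS CSV", "GZIP"],
)
print(sql)
```

Because the `FROM` value is treated as a key prefix, every object whose key starts with `sales/part_` would be loaded by this one statement.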

Also, try to understand DISTKEY and SORTKEY to get full performance benefits out of Redshift. Definitely read the manual -- it will save you more time than it takes to read!
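For reference, DISTKEY and SORTKEY are declared when the table is created. The sketch below generates such a CREATE TABLE statement. The table, its columns, and the choice of keys are purely illustrative, since the right keys depend on how you join and filter your data.

```python
def create_table_sql(table, columns, distkey, sortkeys):
    """Build a CREATE TABLE statement with DISTKEY and SORTKEY attributes.

    columns is a list of (name, sql_type) pairs; distkey and sortkeys must
    name columns from that list.
    """
    names = [name for name, _ in columns]
    for key in [distkey, *sortkeys]:
        if key not in names:
            raise ValueError(f"unknown column: {key}")
    col_defs = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n{col_defs}\n)\n"
        f"DISTKEY ({distkey})\n"
        f"SORTKEY ({', '.join(sortkeys)});"
    )


# Hypothetical table: distribute by customer, sort by sale time.
print(
    create_table_sql(
        "sales",
        [("sale_id", "BIGINT"), ("customer_id", "BIGINT"), ("sold_at", "TIMESTAMP")],
        distkey="customer_id",
        sortkeys=["sold_at"],
    )
)
```

The idea is that rows with the same `customer_id` land on the same slice, which helps joins on that column, and that sorting by `sold_at` lets range filters on time skip blocks.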

Upvotes: 1
