How to export data from AWS Aurora Postgres DB to Redshift?

Question

I have a Postgres DB hosted on AWS Aurora from which I need to retrieve data and insert it into Redshift.

My current approach is as follows:

Create an Aurora DB connection using Psycopg2.
With Aurora connection created above, query the Aurora DB table and export the resultset as a CSV file to S3 using OUTFILE
From S3, Redshift connection using Psycopg2.

I'm trying to optimize this by removing the S3 service and connecting Aurora to Redshift directly.

Here's what I want to do for which I couldn't find resources:

Query the Aurora table - table1 and directly export the result set into the Redshift table - table1.

I'm not even sure if this is possible with the current system. Any thoughts?

John Rotenstein · Accepted Answer

There are two ways to get data into an Amazon Redshift database:

COPY command to load from Amazon S3
INSERT statement to insert data provided as part of the SQL statement

The COPY method is recommended for normal data loading. It runs in parallel across slices and stores the data as efficiently as possible given that it is appending data.

The INSERT command is acceptable for a small number of inserts, but not a good idea for inserting lots of rows. Where possible, insert multiple rows at a time. It is acceptable to use INSERT ... SELECT statements, which can insert bulk data from a different table in one operation.

So, the only way to remove Amazon S3 from your operation is to code the data into an INSERT statement, but this is not an optimal way to load the data.

How to export data from AWS Aurora Postgres DB to Redshift?

Answers (1)

Related Questions