Simon1979
Simon1979

Reputation: 2108

Manipulating Data Within AWS Redshift to a Schedule

Current Setup:

Our customers only have access to the OLAP Db

Requirement:

One customer requires some additional tables to be created and populated to a schedule which can be done by aggregating the data already in AWS Redshift.

Challenge:

This is only for one customer so I cannot leverage the core process for populating AWS; the process must be independent and is to be handed over to the customer who do not use SSIS and don't wish to start. I was considering using Data Pipeline but this is not yet available in the market in which the customer resides.

Question:

What is my alternative? I am aware of numerous partners who offer ETL like solutions but this seems over the top, ultimately all I want to do is execute a series of SQL statements on a schedule with some form of error handling/ alert. Preference of both customer and management is to not use a bespoke app to do this, hence the intended use of Data Pipeline.

Upvotes: 2

Views: 703

Answers (2)

Rahul Gupta
Rahul Gupta

Reputation: 1802

I do exactly the same thing as you, but I use lambda service to perform my ETL. One drawback of lambda service is, it can run max of 5 mins (Initially 1 min) only.

So for ETL's greater than 5 minutes, I am planning to set up PHP server in AWS and with SQL injection I can run my queries, scheduled at any time with help of cron function.

Upvotes: 1

user667914
user667914

Reputation: 46

For exporting data from AWS Redshift to another data source using datapipeline you can follow a template similar to https://github.com/awslabs/data-pipeline-samples/tree/master/samples/RedshiftToRDS using which data can be transferred from Redshift to RDS. But instead of using RDSDatabase as the sink you could add a JdbcDatabase (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-jdbcdatabase.html). The template https://github.com/awslabs/data-pipeline-samples/blob/master/samples/oracle-backup/definition.json provides more details on how to use the JdbcDatabase.

There are many such templates available in https://github.com/awslabs/data-pipeline-samples/tree/master/samples to use as a reference.

Upvotes: 1

Related Questions