Reputation: 1321
I'm storing some events into DynamoDB. I need to sync (i.e. incrementally copy) the data to Redshift. Ultimately, I want to be able to analyze the data through AWS Quicksight.
I've come across multiple solutions, but they are either one-off (using a one-time COPY command) or real-time (a streaming data pipeline using Kinesis Firehose).
The real-time solution seems superior to hourly sync, but I'm worried about performance and complexity. I was wondering if there's an easier way to batch the updates on an hourly basis.
Upvotes: 0
Views: 906
Reputation: 1624
What you are looking for is DynamoDB Streams (official docs). These can flow seamlessly into Kinesis Firehose, as you have correctly pointed out.
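For reference, the usual glue between the two is a small Lambda function subscribed to the table's stream that forwards each change to Firehose. A minimal sketch, assuming the stream is configured with the NEW_IMAGE view type and a delivery stream named `events-to-redshift` (both are placeholder assumptions, not fixed names):

```python
import json
import boto3

firehose = boto3.client("firehose")

def handler(event, context):
    """Triggered by a DynamoDB Stream; forwards new images to Firehose."""
    records = []
    for record in event["Records"]:
        # NewImage is present for INSERT/MODIFY events when the stream
        # view type includes the new image; REMOVE events carry none.
        new_image = record["dynamodb"].get("NewImage")
        if new_image is None:
            continue
        # Note: this is DynamoDB-typed JSON ({"S": ...}, {"N": ...});
        # the Redshift COPY will need a matching jsonpaths/flattening step.
        records.append({"Data": (json.dumps(new_image) + "\n").encode("utf-8")})

    if records:
        # PutRecordBatch accepts up to 500 records per call.
        firehose.put_record_batch(
            DeliveryStreamName="events-to-redshift",  # placeholder name
            Records=records,
        )
```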
This is the optimal approach and provides the best balance between cost, operational overhead, and functionality. Allow me to explain how:
- You can configure the buffer interval of the Firehose delivery stream, i.e. how long it waits before issuing the COPY command to Redshift. The same goes for the size of the data in the stream buffer. Read more about it here.
- The ideal way to load data into Redshift is via the COPY command, and Kinesis Firehose does just that. You can also configure it to automatically create a backup of the data in S3.
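To make those buffering knobs concrete, here's roughly what creating such a delivery stream looks like with boto3. Every name, ARN, JDBC URL, and credential below is a placeholder:

```python
import boto3

firehose = boto3.client("firehose")

# Sketch: a delivery stream that stages data in S3, then issues COPY to Redshift.
firehose.create_delivery_stream(
    DeliveryStreamName="events-to-redshift",
    RedshiftDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-redshift-role",
        "ClusterJDBCURL": "jdbc:redshift://my-cluster.example.us-east-1"
                          ".redshift.amazonaws.com:5439/analytics",
        "CopyCommand": {
            "DataTableName": "events",
            "CopyOptions": "json 'auto' gzip",  # COPY options for JSON input
        },
        "Username": "firehose_user",
        "Password": "REPLACE_ME",
        # Firehose buffers in S3 and COPYs from there; the buffering hints
        # below control how often that happens.
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-redshift-role",
            "BucketARN": "arn:aws:s3:::my-events-staging",
            "BufferingHints": {
                "SizeInMBs": 64,           # flush once 64 MB accumulate...
                "IntervalInSeconds": 900,  # ...or every 15 minutes
            },
            "CompressionFormat": "GZIP",
        },
    },
)
```

One caveat for your hourly-batch idea: Firehose caps the buffer interval at 900 seconds, so 15-minute micro-batches are the coarsest schedule you can get this way.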
Remember that a reactive, push-based system is almost always more performant and less costly than a proactive, poll-based one. You save the compute capacity needed to run a cron process and to continuously scan for updates.
Upvotes: 2