Reputation: 4230
I came across the AWS Data Pipeline template for backing up a DynamoDB table to S3. However, I do not want to back up the whole table; I just want to keep a snapshot of the changes that happened in the last 7 days.
I think the way to approach this is to add a GSI on my table on a last_updated_date column and scan for records that changed. Is it possible to use AWS Data Pipeline to achieve this?
Upvotes: 2
Views: 602
Reputation: 410
What you are trying to do is very similar to the example provided for HiveCopyActivity. The example copies data between two DynamoDB tables. You would need to make a couple of changes:
Replace the output with an S3DataNode pointing to the bucket where you want the backups to be saved.
Change the filterSql to pull the last 7 days of data, something like:
"filterSql" : "last_updated_date > unix_timestamp(\"#{minusDays(@scheduledStartTime,7)}\", \"yyyy-MM-dd'T'HH:mm:ss\")"
Upvotes: 1
Reputation: 6413
Unless this is just a one-time task for you, I recommend using DynamoDB Streams with Kinesis or Lambda to back up changes to durable storage. DynamoDB Streams captures a time-ordered sequence of item-level modifications in a DynamoDB table and stores this information in a log for up to 24 hours. You can trigger a Lambda function from DynamoDB Streams and have it write changes to S3, achieving near-real-time, continuous backup.
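As an illustration, here is a minimal sketch of such a Lambda handler in Python. The bucket name and S3 key layout are placeholders I made up, and it assumes the table's stream is configured as the Lambda trigger with a view type that includes new images:

import json
import os
import boto3

s3 = boto3.client("s3")
# Placeholder bucket name; configure via the Lambda environment.
BUCKET = os.environ.get("BACKUP_BUCKET", "my-backup-bucket")

def handler(event, context):
    # Each invocation receives a batch of item-level changes from the stream.
    for record in event["Records"]:
        change = record["dynamodb"]
        # Use the event id so every change lands in its own S3 object.
        key = "dynamodb-changes/{}/{}.json".format(record["eventName"], record["eventID"])
        body = json.dumps({
            "eventName": record["eventName"],    # INSERT, MODIFY or REMOVE
            "keys": change.get("Keys"),
            "newImage": change.get("NewImage"),  # present if the stream view type includes new images
            "oldImage": change.get("OldImage"),
        })
        s3.put_object(Bucket=BUCKET, Key=key, Body=body)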
Using a GSI you can of course make lookups faster, but you will need a lot of provisioned throughput capacity on both the GSI and the table itself for a task that processes a large table.
You can find relevant AWS documentation about Streams below:
1. Capturing Table Activity with DynamoDB Streams
2. Using the DynamoDB Streams Kinesis Adapter to Process Stream Records
There's also a nice blog post about it with examples:
DynamoDB Update – Triggers (Streams + Lambda) + Cross-Region Replication App
Hope this helps!
Upvotes: 2