I want to do an incremental DynamoDB backup to S3 using DynamoDB Streams. I have a Lambda function that reads the DynamoDB stream and writes files to S3. To mark the shards that have already been read, I log ExclusiveStartShardId to a configuration file.
What I do is:
The problem is that I read only closed shards, so to get new records I must wait an undetermined amount of time for the current shard to be closed.
It seems that the last shard is usually in the OPEN state (it has no EndingSequenceNumber). If I remove the check for EndingSequenceNumber from the pseudo code above, I end up in an infinite loop, because when I reach the last shard NextShardIterator is always present. I also cannot stop when the number of fetched items is 0, because there could be "gaps" in the shard.
In this tutorial, numChanges is used to stop the infinite loop: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.LowLevel.Walkthrough.html#Streams.LowLevel.Walkthrough.Step5
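To make the loop problem concrete, here is a minimal sketch of one way to bound reading from an open shard: stop after several consecutive empty reads instead of the first one (so a "gap" does not end the loop early), and remember the last SequenceNumber so the next run can resume with an AFTER_SEQUENCE_NUMBER iterator. The function name `drain_shard` and the `max_empty_polls` threshold are hypothetical names chosen for illustration, and the threshold is a heuristic, not a guarantee that the shard has no more records. `client` is assumed to behave like a boto3 `dynamodbstreams` client.

```python
def drain_shard(client, stream_arn, shard_id, last_seq=None,
                max_empty_polls=3, limit=100):
    """Read one shard until it is closed or looks idle.

    Hypothetical helper. Returns (records, last_seq, closed), where
    last_seq is the checkpoint to store for the next run.
    """
    if last_seq is None:
        # First visit to this shard: start from the oldest available record.
        it_args = {"ShardIteratorType": "TRIM_HORIZON"}
    else:
        # Resume just after the last record processed by a previous run.
        it_args = {"ShardIteratorType": "AFTER_SEQUENCE_NUMBER",
                   "SequenceNumber": last_seq}
    iterator = client.get_shard_iterator(
        StreamArn=stream_arn, ShardId=shard_id, **it_args
    )["ShardIterator"]

    records = []
    empty_polls = 0
    while iterator is not None and empty_polls < max_empty_polls:
        resp = client.get_records(ShardIterator=iterator, Limit=limit)
        batch = resp.get("Records", [])
        if batch:
            records.extend(batch)
            last_seq = batch[-1]["dynamodb"]["SequenceNumber"]
            empty_polls = 0  # a gap ended, so reset the idle counter
        else:
            empty_polls += 1
        # A closed, fully read shard returns no NextShardIterator.
        iterator = resp.get("NextShardIterator")

    closed = iterator is None
    return records, last_seq, closed
```

With boto3 this could be called as `drain_shard(boto3.client("dynamodbstreams"), stream_arn, shard_id, last_seq)`, storing the returned `last_seq` next to the shard ID in the configuration file. If the run stops too early on an open shard, the records it missed are picked up on the next invocation, provided that happens before the stream's retention period expires.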
What is the best approach in this situation?
I also found a similar question, "Reading data from dynamodb streams". Unfortunately, I could not find the answer to my question there.