Reputation: 8084
I have a DynamoDB table with a stream enabled, and that stream is linked to one Lambda function which processes it.
Question 1 - With the above setup, if an event comes to the stream and is ingested by Lambda, does that event still reside in the stream, or is it popped out as soon as Lambda ingests it, just like a queue?
Question 2 - Can someone kindly explain the inner workings of DDB Streams and how it passes the data to Lambda? For example, are there any states for the stream events?
P.S.: The AWS documentation says that events stay in the stream for a 24-hour window.
Upvotes: 0
Views: 681
Reputation: 173
There are two concepts to understand here:
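Whenever there is a change in the table, such as an addition, update or deletion, the DynamoDB Streams feature (which follows the same shard-based model as Kinesis Data Streams) stores that change for a period of 24 hours. What gets stored for each change depends on which of the 4 stream view types you choose: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE or NEW_AND_OLD_IMAGES.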
To associate a Lambda function with your stream, a feature called Triggers is used. A change written to the stream fires the trigger, which in turn invokes the Lambda function associated with it.
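To make the trigger side concrete, here is a minimal sketch of a handler receiving stream records. The field names (`Records`, `eventName`, `dynamodb`, `Keys`, `NewImage`, `OldImage`) follow the DynamoDB stream event shape; the `sample_event` values and the summary format are made up for illustration.

```python
def handler(event, context):
    """Summarize each change record delivered by a DynamoDB stream trigger."""
    summaries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summaries.append({
            "event": record["eventName"],        # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],              # always present
            "new_image": change.get("NewImage"), # depends on the stream view type
            "old_image": change.get("OldImage"), # depends on the stream view type
        })
    return summaries


# Hypothetical event with two records, shaped like what Lambda passes in.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"id": {"S": "42"}},
                "NewImage": {"id": {"S": "42"}, "status": {"S": "new"}},
                "SequenceNumber": "111",
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"id": {"S": "7"}}, "SequenceNumber": "222"},
        },
    ]
}

if __name__ == "__main__":
    for summary in handler(sample_event, None):
        print(summary)
```

Note that the function gets a batch (a list of records), not a single event, so it should always loop over `Records`.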
Part 1 of your question:-
No, the record is not popped. Reading a record does not remove it from the stream; it stays there until the 24-hour retention window expires, and Lambda simply keeps track of how far it has read in each shard.
By default, Lambda invokes your function as soon as records are available in the stream. If the batch it reads from the stream has only one record in it, Lambda sends only one record to the function. To avoid invoking the function with a small number of records, you can tell the event source to buffer records for up to 5 minutes by configuring a batch window. Before invoking the function, Lambda continues to read records from the stream until it has gathered a full batch, or until the batch window expires.
If the Lambda function fails, Lambda retries the same batch indefinitely (or until the records expire), which keeps later records in that shard from being processed. To avoid stalled shards, you can configure the event source mapping to retry with a smaller batch size, limit the number of retries, or discard records that are too old (you can set the maximum age of a record that Lambda will read).
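As a rough illustration of those settings, the sketch below builds the arguments for boto3's `create_event_source_mapping` call. The parameter names are the ones that API accepts; the helper name `build_mapping_params`, the placeholder ARN, the function name and every numeric value are assumptions picked for the example, not recommendations.

```python
def build_mapping_params(stream_arn, function_name):
    """Return keyword arguments for the Lambda create_event_source_mapping call."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        "BatchSize": 100,                      # records per invocation, at most
        "MaximumBatchingWindowInSeconds": 60,  # batch window (up to 300 = 5 minutes)
        "MaximumRetryAttempts": 3,             # limit the number of retries
        "BisectBatchOnFunctionError": True,    # retry with a smaller batch on failure
        "MaximumRecordAgeInSeconds": 3600,     # discard records older than this
    }


if __name__ == "__main__":
    params = build_mapping_params(
        "arn:aws:dynamodb:REGION:ACCOUNT_ID:table/MyTable/stream/LABEL",  # placeholder
        "my-stream-processor",                                            # placeholder
    )
    for name, value in params.items():
        print(f"{name} = {value}")
```

With real values and credentials in place, these would be passed as `boto3.client("lambda").create_event_source_mapping(**params)`.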
Part 2 of your question:-
DynamoDB Streams is its own service, but it is modeled closely on Kinesis Data Streams, so the same ideas apply. A stream sits between producers and consumers; here the producer is DynamoDB and the consumer is Lambda. In Kinesis, a consumer registered for enhanced fan-out gets dedicated read throughput, so it doesn't have to compete with other consumers of the same data, and Kinesis pushes records to it over an HTTP/2 connection, which can also reduce latency between adding a record and function invocation. As far as I know, DynamoDB Streams does not offer this: Lambda polls each shard for new records (the documentation gives a base rate of four times per second) and invokes your function with the batch it reads. The capacity of a stream is determined by the number of shards it contains. Shards are the units of capacity in the stream, so the more shards there are, the higher the capacity.
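To picture why a record is not "popped" when Lambda reads it, here is a toy in-memory model of one shard and one consumer. This is only a sketch of the idea, not how AWS implements it: `ToyShard`, `ToyConsumer` and their methods are invented names, and the real service manages shards, sequence numbers and checkpoints for you.

```python
from dataclasses import dataclass, field

RETENTION_SECONDS = 24 * 60 * 60  # the 24-hour window mentioned above


@dataclass
class ToyShard:
    records: list = field(default_factory=list)  # (sequence_number, timestamp, data)
    next_seq: int = 0

    def put(self, data, now):
        self.records.append((self.next_seq, now, data))
        self.next_seq += 1

    def trim(self, now):
        # Records leave the shard only when they age out, never because of a read.
        self.records = [r for r in self.records if now - r[1] < RETENTION_SECONDS]

    def read(self, after_seq, limit):
        return [r for r in self.records if r[0] > after_seq][:limit]


class ToyConsumer:
    def __init__(self, shard, batch_size):
        self.shard = shard
        self.batch_size = batch_size
        self.checkpoint = -1  # sequence number of the last processed record

    def poll(self, process):
        batch = self.shard.read(self.checkpoint, self.batch_size)
        if not batch:
            return 0
        try:
            process([data for _, _, data in batch])
        except Exception:
            return 0  # checkpoint unchanged, so the same batch is retried next poll
        self.checkpoint = batch[-1][0]
        return len(batch)


if __name__ == "__main__":
    shard = ToyShard()
    for i in range(3):
        shard.put(f"change-{i}", now=0)
    consumer = ToyConsumer(shard, batch_size=2)
    consumer.poll(print)
    print("records still in shard:", len(shard.records))
```

The only "state" in this model is the consumer's checkpoint: a successful batch moves it forward, a failed batch leaves it where it was, and the records themselves stay in the shard either way until they are trimmed.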
I believe Part 1 of this answer already covers how the stream hands data to Lambda. Feel free to ask follow-up questions.
Upvotes: 2