Unbreakable

Reputation: 8084

What happens to the events in a DynamoDB Stream once they are received by AWS Lambda?

I have a DynamoDB table linked to a stream, and that stream is linked to a Lambda function which processes it.

Question 1 - With the above setup, if an event arrives on the stream and is ingested by Lambda, does that event still reside in the stream, or does it get popped off as soon as Lambda ingests it, like a queue?

Question 2 - Can someone kindly explain the inner workings of a DDB Stream and how it passes data to Lambda? For example, are there any states for the stream events?

P.S.: AWS documentation says that events stay in the stream for a 24-hour window.

Upvotes: 0

Views: 681

Answers (1)

Phani Teja

Reputation: 173

There are two concepts to understand here:

  1. Streams
  2. Triggers

Whenever there is a change in the table, such as an insert, update, or delete, DynamoDB Streams (which exposes a Kinesis-like, shard-based API) stores a record of that change for 24 hours. What each stream record contains depends on the stream's view type, of which there are four (a configuration sketch follows the list):

  • Keys only: only the key attributes of the modified item are written to the stream
  • New image: the entire item as it appears after the change
  • Old image: the entire item as it appeared before the change
  • New and old images: both the new and the old versions of the item
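
As a hedged illustration, here is roughly how you would enable a stream and choose a view type with boto3 (the table name is a placeholder):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable a stream on an existing table and choose what each record carries.
# StreamViewType is one of: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES.
response = dynamodb.update_table(
    TableName="my-table",  # placeholder table name
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)

# The stream ARN is what you later attach a Lambda trigger to.
print(response["TableDescription"]["LatestStreamArn"])
```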

To associate a Lambda function with your stream, a feature called a trigger is used. A change to the table invokes the trigger, which in turn invokes the Lambda function associated with the stream.
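
Under the hood, a trigger is just an event source mapping that tells Lambda to poll the stream. A minimal sketch, assuming the function already exists (the function name and stream ARN below are placeholders):

```python
import boto3

lambda_client = boto3.client("lambda")

# Wire the stream to a Lambda function. Lambda polls the stream's shards
# and invokes the function with batches of records.
mapping = lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/my-table/stream/2020-01-01T00:00:00.000",  # placeholder
    FunctionName="my-stream-processor",  # placeholder
    StartingPosition="LATEST",  # or TRIM_HORIZON to start from the oldest record
    BatchSize=100,
)
print(mapping["UUID"])
```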

Part 1 of your question:

First, the direct answer: a record is not popped off the stream when Lambda reads it. It stays in the stream until the 24-hour retention window expires; Lambda just tracks its position in each shard (a checkpoint) and moves that position forward as batches are processed successfully.

By default, Lambda invokes your function as soon as records are available in the stream. If the batch it reads from the stream has only one record in it, Lambda sends only that one record to the function. To avoid invoking the function with a small number of records, you can tell the event source to buffer records for up to 5 minutes by configuring a batch window. Before invoking the function, Lambda keeps reading records from the stream until it has gathered a full batch or the batch window expires.

If the function fails, Lambda retries that batch until it succeeds or the records expire, blocking the other records in that shard from being processed in the meantime. To avoid stalled shards (more on shards below), you can configure the event source mapping to retry with a smaller batch size, limit the number of retries, or discard records that are too old (you can set the maximum age of a record that Lambda will read).
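
The batching and failure-handling knobs mentioned above are all settings on the event source mapping. A sketch of tuning them, assuming you have the mapping's UUID (the value below is a placeholder):

```python
import boto3

lambda_client = boto3.client("lambda")

# Tune how Lambda batches records and handles failures for a stream mapping.
lambda_client.update_event_source_mapping(
    UUID="11111111-2222-3333-4444-555555555555",  # placeholder mapping ID
    MaximumBatchingWindowInSeconds=300,  # buffer records for up to 5 minutes
    MaximumRetryAttempts=3,              # stop retrying a failed batch after 3 tries
    MaximumRecordAgeInSeconds=3600,      # discard records older than 1 hour
    BisectBatchOnFunctionError=True,     # split a failing batch to isolate bad records
)
```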

Part 2 of your question:

DynamoDB Streams uses the same shard-based model as Kinesis Data Streams, a design meant for multiple producers and consumers. Here the producer is DynamoDB and the consumer is Lambda: Lambda polls the shards of the stream for new records and invokes your function with what it finds. (With Kinesis Data Streams proper, a consumer can also get dedicated read throughput, so it doesn't compete with other consumers of the same data, and have records pushed over an HTTP/2 connection to reduce latency; DynamoDB Streams itself is poll-based.) The capacity of a stream is determined by the number of shards it contains. Shards are small units of capacity in the stream, so the more shards, the higher the capacity.
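
To make the hand-off concrete, here is a sketch of a handler reading the documented stream event shape; the processing logic itself is just illustrative:

```python
def handler(event, context):
    # Lambda delivers a batch of stream records; each record describes one
    # table change and carries a sequence number ordering it within its shard.
    for record in event["Records"]:
        change_type = record["eventName"]  # INSERT, MODIFY, or REMOVE
        seq = record["dynamodb"]["SequenceNumber"]
        keys = record["dynamodb"]["Keys"]
        # NewImage/OldImage are present only for the matching view types.
        new_image = record["dynamodb"].get("NewImage")
        print(change_type, seq, keys, new_image)
```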

I believe Part 1 of this answer covers the inner workings. Feel free to ask follow-up questions.

Upvotes: 2
