KennethJ

Reputation: 974

Increased Kinesis latency resulting in low gets and high delays via Lambda

We're using Kinesis as a buffer for Lambda, which then inserts into Redshift. The Lambda function creates a file in S3 and runs a COPY in Redshift to insert the data. We're seeing very high delays in data coming out of Kinesis, and we're worried that this is causing data older than 24 hours to be dropped. We currently have 3 shards running and are nowhere near our maximum throughput.

Over the same period we've also seen an increase in the amount of data going into Kinesis. However, as we are only using about a third of our write throughput, we shouldn't be throttled. There are no fluctuations in any of the Lambda or Redshift metrics.

The attached files show the stats from our Kinesis stream. What could be causing this to happen, and how would I go about fixing it?

Kinesis get requests

Kinesis get latency

Upvotes: 3

Views: 2522

Answers (1)

Ryan Gross

Reputation: 6515

Most likely your Lambda function is not keeping up with the rate of data coming into Kinesis. With a Kinesis event source, Lambda runs only one (single-core) function invocation per shard at a time, so with 3 shards you are only getting 3 concurrent functions.

You can check whether the function is falling behind by looking at the GetRecords.IteratorAgeMilliseconds metric on the Kinesis stream. This, together with the average execution duration of your Lambda function and the Lambda event source batch size, should give you a good idea of how much data your function is actually processing per second. Assuming each invocation receives a full batch: (event source batch size) * (average size of each record) / (average duration of a Lambda invocation) * (number of shards) = total bytes/second processed. You can use this to determine how many Kinesis shards you need to keep up with the load.
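As a rough illustration, the throughput estimate can be written out in a few lines of Python. This is only a sketch: the function names and the sample numbers (batch size, record size, duration, incoming rate) are made up for the example and are not taken from the question's metrics.

```python
import math


def bytes_per_second(batch_size, avg_record_bytes, avg_duration_s, shards):
    """Estimated processing rate: one invocation per shard, full batches assumed."""
    return batch_size * avg_record_bytes / avg_duration_s * shards


def shards_needed(incoming_bytes_per_s, batch_size, avg_record_bytes, avg_duration_s):
    """Smallest shard count whose estimated processing rate covers the incoming rate."""
    per_shard = bytes_per_second(batch_size, avg_record_bytes, avg_duration_s, 1)
    return math.ceil(incoming_bytes_per_s / per_shard)


# Hypothetical numbers: 100 records per batch, 1 KiB per record,
# 2 seconds per invocation, 3 shards, 400,000 bytes/second arriving.
current_rate = bytes_per_second(100, 1024, 2.0, 3)
required = shards_needed(400_000, 100, 1024, 2.0)
print(f"processing about {current_rate:.0f} bytes/s, need {required} shards")
```

If the estimated rate is below the rate at which data arrives, the iterator age will keep growing until you add shards, shorten the invocation, or raise the batch size.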

Also, you may want to look into a "fan-out" setup, in which one Lambda function reads events off the stream and then directly invokes another Lambda function with those events. This gets you away from Lambda's shard affinity.
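A minimal sketch of such a dispatcher, assuming Python and boto3, might look like the following. The worker function name and the chunk size are hypothetical placeholders, and the worker is assumed to accept a payload shaped like `{"Records": [...]}`. The worker is invoked asynchronously (`InvocationType="Event"`), so the dispatcher does not wait for the S3 write and the Redshift COPY to finish.

```python
import json

WORKER_FUNCTION_NAME = "redshift-loader-worker"  # hypothetical worker function name
CHUNK_SIZE = 100  # hypothetical; keep each payload under the async invocation size limit


def chunk(records, size):
    """Split a list of records into consecutive slices of at most `size` items."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def fan_out(records, invoke, chunk_size=CHUNK_SIZE):
    """Hand each chunk to `invoke` as a JSON payload; return the number of invocations."""
    count = 0
    for batch in chunk(records, chunk_size):
        invoke(json.dumps({"Records": batch}).encode("utf-8"))
        count += 1
    return count


def handler(event, context):
    """Dispatcher entry point attached to the Kinesis event source."""
    import boto3  # imported here so the helpers above can be exercised without AWS

    client = boto3.client("lambda")

    def invoke(payload):
        # InvocationType="Event" queues the call and returns immediately.
        client.invoke(
            FunctionName=WORKER_FUNCTION_NAME,
            InvocationType="Event",
            Payload=payload,
        )

    return fan_out(event["Records"], invoke)
```

Passing `invoke` in as a parameter keeps the chunking logic separate from the AWS call. Keep in mind that once the dispatcher returns, the stream position advances regardless of whether the workers succeed, so the workers need their own error handling.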

Upvotes: 3
