AJ Venturella

Reputation: 4912

Kinesis Analytics Destination Guidance: Lambda vs Kinesis Stream to Lambda

After Kinesis Analytics does its job, the next step is to send that information off to a destination. AWS currently offers 3 destination choices:

• Kinesis stream
• Kinesis Firehose delivery stream
• AWS Lambda function

For my use case, a Kinesis Firehose delivery stream is not what I want, so I am left with:

• Kinesis stream
• AWS Lambda function

If I set the destination to a Kinesis Stream, I would then attach a Lambda to that stream to process the records.

AWS also offers the ability to set the destination to a Lambda, bypassing the Kinesis Stream step of this process. In doing some digging for docs I found this:

Using a Lambda Function as Output

Specifically in those docs under Lambda Output Invocation Frequency it says:

If records are emitted to the destination in-application stream within the data analytics application as a continuous query or a sliding window, the AWS Lambda destination function is invoked approximately once per second.

My Kinesis Analytics output qualifies under this scenario, so I can assume that my Lambda will be invoked "approximately once per second".
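For reference, here is a minimal sketch of what a Lambda attached directly as the Kinesis Analytics destination might look like. It assumes the application emits JSON output and that each invocation delivers a batch under `records`, with the function acknowledging every `recordId` as `Ok` or `DeliveryFailed`. The `process` helper and the handler name are hypothetical placeholders, not anything from the docs quoted above.

```python
import base64
import json


def process(payload):
    """Hypothetical placeholder for the real per-record work."""
    return payload


def analytics_output_handler(event, context):
    """Acknowledge each record delivered by a Kinesis Analytics Lambda output."""
    results = []
    for record in event["records"]:
        try:
            # Record data arrives base64-encoded; JSON output is an assumption here.
            payload = json.loads(base64.b64decode(record["data"]))
            process(payload)
            result = "Ok"
        except Exception:
            # Report the record as failed so it is not counted as delivered.
            result = "DeliveryFailed"
        results.append({"recordId": record["recordId"], "result": result})
    return {"records": results}
```

Note that in this setup the function has to report a status per record, which is different from a stream-triggered Lambda.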

I'm trying to understand the difference between using these 2 destinations as it pertains to using a Lambda.

Using AWS Lambda with Kinesis states that:

You can subscribe Lambda functions to automatically read batches of records off your Kinesis stream and process them if records are detected on the stream. AWS Lambda then polls the stream periodically (once per second) for new records.

So it sounds like the invocation interval is the same in either case: approximately 1 second.
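By comparison, a Lambda subscribed to a Kinesis stream receives its batch under `Records`, with the payload nested in `kinesis.data`. The sketch below assumes the records are JSON; the handler name is hypothetical.

```python
import base64
import json


def stream_handler(event, context):
    """Decode every record in a batch read from a Kinesis stream."""
    payloads = []
    for record in event["Records"]:
        # Kinesis record data is base64-encoded; JSON content is an assumption.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads
```

So the invocation cadence may be similar, but the event shape and the return contract differ between the two destinations.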

So I think the guidance is:

If the next stage in the pipeline only needs one consumer, then use the AWS Lambda function destination. If, however, you need multiple different consumers to do different things with the same data sent to the destination, then a Kinesis Stream is more appropriate.

Is this a correct assumption on how to choose a destination? Again, for my use case I am excluding the Kinesis Firehose delivery stream.

Upvotes: 3

Views: 1016

Answers (1)

Costin

Reputation: 3029

If the next stage in the pipeline only needs one consumer, then use the AWS Lambda function destination. If, however, you need multiple different consumers to do different things with the same data sent to the destination, then a Kinesis Stream is more appropriate.

• I would always use Kinesis Stream with one shard and batch size = 1 (for example) if I wanted the items to be consumed one by one with no concurrency.

If there are multiple consumers, increase the number of shards; one Lambda is launched in parallel for each shard when there are items to process. If it makes sense, also increase the batch size.

But read the highlighted phrase below again:
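As a rough sketch of the one-shard, batch-size-1 setup, the event source mapping parameters could be built like this. The helper name and the ARN and function name in the usage comment are hypothetical; `LATEST` as the starting position is an assumption.

```python
def single_consumer_mapping(stream_arn, function_name, batch_size=1):
    """Build event source mapping parameters for one-by-one consumption."""
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "BatchSize": batch_size,       # 1 means one record per invocation
        "StartingPosition": "LATEST",  # assumption: only new records matter
    }


# With AWS credentials configured, the mapping could then be created with boto3:
#   import boto3
#   params = single_consumer_mapping(
#       "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",  # hypothetical
#       "example-function",                                              # hypothetical
#   )
#   boto3.client("lambda").create_event_source_mapping(**params)
```

Raising `batch_size` later only requires changing that one argument.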

If, however, you need multiple different consumers to do different things with the same data sent to the destination, then a Kinesis Stream is more appropriate.

If you have one or more producers and many consumers of exactly the same item, I guess you need to use SNS. The producer writes the item to one topic, and then all the Lambdas subscribed to that topic will process that item.
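For the SNS variant, each subscribed Lambda would get its own copy of the message. A minimal sketch of one such subscriber, assuming the producer publishes JSON strings (the handler name is hypothetical):

```python
import json


def sns_handler(event, context):
    """Parse each SNS message delivered to this subscriber."""
    items = []
    for record in event["Records"]:
        # The published body is a string under Sns.Message; JSON is an assumption.
        items.append(json.loads(record["Sns"]["Message"]))
    return items
```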

If this does not answer your question, please clarify it. There is a little ambiguity.

Upvotes: 1
