Reputation: 2352
How come I find so little examples of the KCL being used with AWS Lambda. https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-kcl.html
It does provide a fine implementation for keeping track of your position on the stream (checkpointing).
I want to use the KCL as a consumer. My set-up is a stream with multiple shards. On each shard a Lambda is consuming. I want to use the KCL in the Lambda's to track the position of the iterator on the shard.
Why can't I find anyone who use the KCL with Lambda. What is the issue here?
Upvotes: 8
Views: 4502
Reputation: 417
Since you can directly consume from Kinesis in your lambdas (using Kinesis as event source) it doesn't make any sense to use KCL within lambda. The event source framework that AWS has built must be using something like KCL to bring lambdas up in response to kinesis events.
It would be super weird to bring up a lambda, initialize KCL in the handler and wait for events during the lambda runtime. Lambda will go down in 5 mins and you'll again do the same thing. Doing this from an EC2 instance makes sense but then you're reimplementing the Lambda - Kinesis integration by yourself. That is what Lambda is, behind the scene.
Upvotes: 10
Reputation: 799
I do not work for AWS, so obviously I do not know the exact reason why there is no documentation, but here are my thoughts.
First of all, to run the KCL, you need to have the JVM running. This means you can only do this in a lambda using Java because (to my knowledge at this point) there is no way to pull in other sdk, runtimes, etc into a lambda. You chose one runtime at setup. So already they would only be creating documentation for just java lambdas.
Now for the more technical reason. You need to think about what a lambda is doing, and then what the KCL is doing.
Let's start with the Lambda. Lambdas are by design, ephemeral. They can (and will) spin up and go down consistently throughout the day. Of course, you could set up a warming scheme so the lambdas stay up, but they will still have the ephemeral nature to them and this is completely out of your control. In other words, AWS controls when and if a lambda stays active and the exact methods for this is not published. So you can only try to keep things warmed.
What does a KCL do?
After reading through this list, lets now go back to the ephemeral nature of lambdas. This means that every single time a lambda comes up or goes down, all of this work needs to happen. This includes a complete rebalance between the shards and workers, pulling data records from the streams, setting checkpoints, etc. You would also need to make sure that you don't ever have more lambdas spun up than the number of shards as they would be worthless (never used in the best case or registered as workers in the worst case potentially causing lost messages. Think what would happen in this scenario with a rebalance.)
OK, technically could you pull it off? If you used Java and you did everything in your power to keep your lambdas warm, it could technically be possible. But back to your question. Why is there no documentation? I never want to say 'never', but generally speaking, Lambdas, with their ephemeral nature, are just not the best use case for the KCL. And if you don't go deep into the weeds on how the KCL works, you'll probably miss something, causing rebalancing issues and potentially causing messages to get lost.
If there is anything inaccurate here please let me know so I can update. Thanks and I hope this helps somebody.
Upvotes: 7