Reputation: 258
I am continuously publishing data into dynamoDB which has stream enabled. I am reading this stream using DynamoDB apadter of KCL.
I am using 1 KCL worker with 5 leases. At the time of creation my Dynamo table had 1 partition(1 RCU and 999WCU). When I keep publishing the data into dynamo, the no of partitions will grow and the no of active shards too. Reading is fine till the no of active shards are 5. As soon as it crosses 5, KCL is not able to read from one of the shards(tps is getting dropped).
Is there any config/parameter that I can set that will allow me to read from growing shards using fixed no of leases ?
Upvotes: 0
Views: 622
Reputation: 16215
You're looking for the maxLeasesPerWorker property.
From the javadoc:
Worker will not acquire more than the specified max number of leases even if there are more shards that need to be processed. This can be used in scenarios where a worker is resource constrained or to prevent lease thrashing when small number of workers pick up all leases for small amount of time during deployment.
Make sure to take note of the warning in the javadoc as well:
Note that setting a low value may cause data loss (e.g. if there aren't enough Workers to make progress on all shards). When setting the value for this property, one must ensure enough workers are present to process shards and should consider future resharding, child shards that may be blocked on parent shards, some workers becoming unhealthy, etc.
Upvotes: 1