Reputation: 85
I am trying to use the Kinesis Client Library to consume data from a Kinesis data stream, and I am trying to familiarize myself with the concepts of Kinesis. The documentation nowhere explains the lease concept, but jumps directly into using leases.
Can anyone explain in simple terms what exactly a lease is in Kinesis?
Upvotes: 4
Views: 6091
Reputation: 113
At a high level, a DynamoDB table is used to keep track of your Kinesis application's stream state.
The `leaseKey` identifies the Kinesis shard (for a single-stream application it is the shard ID itself), and it is used as the hash key of the DynamoDB table.
In other words, while your application is processing a stream, there is one row per shard in a corresponding DynamoDB table. Each row holds the current processing state of that shard, and this is known as the lease information.
You can see the full table schema, and what each lease column means, here:
https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html
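To make the "one row per shard" idea concrete, here is a minimal Python sketch that models the lease table as a plain dictionary. It is only an illustration, not the KCL's code: `new_lease` and `build_lease_table` are hypothetical helper names, and only a few of the documented columns are shown.

```python
def new_lease(shard_id):
    """Build one illustrative lease row for a shard (hypothetical helper)."""
    return {
        "leaseKey": shard_id,          # identifies the shard this row tracks
        "leaseOwner": None,            # no worker has taken the lease yet
        "leaseCounter": 0,             # incremented whenever the lease is updated
        "checkpoint": "TRIM_HORIZON",  # nothing processed yet: start at the oldest record
    }


def build_lease_table(shard_ids):
    """Model the lease table as a dict keyed by leaseKey, one row per shard."""
    return {shard_id: new_lease(shard_id) for shard_id in shard_ids}


table = build_lease_table(["shardId-000000000000", "shardId-000000000001"])
for key, row in table.items():
    print(key, row["leaseOwner"], row["checkpoint"])
```

A stream with two shards yields two rows; as workers pick up shards, `leaseOwner` and `checkpoint` in the matching row change.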
Upvotes: 3
Reputation: 539
In computer science, a Lease is a contract that gives its holder specified rights to some resource for a limited period. Because it is time-limited, a lease is an alternative to a lock for resource serialization.
— from Lease (computer science) - Wikipedia
This concept (a lease is a temporary arrangement by default) contrasts with the traditional lock mechanism, where a lock is held indefinitely until it is explicitly released.
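The difference from a lock can be sketched in a few lines of Python. This is a generic illustration of a time-limited lease, not how the KCL implements it; the `Lease` class, its method names, and the 10-second duration are assumptions made up for the example. Time is passed in explicitly so the behaviour is easy to follow.

```python
class Lease:
    """A time-limited claim on a resource (illustrative only)."""

    def __init__(self, holder, duration, now):
        self.holder = holder
        self.duration = duration
        self.expires_at = now + duration

    def is_valid(self, now):
        # Unlike a lock, the claim lapses by itself once the time is up.
        return now < self.expires_at

    def renew(self, now):
        # The holder must renew before expiry to keep the resource.
        if not self.is_valid(now):
            raise RuntimeError("lease already expired")
        self.expires_at = now + self.duration


lease = Lease(holder="worker-a", duration=10.0, now=0.0)
print(lease.is_valid(5.0))   # True: still inside the first 10 seconds
lease.renew(5.0)             # now expires at t=15
print(lease.is_valid(12.0))  # True: renewed in time
print(lease.is_valid(20.0))  # False: the holder stopped renewing
```

If the holder crashes, nobody has to clean up after it: the lease simply expires and another worker can take over the resource.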
In this case, the lease holder is a worker of a consumer application, and the resource is a shard within the data stream. The current snapshot of each lease (the Kinesis application's stream state mentioned above) is stored in the DynamoDB lease table, which contains the following fields:
FIELD | COMMENT |
---|---|
checkpoint | The most recent checkpoint sequence number for the shard. This value is unique across all shards in the data stream. |
checkpointSubSequenceNumber | When using the Kinesis Producer Library's aggregation feature, this is an extension to checkpoint that tracks individual user records within the Kinesis record. |
leaseCounter | Used for lease versioning so that workers can detect that their lease has been taken by another worker. |
leaseKey | A unique identifier for a lease. Each lease is particular to a shard in the data stream and is held by one worker at a time. |
leaseOwner | The worker that is holding this lease. |
ownerSwitchesSinceCheckpoint | How many times this lease has changed workers since the last time a checkpoint was written. |
hashrange | Used by the PeriodicShardSyncManager to run periodic syncs to find missing shards in the lease table and create leases for them if required. |
childshards | Used by the LeaseCleanupManager to review the child shard's processing status and decide whether the parent shard can be deleted from the lease table. |
shardID | The ID of the shard. |
streamname | The identifier of the data stream in the following format: account-id:StreamName:streamCreationTimestamp |
— from What Is a Lease Table
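The leaseCounter row is the least obvious one, so here is a rough Python sketch of the versioning idea behind it: a write only succeeds if the counter still has the value the writer last saw. Everything here is a simplified stand-in. The dictionary plays the role of the lease table, and `conditional_update` and `LeaseLostError` are hypothetical names, not KCL or DynamoDB APIs.

```python
class LeaseLostError(Exception):
    """Raised when the counter no longer matches what the worker last saw."""


def conditional_update(table, lease_key, expected_counter, new_owner):
    """Write the lease row only if nobody else has updated it in the meantime."""
    row = table[lease_key]
    if row["leaseCounter"] != expected_counter:
        raise LeaseLostError(f"{lease_key} was updated by another worker")
    row["leaseOwner"] = new_owner
    row["leaseCounter"] = expected_counter + 1
    return row["leaseCounter"]


table = {"shardId-000000000000": {"leaseOwner": "worker-a", "leaseCounter": 5}}

# worker-b takes the lease; the counter moves from 5 to 6.
conditional_update(table, "shardId-000000000000", 5, "worker-b")

# worker-a still believes the counter is 5, so its renewal is rejected.
try:
    conditional_update(table, "shardId-000000000000", 5, "worker-a")
except LeaseLostError as err:
    print("worker-a lost the lease:", err)
```

This is why a worker can tell that its lease was taken without talking to the other worker directly: its next write fails the counter check.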
It is clear that a shard can be leased by only one worker at a time, whereas one worker can (and should, when there is more than one shard) lease multiple shards (where I work, a 3:1 ratio is recommended).
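As a toy illustration of that one-to-many relationship, the Python sketch below spreads shards over workers round-robin. This is an assumption made for the example, not the KCL's actual balancing logic, and `assign_shards` is a hypothetical function.

```python
def assign_shards(shard_ids, worker_ids):
    """Give every shard exactly one worker; a worker may receive several shards."""
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(shard_ids):
        worker = worker_ids[index % len(worker_ids)]
        assignment[worker].append(shard)
    return assignment


shards = [f"shardId-{n:012d}" for n in range(6)]
workers = ["worker-a", "worker-b"]

for worker, owned in assign_shards(shards, workers).items():
    print(worker, len(owned), owned)
```

With six shards and two workers, each worker ends up holding three leases, which matches the 3:1 ratio mentioned above.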
Upvotes: 0