Reputation: 1304
According to the Kinesis documentation, a sequence number is supposed to be unique, yet we see the same value reused across multiple records. Our event producer is a Spring Boot application that uses the KPL internally, and the consumers are AWS Lambda functions. We resharded the stream a couple of times during the test. Below is a sample in which one sequence number appears more than once. How is that even possible?
"Records": [{
"kinesis": {
"kinesisSchemaVersion": "1.0",
"partitionKey": "00000000000000002",
"sequenceNumber": "49596124085897508159438713510240079964989152308217511954",
"data": "************************",
"approximateArrivalTimestamp": 1558991793.009
},
"eventSource": "aws:kinesis",
"eventVersion": "1.0",
"eventID": "shardId-000000000001:49596124085897508159438713510240079964989152308217511954",
"eventName": "aws:kinesis:record",
"invokeIdentityArn": "-----------------",
"awsRegion": "us-east-1",
"eventSourceARN": "-----------------"
}, {
"kinesis": {
"kinesisSchemaVersion": "1.0",
"partitionKey": "00000000000000003",
"sequenceNumber": "49596124085897508159438713510240079964989152308217511954",
"data": "************************",
"approximateArrivalTimestamp": 1558991793.009
},
"eventSource": "aws:kinesis",
"eventVersion": "1.0",
"eventID": "shardId-000000000001:49596124085897508159438713510240079964989152308217511954",
"eventName": "aws:kinesis:record",
"invokeIdentityArn": "-----------------",
"awsRegion": "us-east-1",
"eventSourceARN": "-----------------"
}, {
"kinesis": {
"kinesisSchemaVersion": "1.0",
"partitionKey": "00000000000000004",
"sequenceNumber": "49596124085897508159438713510240079964989152308217511954",
"data": "************************",
"approximateArrivalTimestamp": 1558991793.009
},
"eventSource": "aws:kinesis",
"eventVersion": "1.0",
"eventID": "shardId-000000000001:49596124085897508159438713510240079964989152308217511954",
"eventName": "aws:kinesis:record",
"invokeIdentityArn": "-----------------",
"awsRegion": "us-east-1",
"eventSourceARN": "-----------------"
}]
Upvotes: 1
Views: 1040
Reputation: 1304
When Kinesis stream writers use the KPL with user record aggregation (see Consumer De-aggregation), user records are batched together and delivered to regular Kinesis consumers as a single Kinesis record. In this case Kinesis record sequence numbers are unique, but the consumer has to implement de-aggregation.
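For the aggregated case, a consumer first has to recognise that a record is a KPL aggregate before de-aggregating it. The sketch below is only a detection check, not a de-aggregator. It assumes the published KPL aggregation format (a 4-byte magic prefix `0xF3899AC2`, a protobuf body, and a trailing 16-byte MD5 of that body) and that `data` arrives base64-encoded, as it does in Lambda events. The function name `looks_kpl_aggregated` is made up for illustration.

```python
import base64
import hashlib

# Assumed from the KPL aggregation format: magic prefix, protobuf body, MD5 trailer.
KPL_MAGIC = b"\xf3\x89\x9a\xc2"
MD5_LEN = 16


def looks_kpl_aggregated(data_b64: str) -> bool:
    """Return True if the base64 payload looks like a KPL aggregated record."""
    raw = base64.b64decode(data_b64)
    # Too short to hold magic + body + checksum, or wrong prefix: a plain record.
    if len(raw) <= len(KPL_MAGIC) + MD5_LEN or not raw.startswith(KPL_MAGIC):
        return False
    body = raw[len(KPL_MAGIC):-MD5_LEN]
    # The trailer must be the MD5 digest of the protobuf body.
    return hashlib.md5(body).digest() == raw[-MD5_LEN:]
```

If the check returns `True` for `record["kinesis"]["data"]`, the record still needs de-aggregation; if it returns `False`, the payload is most likely a single user record.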
However, when enhanced fan-out is enabled for the Lambdas, user records are delivered as individual Kinesis records (no de-aggregation is required), and the records that were aggregated together share the same sequence number.
So a Kinesis record sequence number is not always unique.
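If the consumer needs a unique identifier per user record (for example as an idempotency key), the sequence number alone is not enough. The sketch below is one possible workaround, not an official Kinesis mechanism: it appends an ordinal to `eventID` for each repeated occurrence within a batch. The function name `unique_record_keys` is hypothetical, and the approach assumes that records sharing a sequence number arrive in the same order on every delivery, which the event shown in the question does not guarantee.

```python
from collections import Counter


def unique_record_keys(event):
    """Return one key per record: eventID plus an ordinal for repeated eventIDs."""
    seen = Counter()
    keys = []
    for record in event["Records"]:
        event_id = record["eventID"]  # formatted as "<shardId>:<sequenceNumber>"
        ordinal = seen[event_id]      # 0 for the first occurrence, 1 for the next, ...
        seen[event_id] += 1
        keys.append(f"{event_id}:{ordinal}")
    return keys
```

For the three records in the question, which all carry the same `eventID`, this yields three distinct keys ending in `:0`, `:1`, and `:2`.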
Upvotes: 1