Reputation: 3931
I'm using the CachedSchemaRegistryClient and the register method, which takes a subject and an Avro schema. I'm running these against the 5.2.1 Confluent Docker images, and when I register schemas I get back behavior that I find strange.
The first schema I register returns an id of 81 (I confirmed through the Schema Registry REST API that this schema is tied to this id), and then the second schema returns an id of 121.
Since this behavior is unexpected and I have been unable to find an answer via Google, I'm curious whether a hashing strategy or something similar is used to assign schema ids. I would expect the ids to start at 1 and increment.
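For reference, the REST check mentioned above can be sketched roughly as follows. This is only a minimal illustration using the JDK's built-in HTTP client (Java 11+) against the registry's GET /schemas/ids/{id} endpoint. The class name SchemaIdLookup, its helper methods, and the base URL http://localhost:8081 are assumptions made for the example, not part of the original setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; the names here are illustrative only.
class SchemaIdLookup {

    // Builds the URI for GET /schemas/ids/{id} on the given registry base URL.
    static URI schemaByIdUri(String baseUrl, int id) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(trimmed + "/schemas/ids/" + id);
    }

    // Fetches the raw JSON response body for the schema stored under the given id.
    static String fetchSchemaById(String baseUrl, int id) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(schemaByIdUri(baseUrl, id))
                .header("Accept", "application/vnd.schemaregistry.v1+json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed local registry address; adjust to match the docker setup.
        System.out.println(fetchSchemaById("http://localhost:8081", 81));
    }
}
```

Running the same lookup for each id returned by register shows which schema the registry has stored under that id.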
Upvotes: 1
Views: 2422
Reputation: 39790
The Confluent documentation explains how unique IDs are assigned to schemas:
Schema Registry is a distributed storage layer for Avro Schemas which uses Kafka as its underlying storage mechanism. Some key design decisions:
- Assigns globally unique ID to each registered schema. Allocated IDs are guaranteed to be monotonically increasing but not necessarily consecutive.
- Kafka provides the durable backend, and functions as a write-ahead changelog for the state of Schema Registry and the schemas it contains.
- Schema Registry is designed to be distributed, with single-primary architecture, and ZooKeeper/Kafka coordinates primary election (based on the configuration).
Also,
Schema ID allocation always happens in the primary node and Schema IDs are always monotonically increasing.
If you are using Kafka primary election, the Schema ID is always based off the last ID that was written to Kafka store. During a primary re-election, batch allocation happens only after the new primary has caught up with all the records in the store <kafkastore.topic>.
If you are using ZooKeeper primary election, /<schema.registry.zk.namespace>/schema_id_counter path stores the upper bound on the current ID batch, and new batch allocation is triggered by both primary election and exhaustion of the current batch. This batch allocation helps guard against potential zombie-primary scenarios, (for example, if the previous primary had a GC pause that lasted longer than the ZooKeeper timeout, triggering primary reelection).
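To make the batch behavior concrete, here is a toy model of the ZooKeeper-style allocation described above. It is not Schema Registry's actual code. The class BatchIdAllocator, its method names, and the batch size of 20 are assumptions chosen for illustration. The model keeps a shared upper bound, hands out ids from the current batch, and claims a fresh batch on either exhaustion or a simulated primary election, abandoning any unused ids.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; names and batch size are hypothetical.
class BatchIdAllocator {
    private final int batchSize;
    private int sharedUpperBound; // stands in for the counter kept in ZooKeeper
    private int nextId;           // next id to hand out from the current batch
    private int batchEnd;         // last id (inclusive) of the current batch

    BatchIdAllocator(int batchSize) {
        this.batchSize = batchSize;
        this.sharedUpperBound = 0;
        claimNewBatch();
    }

    // Claims the range just above the shared upper bound and advances the bound.
    // Ids left over from the previous batch are never reused.
    private void claimNewBatch() {
        nextId = sharedUpperBound + 1;
        sharedUpperBound += batchSize;
        batchEnd = sharedUpperBound;
    }

    // Simulates a primary election (for example, after a restart).
    void electNewPrimary() {
        claimNewBatch();
    }

    // Returns the id for a newly registered schema.
    int register() {
        if (nextId > batchEnd) {
            claimNewBatch(); // current batch exhausted
        }
        return nextId++;
    }

    public static void main(String[] args) {
        BatchIdAllocator allocator = new BatchIdAllocator(20);
        List<Integer> ids = new ArrayList<>();
        ids.add(allocator.register());
        allocator.electNewPrimary();
        ids.add(allocator.register());
        allocator.electNewPrimary();
        ids.add(allocator.register());
        System.out.println(ids); // prints [1, 21, 41]
    }
}
```

In this model the ids only ever increase, but each election skips the rest of the current batch. Gaps like the jump from 81 to 121 in the question would be consistent with elections or restarts happening between registrations, although the exact numbers depend on the registry's real batch size and history.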
Upvotes: 3