How to check which partition is a key assign to in kafka?

Question

I am trying to debug a issue for which I am trying to prove that each distinct key only goes to 1 partition if the cluster is not rebalancing.

So I was wondering for a given topic, is there a way to determine which partition a key is send to?

OneCricketeer · Accepted Answer

As explained here or also in the source code

You need the byte[] keyBytes assuming it isn't null, then using org.apache.kafka.common.utils.Utils, you can run the following.

Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;

For strings or JSON, it's UTF8 encoded, and the Utils class has helper functions to get that.
For Avro, such as Confluent serialized values, it's a bit more complicated (a magic byte, then a schema ID, then the data). See Wire format

In Kafka Streams API, You should have a ProcessorContext available in your Processor#init , which you can store a reference to and then access in your Processor#process method, such as ctx.recordMetadata.get().partition() (recordMetadata returns an Optional)

only goes to 1 partition

This isn't a guarantee. Hashes can collide.

It makes more sense to say that a given key isn't in more than one partition.

if the cluster is not rebalancing

Rebalancing will still preserve a partition value.

How to check which partition is a key assign to in kafka?

Answers (2)

Related Questions