Reputation: 11
In Kafka producer, I am sending two different sets of data. I have two partitions for the topic. The first one is with a key and the second one is without a key. As far as I know the key is used to make partitions for the data. If the key is absent, null will be sent and the partition will be happening by round-robin scheduling.
But the question is if I am sending the data with and without key alternatively for some particular period of time, what will happen?
Will round robin scheduling happen for the partitions excluding the partition made by using key or will it happen for the all the two partitions?
Upvotes: 1
Views: 1026
Reputation: 1144
I want to correct you. You said that the key is used to make partitions for the data. The key with a message is basically sent to get the message ordering for a specific field.
Explain and example
Upvotes: 1
Reputation: 3832
Kafka select partition as per defined below rules
a. if the key is null then partition selected on round-robin.
b. If the key is non-null keys then It uses Murmur2 hash with modulo to identify partitions for the topic.
So message with key (null or not null) would get published on both partitions using Default Partitioner with no Custom Partitioner defined.
To achieve a message publish in a specific partition you can use the below method.
Pass partition explicitly while publishing a message
/** * Creates a record to be sent to a specified topic and partition */ public ProducerRecord(String topic, Integer partition, K key, V value) { this(topic, partition, null, key, value, null); }
You can create Custom Partitioner and implement logic to select the partition
https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/producer/Partitioner.html
Upvotes: 3
Reputation: 395
Kafka has a very organized scenario when it comes to sending and storing the records in the partitions. As you have mentioned, the Key is used for the purpose that the same key records go to the same partition. This helps in maintaining the chronology of those messages on that topic.
In your case, the two partitions will store the data as:
Upvotes: -1