Akash Jain

Reputation: 277

Why does co-partitioning of two KStreams in Kafka require the same number of partitions for both streams?

I want to know why co-partitioning two KStreams in Kafka requires both streams to have the same number of partitions, as stated in the documentation.

Upvotes: 7

Views: 4602

Answers (1)

Matthias J. Sax

Reputation: 62350

As the name "co-partition" indicates, you want records that come from different topics but share the same key to be processed by the same Kafka Streams application instance. If the topics don't have the same number of partitions, it's not possible to get this behavior.

Assume you have topic A with 2 partitions and topic B with 3 partitions. A record with key X might then be hashed to partition A-0 in one topic and B-1 in the other (i.e., not the same partition number), while a record with a different key Y might be hashed to A-0 but B-2.

Only if the number of partitions is the same for both topics do records with the same key end up in the same partition number in each topic (assuming both topics are written with the same partitioning strategy), and this allows A-0/B-0, A-1/B-1, etc. to be processed together.
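
To make the argument concrete, here is a minimal sketch of "hash the key, then take it modulo the partition count". The hash below (a byte sum) is a toy stand-in chosen so the numbers are easy to check by hand; it is not the hash Kafka's default partitioner uses, and the names `partition_for` and `placements` are made up for this illustration.

```python
def partition_for(key: str, num_partitions: int) -> int:
    """Toy partitioner: hash(key) modulo the partition count.

    Assumption: the byte sum stands in for the real key hash. Only the
    'same hash, then modulo' structure matters for the argument.
    """
    toy_hash = sum(key.encode("utf-8"))
    return toy_hash % num_partitions


def placements(keys, partitions_a, partitions_b):
    """Map each key to its (partition in topic A, partition in topic B)."""
    return {
        key: (partition_for(key, partitions_a), partition_for(key, partitions_b))
        for key in keys
    }


keys = ["X", "Y", "V"]

# Topic A has 2 partitions, topic B has 3: partition numbers diverge.
mismatched = placements(keys, 2, 3)
# Both topics have 3 partitions: every key gets the same number in A and B.
matched = placements(keys, 3, 3)

for label, result in (("A=2, B=3", mismatched), ("A=3, B=3", matched)):
    for key, (part_a, part_b) in result.items():
        print(f"{label}: key {key} -> A-{part_a} / B-{part_b}")
```

With 2 and 3 partitions, keys X and V both land in A-0 but in B-1 and B-2 respectively, so no single pairing of one A partition with one B partition covers both keys. With 3 and 3, each key maps to the same partition number in both topics.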

Upvotes: 12
