Reputation: 3294
According to the Kafka Connect documentation:
https://docs.confluent.io/5.3.3/connect/userguide.html#distributed-mode
config.storage.topic=connect-configs
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-configs --replication-factor 3 --partitions 1 --config cleanup.policy=compact
offset.storage.topic=connect-offsets
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-offsets --replication-factor 3 --partitions 50 --config cleanup.policy=compact
status.storage.topic=connect-status
bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-status --replication-factor 3 --partitions 10 --config cleanup.policy=compact
I understand why connect-configs has only one partition: it has to be a single partition. But I don't understand, and cannot find any information on, why connect-offsets should have 50 partitions and connect-status 10.
Upvotes: 0
Views: 3073
Reputation: 1856
This is just a guess, but partitions spread the load on a topic.
I don't know the exact function of each of these topics, but if I had to guess, I'd say that configs is probably not accessed continuously, since it looks like it stores configurations. status is most likely updated periodically, but not as often as offsets. And offsets is updated by source connectors all the time.
So maybe the documentation creates the topics with these numbers of partitions based on the expected load, and sets them explicitly so as not to rely on the default number of partitions when creating a topic.
For example, here it says:
offset.storage.partitions
The number of partitions used when Connect creates the topic used to store connector offsets. A large value (e.g., 25 or 50, just like Kafka’s built-in __consumer_offsets topic) is necessary to support large Kafka Connect clusters.
and
status.storage.partitions
The number of partitions used when Connect creates the topic used to store connector and task status updates.
Default: 5
So the status topic defaults to a small number of partitions.
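If you let Connect create the internal topics itself instead of running kafka-topics by hand, the same choices can be expressed in the distributed worker properties. This is only a sketch: the topic names are the ones from the question, and the partition and replication values are illustrative assumptions to tune for your cluster, not recommendations from the documentation.

```properties
# Config topic: always a single partition, so there is no partitions setting for it
config.storage.topic=connect-configs
config.storage.replication.factor=3

# Offsets topic: written to frequently, so it gets the most partitions
# (the value here is an assumption; the docs quoted above mention 25 or 50)
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
offset.storage.partitions=25

# Status topic: updated less often, so a small number is usually enough
status.storage.topic=connect-status
status.storage.replication.factor=3
status.storage.partitions=5
```

These partition settings only apply when Connect creates the topics; if the topics already exist (for example, created with the commands in the question), the existing partition counts are kept.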
Upvotes: 1