Otto

Reputation: 3294

Why does the Kafka Connect internal topic connect-offsets have 50 partitions and connect-status have 10?

According to the Kafka Connect documentation:

https://docs.confluent.io/5.3.3/connect/userguide.html#distributed-mode

config.storage.topic=connect-configs

bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-configs --replication-factor 3 --partitions 1 --config cleanup.policy=compact

offset.storage.topic=connect-offsets

bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-offsets --replication-factor 3 --partitions 50 --config cleanup.policy=compact

status.storage.topic=connect-status

bin/kafka-topics --create --zookeeper localhost:2181 --topic connect-status --replication-factor 3 --partitions 10 --config cleanup.policy=compact

I understand why connect-configs has only one partition: it has to be a single partition. But I don't understand, and cannot find any information on, why connect-offsets should have 50 partitions and connect-status 10.

Upvotes: 0

Views: 3073

Answers (1)

Gerard Garcia

Reputation: 1856

This is just a guess, but partitions spread the load on a topic.

I don't know the exact function of each of these topics, but if I had to guess, I'd say that configs is probably not accessed continuously, since it looks like it stores configurations. status is most likely updated periodically, but not as often as offsets. And offsets is updated by source connectors all the time.

So maybe the documentation creates the topics with these partition counts based on expected load, and it sets them explicitly so as not to rely on the default number of partitions when creating a topic.
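To make the "partitions spread the load" idea concrete, here is a minimal sketch of how keyed records fan out across partitions. It is an illustration only: the keys are hypothetical, and CRC32 is a stand-in hash (Kafka's default partitioner uses murmur2), so the exact distribution will differ from a real cluster.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def spread(keys, num_partitions):
    # Count how many keys land on each partition.
    return Counter(partition_for(k, num_partitions) for k in keys)


# Hypothetical record keys, not the real key format of any Connect topic.
keys = [f"connector-{i}" for i in range(1000)]

one = spread(keys, 1)      # every key lands on partition 0
fifty = spread(keys, 50)   # keys are divided among up to 50 partitions

print("busiest partition with 1 partition:  ", max(one.values()))
print("busiest partition with 50 partitions:", max(fifty.values()))
```

With a single partition, every write goes to the same place. With 50, the busiest partition only sees a fraction of the keys, which fits the guess that the most frequently updated topic gets the most partitions.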

For example, the documentation says:

offset.storage.partitions

The number of partitions used when Connect creates the topic used to store connector offsets. A large value (e.g., 25 or 50, just like Kafka’s built-in __consumer_offsets topic) is necessary to support large Kafka Connect clusters.

and

status.storage.partitions

The number of partitions used when Connect creates the topic used to store connector and task status updates.

Default: 5

which defaults to a small number.
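As a sketch, the partition settings quoted above would sit in a distributed worker's properties file roughly like this. The topic names and the values 50, 10, and 3 simply mirror the commands in the question and are not recommendations. These settings should only matter when Connect creates the topics itself; topics created by hand with kafka-topics keep whatever partition count they were given.

```properties
# Internal topic names (same as in the question)
config.storage.topic=connect-configs
offset.storage.topic=connect-offsets
status.storage.topic=connect-status

# Used only when Connect auto-creates the topics
offset.storage.partitions=50
status.storage.partitions=10

# Replication factor 3, mirroring the kafka-topics commands in the question
config.storage.replication.factor=3
offset.storage.replication.factor=3
status.storage.replication.factor=3
```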

Upvotes: 1
