Karan Khanna

Reputation: 2137

KafkaConsumer Java API subscribe() vs assign()

I am new to the Kafka Java API and I am working on consuming records from a particular Kafka topic.

I understand that I can use the subscribe() method to start polling records from the topic. Kafka also provides the assign() method if I want to poll records from selected partitions of the topic.

Is this the only difference between the two?

Upvotes: 26

Views: 26042

Answers (2)

Nerm

Reputation: 200

I'd like to add some useful information about consumers that have no group.id. This property has no default (assuming the plain kafka-clients library in Java, with no framework setting it for you). It isn't an official term, but such a consumer is typically called a free consumer. A free consumer cannot subscribe to topics, so it must be assigned topic partitions explicitly.

As the other answer notes, automatic partition assignment, rebalancing, offset persistence, partition exclusivity, consumer heartbeating and failure detection / liveness (everything you get for free with a consumer group) are not available to free consumers. It is therefore up to the client (you) to keep track of any state the application has in relation to Kafka, including offsets (in a Map, for instance). A free consumer cannot commit its offsets to Kafka, so you usually need your own storage mechanism.
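As a minimal sketch of the "keep track of offsets yourself" idea, assuming an in-memory map is good enough (a real application would probably persist this somewhere durable). The class name OffsetTracker and the "topic-partition" string key are made up for illustration; with the real client you would more likely key on TopicPartition and pass the stored value to consumer.seek(...).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka: remembers the next offset to read
// for each topic partition, because a consumer without group.id cannot
// commit offsets to Kafka.
class OffsetTracker {
    private final Map<String, Long> nextOffsets = new HashMap<>();

    private static String key(String topic, int partition) {
        return topic + "-" + partition;
    }

    // Call after a record has been processed. Stores offset + 1, i.e. the
    // position to resume from, and never moves backwards.
    void markProcessed(String topic, int partition, long offset) {
        nextOffsets.merge(key(topic, partition), offset + 1, Long::max);
    }

    // Returns the position to resume from, or the fallback if nothing
    // has been recorded for this partition yet.
    long resumePosition(String topic, int partition, long fallback) {
        return nextOffsets.getOrDefault(key(topic, partition), fallback);
    }
}
```

After processing the record at offset 41 of partition 0, resumePosition(...) for that partition returns 42, which is the value you would seek to on restart.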

Upvotes: 1

Ryuzaki L

Reputation: 40048

Yes, subscribe() needs a group.id, because each consumer in the group is dynamically assigned partitions for the topics passed to subscribe(), and each partition is consumed by only one consumer in that group. This is achieved by balancing the partitions across all members of the consumer group so that each partition is assigned to exactly one consumer in the group.

assign() manually assigns a list of partitions to this consumer. It does not use the consumer's group management functionality, so no group.id is needed.

The main difference is that with assign(Collection) you lose dynamic partition assignment and consumer group coordination.
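To make "each partition is assigned to exactly one consumer in the group" concrete, here is a toy round-robin distribution. It is only an illustration under simplifying assumptions: it is not Kafka's actual assignor code, and the class name ToyAssignor and the member names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the result of a group rebalance; not Kafka's real assignor.
class ToyAssignor {
    // Deals partitions 0..partitionCount-1 out to the members in turn,
    // so every partition ends up with exactly one owner.
    static Map<String, List<Integer>> assign(List<String> members, int partitionCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String member : members) {
            result.put(member, new ArrayList<>());
        }
        if (members.isEmpty()) {
            return result;
        }
        for (int p = 0; p < partitionCount; p++) {
            result.get(members.get(p % members.size())).add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three members share six partitions.
        System.out.println(assign(Arrays.asList("c1", "c2", "c3"), 6));
        // One member leaves: the same six partitions are spread over two.
        System.out.println(assign(Arrays.asList("c1", "c2"), 6));
    }
}
```

The second call stands in for a rebalance after a member leaves: the partitions that belonged to c3 are handed to the remaining members, which is the kind of redistribution assign() opts out of.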

It is also possible for the consumer to manually assign specific partitions (similar to the older "simple" consumer) using assign(Collection). In this case, dynamic partition assignment and consumer group coordination will be disabled.

subscribe

public void subscribe(java.util.Collection<java.lang.String> topics)

The subscribe method subscribes to the given list of topics to get dynamically assigned partitions. If the given list of topics is empty, it is treated the same as unsubscribe().

As part of group management, the consumer will keep track of the list of consumers that belong to a particular group and will trigger a rebalance operation if one of the following events occurs:

The number of partitions changes for any of the subscribed topics
A subscribed topic is created or deleted
An existing member of the consumer group dies
A new member is added to an existing consumer group via the join API
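A rough sketch of subscribe() with a rebalance listener, so the events listed above become visible in the output. It assumes a recent kafka-clients version (2.0 or later, for poll(Duration)) on the classpath; the broker address, group name and topic name are placeholders, not values from the question.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

class SubscribeExample {
    // Consumer settings; broker address and group name are placeholders.
    static Properties props() {
        Properties p = new Properties();
        p.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        p.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group"); // needed for subscribe()
        p.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        p.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        return p;
    }

    public static void main(String[] args) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props())) {
            consumer.subscribe(Collections.singletonList("demo-topic"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println("Revoked: " + partitions);
                }

                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println("Assigned: " + partitions);
                }
            });
            // Partitions are only handed out once poll() has joined the group.
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("%s-%d@%d: %s%n", r.topic(), r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}
```

Starting a second copy of this program with the same group name should cause the first one to print a revoke/assign pair as the partitions are rebalanced between the two.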

assign

public void assign(java.util.Collection<TopicPartition> partitions)

The assign method manually assigns a list of partitions to this consumer. If the given list of topic partitions is empty, it is treated the same as unsubscribe().

Manual topic assignment through this method does not use the consumer's group management functionality. As such, there will be no rebalance operation triggered when group membership or cluster and topic metadata change.
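For comparison, a sketch of manual assignment without any group.id. It assumes a recent kafka-clients version on the classpath; the broker address, topic name and partition numbers are placeholders. Since there are no committed offsets to resume from, the starting position is chosen explicitly (here: the beginning of each partition).

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

class AssignExample {
    // Consumer settings without a group.id; broker address is a placeholder.
    static Properties props() {
        Properties p = new Properties();
        p.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        p.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        p.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        return p;
    }

    // The partitions this consumer reads; nothing will ever rebalance them.
    static List<TopicPartition> partitions() {
        return Arrays.asList(new TopicPartition("demo-topic", 0), new TopicPartition("demo-topic", 1));
    }

    public static void main(String[] args) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props())) {
            consumer.assign(partitions());
            // No group, so no committed offsets: pick the start position ourselves.
            consumer.seekToBeginning(partitions());
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("%s-%d@%d: %s%n", r.topic(), r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}
```

If a second copy of this program is started, both copies read the same two partitions independently, because nothing coordinates them.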

Upvotes: 31
