Reputation: 739
The Kafka documentation states that:
Consumers label themselves with a consumer group name, and each message published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.
If all the consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers.
If all the consumer instances have different consumer groups, then this works like publish-subscribe and all messages are broadcast to all consumers.
I have a few questions about this:
1) Why would a published message go to only a single consumer instance within a consumer group? Isn't it the consumers' responsibility to read from the partitions? What does "go" even mean here?
2) Consumers that are interested in particular topics should just read from the partitions they're interested in. What is the relevance of consumer groups?
3) How does this help to realize the queue and publish-subscribe abstractions?
Upvotes: 1
Views: 217
Reputation: 2286
In Kafka, a topic can have multiple partitions. If a consumer group has X consumers, the partitions of that topic are split among them, and each partition is assigned to exactly one consumer in the group. For example, if you have 1 topic with 2 partitions and a consumer group with 2 consumers, each consumer will consume from 1 partition; in the same scenario, if the consumer group has only 1 consumer, that consumer will read from both partitions. So a message "going" to a single consumer instance just means that the one consumer that owns the partition is the one that reads it.
The consumer group is basically how Kafka coordinates the different consumers with the topics and partitions (a broker acts as the group coordinator for each group). If you have 4 consumers in the same CG and 1 crashes, the partitions of the crashed consumer are reassigned to the other available consumers in the same CG, so the data in those partitions still gets processed. If the partitions were not redistributed, some of them would never be read after a consumer crash.
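To make the partition splitting and the rebalance after a crash more concrete, here is a small plain-Python toy model. It does not talk to Kafka at all: `assign_partitions` is a hypothetical helper that uses a simple round-robin rule, whereas real Kafka consumers use a configurable assignment strategy (range, round-robin, sticky, etc.), so the exact ownership may differ in practice.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Toy round-robin assignment (an assumption, not Kafka's actual assignor).

    Every partition gets exactly one owner within the consumer group.
    """
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {c: [] for c in members}
    if not members:
        return assignment  # nobody in the group, so nothing is assigned
    for i, partition in enumerate(sorted(partitions)):
        owner = members[i % len(members)]
        assignment[owner].append(partition)
    return assignment


# 1 topic with 2 partitions, group with 2 consumers: one partition each.
two = assign_partitions([0, 1], ["c1", "c2"])

# Same topic, group with 1 consumer: that consumer reads both partitions.
one = assign_partitions([0, 1], ["c1"])

# 4 consumers, then "c4" crashes: recomputing the assignment over the
# survivors models the rebalance, so partition 3 is not left unread.
before = assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3", "c4"])
after = assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3"])

print(two, one, before, after, sep="\n")
```

The point of the model is only that the partition-to-consumer mapping is recomputed whenever group membership changes, and that within one group a partition never has two owners at the same time.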
If the consumers are in the same CG, then the messages sent to the topic are distributed among them, which gives you the queue behavior. If each of the consumers has a different CG, then they will each get all the messages, which gives you the publish-subscribe behavior.
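A rough simulation of the two cases, again in plain Python with no Kafka client involved. The `deliver` function and the group and consumer names are made up for illustration, and choosing the partition from the message index is an assumption (a real producer picks the partition from the message key or its partitioner strategy).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def deliver(messages: List[str], num_partitions: int,
            groups: Dict[str, List[str]]) -> Dict[Tuple[str, str], List[str]]:
    """Toy model: every group sees every message, but inside a group each
    message is read only by the consumer that owns the message's partition."""
    received: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for group, consumers in groups.items():
        members = sorted(consumers)
        if not members:
            continue  # an empty group reads nothing
        for index, message in enumerate(messages):
            partition = index % num_partitions          # assumed partitioning rule
            owner = members[partition % len(members)]   # one owner per partition
            received[(group, owner)].append(message)
    return dict(received)


messages = ["m0", "m1", "m2", "m3"]

# Queue behavior: both consumers share one group, so the load is split.
same_group = deliver(messages, 2, {"workers": ["a", "b"]})

# Publish-subscribe behavior: each consumer has its own group and gets everything.
separate_groups = deliver(messages, 2, {"g1": ["a"], "g2": ["b"]})

print(same_group)
print(separate_groups)
```

In the first call each message shows up for exactly one of the two consumers; in the second call both consumers see all four messages.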
Hope it's clearer now; the Kafka documentation needs improvement.
Upvotes: 2