Reputation: 87
i am trying to make my head regarding Kafka consumers and I'd like to know if the following use case can be solved using Kafka.
My use case is basically this one:
I have a stream that I'd like to be consumed in sync by several consumers. In other words, I have a first consumer that starts to consume the stream, then another consumer arrives later. I'd like this second consumer to start to consume the stream at the offset where is currently the first consumer.
I know that I need to have the consumers in two different groups. But it is not clear for me :
Upvotes: 0
Views: 1552
Reputation: 3264
This is not a problematic treated by kafka directly (consumer group is to distribute partitions among members, not to attribute the same offset), but you can do somehting for this. You could simply create an other topic, where consumer1 would post either offset or copy of the message read (so you would need bth consumer and producer for this), and your other synchronized consumer would react against this - of course there ould be some latency for this.
What is your use case behind this? Why can't you consume at different offset? Couldn't you rather having one consumer, which would then dispatch the message read to to different processes, so that they are indeed synchronized? (with no latency)
What do you mean by synchronized: should consumer2 (and 3 and more) only consume the same message than consumer1 (ie can't consume faster, what I assume in both previous solution) While this is possible, it would really be better to know the reason behind this, maybe there is a better way for you to process data
Upvotes: 0
Reputation: 2430
You do not need two different groups, all consumers can check one topic. Or as many as they like, for that matter.
offset
Messages typically are identified by their arrival date, so all the clients need to tell the producer "my last visit was at 10:00, give me all new messages". So all each client needs to keep track of is when which individual topic was checked last.
latency
this is kind of "of scope" at this point. Of course there will be latency, but it depends on the environment, like "how many consumers", "how many topics", "message format" etc.
so can your usecase be solved using kafka
In short: yes. "Can one consumer continue where another has left", the consumers could exchange the latest index between each other, of course that would require some internal synchronization. Kafka itself does not care about consumers, so it will not keep track itself about the latest index. You need to do the work. Another possibility would be to actually consume the messages (like, delete them from queue once consumed), so each time another consumer hits the queue it is guaranteed to receive the messages another consumer left off. Of course that would depend on your usecase, can you actually delete your messages from the queue.
Upvotes: 1