sc so
sc so

Reputation: 333

Having multi-threaded Kafka Consumer per partition, is it possible and recommended, if so any sample snippet?

We are using Kafka 0.9 version and are having large number of messages pushed to specific partition within a kafka topic. And there are multiple such partitions within this topic. We have one consumer assigned per partition within this topic, and we are maintaining offset manually within the topic-partition in an outside datastore. I wanted to know if we start getting really large number of messages in a topic-partition, is it possible to have consumer dealing with a topic-partition to be multi-threaded. Because it might not be possible that a consumer instance assigned to a partition is able to finish processing all the records in the time-span we want it finish. Is it possible to have such kind of multi-threaded consumer with a partition. Is it recommended? Also if the answer is YES, then how multiple threads can do offset management, because all these threads are dealing with messages within the same partition. Any sample snippet available?

Please note: I am asking about "consumer dealing with a single partition within a topic", I am not looking at the consumer group for a topic across partitions in it.

Upvotes: 3

Views: 4178

Answers (1)

Nautilus
Nautilus

Reputation: 2286

What you can do, is have 1 Thread handling the consumption of messages for that partition and a bunch of workers processing this messages. There are a few problems that you face with this solution (for example the process of the messages may not be in order) and also fails and retries you need to handle them separately because the offset commit will be done by the consumer when (1) it hands the messages to the workers or (2) the workers will have to notify the consumer when the processing is finish so the consumer can keep a list (ordered by offset asc) and the offset would be commited only when its the next offset after the last commited offset (and removed from the list). This solution has the downside that there can be several offsets waiting to commit because there is a slow worker(or a heavy message) and if the consumer goes down next time you will reprocess messages (either way there are workarounds for this).

Hope it helps!

Upvotes: 3

Related Questions