mofury

Reputation: 725

Parallel Producing and Consuming in Kafka

1. Consuming concurrently on the same topic and same partition

Suppose I have 100 partitions for a given topic (e.g. Purchases). I can easily consume these 100 partitions (e.g. Electronics, Clothing, etc.) in parallel using a consumer group with 100 consumers in it.

However, that assigns one consumer to each subset of the total Purchases data. What if I want to consume just one subset of the data with 100 consumers concurrently? For example, all of my consumers only want to read the Electronics partition of the Purchases topic.

Is there a way for them to consume this partition concurrently?

In general I just want all my consumers to receive the same data set concurrently.

From the information I've gathered, it seems that consumers CANNOT consume from replicas (see "Consuming from a replica").

Can I produce the same data to multiple topics, like Purchase-1[Electronics] and Purchase-2[Electronics], so that I can consume them concurrently? Is this a recommended approach?

2. Producing concurrently on the same topic and same partition

When multiple producers are producing to the same topic and the same partition, we can only write to the partition leader, and replicas are only there for fault tolerance. Does this mean there isn't any concurrency (i.e. each commit must wait in line)?

Upvotes: 1

Views: 3820

Answers (2)

Antony Stubbs

Reputation: 13585

If you want to consume from a single partition in parallel, use something like Parallel Consumer (PC).

By using PC, you can process all your keys in parallel, regardless of how long it takes, and you can be as concurrent as you wish.

PC solves this directly by sub-partitioning the input partitions by key and processing each key in parallel. It also tracks per-record acknowledgement. Check out Parallel Consumer on GitHub (it's open source BTW, and I'm the author).
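To make the idea of sub-partitioning by key concrete, here is a minimal sketch in plain Python. It is not the Parallel Consumer API; `process_by_key`, the sample records, and the handler are hypothetical names made up for illustration. It only assumes that records from one partition arrive as ordered `(key, value)` pairs and that ordering matters per key, not across keys.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def process_by_key(records, handler, max_workers=8):
    """Process records from ONE partition: keys run in parallel,
    records that share a key run sequentially in arrival order."""
    # Sub-partition the batch by key, preserving arrival order per key.
    by_key = defaultdict(list)
    for key, value in records:
        by_key[key].append(value)

    def run_key(key):
        # Sequential within a key, so per-key ordering is kept.
        return [handler(key, value) for value in by_key[key]]

    # One task per key; different keys may run at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {key: pool.submit(run_key, key) for key in by_key}
        return {key: future.result() for key, future in futures.items()}


# Hypothetical records read from a single "Electronics" partition.
records = [("tv", 1), ("phone", 1), ("tv", 2), ("laptop", 1), ("phone", 2)]
results = process_by_key(records, lambda key, value: f"{key}#{value}")
```

In this sketch the degree of parallelism is bounded by the number of distinct keys and by `max_workers`, not by the number of partitions. A real implementation also has to decide when it is safe to commit offsets, which this sketch leaves out.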

Upvotes: 0

vahid

Reputation: 1208

  1. If those 100 consumers belong to different consumer groups, they can consume from the same topic and partition simultaneously. In that case, you need to make sure each consumer is able to handle the load from the 100 partitions.
  2. Producers can produce to the same topic partition at the same time, but the actual order of messages written to the partition is determined by the partition leader.
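Both points can be illustrated with a toy in-memory model. This is not Kafka client code; `PartitionLog`, `append`, `poll`, and the group names are hypothetical and only mimic the behavior described above: a single leader serializes appends, and each consumer group keeps its own read position.

```python
import threading


class PartitionLog:
    """Toy stand-in for one partition leader (not a Kafka API)."""

    def __init__(self):
        self._records = []
        self._lock = threading.Lock()
        self._offsets = {}  # consumer group id -> next offset to read

    def append(self, record):
        # Producers may call this concurrently; the lock plays the role
        # of the leader deciding the final order in the log.
        with self._lock:
            self._records.append(record)
            return len(self._records) - 1  # offset of the new record

    def poll(self, group_id, max_records=10):
        # Each group tracks its own offset, so one group reading
        # does not consume records away from another group.
        with self._lock:
            start = self._offsets.get(group_id, 0)
            batch = self._records[start:start + max_records]
            self._offsets[group_id] = start + len(batch)
            return batch


log = PartitionLog()


def produce(producer_id, count):
    for i in range(count):
        log.append((producer_id, i))


# Two producers write to the same partition at the same time.
producers = [threading.Thread(target=produce, args=(p, 50)) for p in ("A", "B")]
for t in producers:
    t.start()
for t in producers:
    t.join()

# Two different consumer groups each see the full, identical sequence.
seen_by_group_1 = log.poll("group-1", max_records=100)
seen_by_group_2 = log.poll("group-2", max_records=100)
```

How records from A and B interleave can differ from run to run, but each producer's own records stay in the order it sent them, and every group reads the same log.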

Upvotes: 2
