Reputation: 805
I need to make Kafka consumers process all the messages with the same ID in each partition at once. For example, consider one topic containing all orders with different types and there are multiple consumer instances subscribing to this topic. How can I run consumers to process all the messages in each partition with the same Id? Because when the orders are produced with that Id, although Kafka guarantees that all same IDs go to the same partition, but each partition may contain different orders. I need to process all the similar orders in each partition at once(not one by one) and once in a while(not as soon as a new message arrives).
Upvotes: 0
Views: 268
Reputation: 191671
As the comments say, you'll need to manually batch your data into "bins per ID", then process those on your own. For example, write each record to a database, group by ID, then iterate/process each batch.
As far as Kafka is concerned, you're required to look at each event "one by one", but this does not require you to "handle them" in that order, unless you care about sequential processing, at least once processing, and in-order offset commits.
There's also no way to get "all unique ids" in any partition without consuming the whole partition end-to-end. You could use Kafka Streams aggregate function to help with this, and punctuate to periodically handle all gathered IDs up to a certain point, as one other solution.
Upvotes: 1