user1189332
user1189332

Reputation: 1941

Kafka consume from 2 topics and take equal number of messages

I've jumped into a specific requirement and would like to hear people's views and certainly not re-invent the wheel.

I've got 2 Kafka topics - A and B.

A and B would be filled with messages at different ingest rate. For example: A could be filled with 10K messages first and then followed by B. Or in some cases we'd have A and B would be filled with messages at the same time. The ingest process is something we have no control of. It's like a 3rd party upstream system for us.

I need to pick up the messages from these 2 topics and mix them at equal proportion. For example: If the configured size is 50. Then I should pick up 50 from A and 50 from B (or wait until I have it) and then send it off to another kafka topic as 100 (with equal proportions of A and B).

I was wondering what's the best way to solve this? Although I was looking at the join semantics of KStreams and KTables, I'm not quite convinced that this is a valid use case for join (cause there's no key in the message that joins these 2 streams or tables).

Can this be done without Kafka Streams? Vanilla Kafka consumer (perhaps with some batching?) Thoughts?

Upvotes: 1

Views: 438

Answers (1)

Gary Russell
Gary Russell

Reputation: 174769

With Spring, create 2 @KafkaListeners, one for A, one for B; set the container ack mode to MANUAL and add the Acknowledgment to the method signature.

In each listener, accumulate records until you get 50 then pause the listener container (so that Kafka won't send any more, but the consumer stays alive).

You might need to set the max.poll.records to 1 to better control consumption.

When you have 50 in each; combine and send.

Commit the offsets by calling acknowledge() on the last Acknowledgment received in A and B.

Resume the containers.

Repeat.

Deferring the offset commits will avoid record loss in the event of a server crash while you are in the accumulating stage.

When you have lots of messages in both topics, you can skip the pause/resume part.

Upvotes: 3

Related Questions