David

Reputation: 123

Matching Kafka consumer and producer partition

I am creating a system where a front-end service pushes messages to a Kafka 'request' topic and listens on another 'response' topic for some downstream back-end consumer (actually a complex system that eventually pushes back to Kafka) to do processing on the 'request' message and eventually push to the 'response' topic.

I am trying to figure out the most elegant way to make sure that the consumer listens on the appropriate partition and receives the response, and that the back-end pushes to the partition that the front-end consumer is listening on. We always need to ensure that the response goes to the same consumer that produced the initial message.

I have three solutions as of now, but none is particularly satisfying. Any thoughts or ideas would be greatly appreciated:

  1. Have each front-end decide which partition it will listen on and pass that partition with the message to the 'request' topic. When the processing on the back-end is finished, it will look at the message's partition member and push to the appropriate partition. An immediate issue here is how to coordinate front-end services so that there's an even distribution on each partition (random assignment?).
  2. Each message has a correlation ID, a GUID, so for each request to our front-end, we could begin listening on a partition chosen by hashing the GUID modulo the total number of partitions, then push the message to the 'request' topic (see the sketch after this list). The back-end would then look at the correlation ID to determine the appropriate partition to push to. An issue here is that for each request that comes in, the front-end must establish a new consumer on a new partition (is there overhead here?) and potentially will have multiple active consumers on the same partition as well as many active consumers across many partitions.
  3. Have a single consumer group with an equal number of consumers and partitions, then go with a similar approach to (1), but allow Kafka to deal with which consumer is on which partition. But then we need to figure out what happens when rebalancing occurs, especially for messages already in flight in the back-end (as potentially all partitions could change?).
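For context on option (2): the core of it is just a deterministic mapping from the GUID to a partition number that both the front-end and the back-end compute. A minimal sketch, where the hash and the partition count are illustrative assumptions rather than anything Kafka mandates:

```java
import java.util.UUID;

public class PartitionFromCorrelationId {
    // Both front-end and back-end must agree on this value and on the hash.
    static final int RESPONSE_PARTITIONS = 12;

    // Map a correlation id deterministically onto a response partition.
    static int partitionFor(UUID correlationId) {
        // Mask the sign bit instead of Math.abs (abs(Integer.MIN_VALUE) is negative).
        return (correlationId.hashCode() & Integer.MAX_VALUE) % RESPONSE_PARTITIONS;
    }

    public static void main(String[] args) {
        UUID correlationId = UUID.randomUUID();
        System.out.println("Listen on response partition " + partitionFor(correlationId));
    }
}
```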

This seems like it should be a common pattern, so I am wondering how others have solved this.

Upvotes: 4

Views: 2023

Answers (3)

GuangshengZuo

Reputation: 4677

An elegant way is to use a partition function in the back-end producer and manual partition assignment (assign) in the front-end consumer, so that the front-end listens only on the partition it is interested in.

In more detail:

In the front-end producer, before you produce the "request" message to the 'request' topic, set the message key to the front-end client id (it needs to be unique).
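A minimal sketch of that producer side, assuming the plain Java kafka-clients API with string serialization; the broker address, topic name, and client id are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FrontEndRequestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        String clientId = "front-end-42";   // must be unique per front-end instance
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key carries the front-end client id so the back-end knows who asked.
            producer.send(new ProducerRecord<>("request", clientId, "{\"payload\":\"...\"}"));
            producer.flush();
        }
    }
}
```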

In the back-end consumer, there is no need for manual partition assignment; just use subscribe to subscribe to the 'request' topic. But worth noting: when you get a 'request' message and process it, do not lose the message key. Keep it, because it identifies where the request came from.
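For example (again plain kafka-clients; the group id and topic name are assumptions):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BackEndRequestConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "back-end-workers");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("request"));  // normal group subscription
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    String frontEndClientId = record.key();            // keep this for the reply
                    process(record.value(), frontEndClientId);
                }
            }
        }
    }

    static void process(String request, String frontEndClientId) {
        // hand off to the actual back-end processing, carrying the client id along
    }
}
```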

In the back-end producer, when the request processing is done, generate a response message and set its key to the front-end client id you kept above. You also need to define a partition function (a hash function mapping a client id to a partition number) and use it when calling send().
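A sketch of that reply path; the hash is illustrative, and the partition count of the 'response' topic is assumed to be known and stable on both sides:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BackEndResponseProducer {
    static final int RESPONSE_PARTITIONS = 12;   // must match what the front-end assumes

    // The shared partition function: client id -> partition number.
    static int partitionFor(String frontEndClientId) {
        return (frontEndClientId.hashCode() & Integer.MAX_VALUE) % RESPONSE_PARTITIONS;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        String frontEndClientId = "front-end-42";   // kept from the request message key
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Target the partition explicitly instead of letting the default partitioner decide.
            producer.send(new ProducerRecord<>("response",
                    partitionFor(frontEndClientId), frontEndClientId, "{\"result\":\"...\"}"));
            producer.flush();
        }
    }
}
```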

In the front-end consumer, use the assign() method to listen on a specific partition. But how do you know which partition to listen on? Just take the client id (it is the same for a given client) and apply the same hash function defined above to calculate the partition number to listen on.
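And the front-end listening side, using assign() with the same hash; no consumer group is involved, so auto-commit is switched off in this sketch:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class FrontEndResponseConsumer {
    static final int RESPONSE_PARTITIONS = 12;   // same value as the back-end

    static int partitionFor(String clientId) {
        return (clientId.hashCode() & Integer.MAX_VALUE) % RESPONSE_PARTITIONS;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("enable.auto.commit", "false");     // no group.id, so do not auto-commit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        String clientId = "front-end-42";             // same id that was sent as the request key
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manual assignment: no consumer group, no rebalancing of this partition.
            consumer.assign(Collections.singletonList(
                    new TopicPartition("response", partitionFor(clientId))));
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    if (clientId.equals(record.key())) {
                        handleResponse(record.value());
                    }
                }
            }
        }
    }

    static void handleResponse(String response) {
        // correlate back to the waiting request inside the front-end
    }
}
```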

Upvotes: 0

Hans Jespersen

Reputation: 8335

Sometimes the response doesn't have to go back to the original requesting application at all: if it's acceptable, you can respond directly to the user by routing the Kafka response messages through a Kafka connector for external delivery via webhooks, WebSocket, email, or SMS text messages back to the original user.

If you are just looking to do SOAP- or REST-style RPC, then use HTTP instead of Kafka, as that's a proven pattern.

Upvotes: 0

Daniel

Reputation: 783

Please do not use consumers with manually assigned partitions. It can get really messy and it is hard to scale.

Instead of partitions you can use a topic per front-end service. Each front-end service produces a message containing its id to the request topic. The back-end then consumes the message and, based on that id, produces a response message to that front-end service's unique response topic. It can be a good solution if you have a constant number of front-end services. A possible disadvantage is creating a new topic every time you want to add a new front-end service; however, it is a lot easier to maintain than manual partition assignment.
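A sketch of the reply path under that scheme, using the Java kafka-clients API; the "response." topic naming convention and the ids are just examples:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicPerFrontEndResponder {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        String frontEndId = "front-end-42";   // taken from the request message
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each front-end service listens on its own response topic, e.g. "response.front-end-42".
            producer.send(new ProducerRecord<>("response." + frontEndId, "{\"result\":\"...\"}"));
            producer.flush();
        }
    }
}
```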

Another possible solution is to use a different tool. If Kafka is not mandatory, rethink your requirements and do some research; there may be a tool that fits your needs better than Kafka.

Upvotes: 5
