femibyte

Reputation: 3507

Producer-consumer processing pattern for Kafka processing

I'm implementing a streaming pipeline that resembles the illustration below:

*K-topic1* ---> processor1 ---> *K-topic2* ---> processor2 ---> *K-topic3* ---> processor3 ---> *K-topic4*

The K-topic components represent Kafka topics, and the processor components represent code (Python/Java).

For the processor component, the intention is to read/consume data from the topic, perform some processing/ETL on it, and persist the results to the next topic in the chain as well as to a persistent store such as S3.

I have a question regarding the design approach.

The way I see it, each processor component should encapsulate both consumer and producer functionality.

Would the best approach be to have a Processor module/class that contains both a KafkaConsumer and a KafkaProducer? Most examples I've seen so far keep the consumer and producer as separate components that are run independently, which would mean running twice as many components compared to encapsulating the producer and consumer within each Processor object. A rough sketch of what I have in mind follows below.
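Roughly, something like this (a minimal Java sketch using the plain kafka-clients API; class, topic, and method names are just for illustration, and serializers/deserializers would be set in the passed-in properties):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// One "processor" that owns both a consumer and a producer:
// consume from an input topic, transform, publish to the next topic.
public class Processor {

    private final KafkaConsumer<String, String> consumer;
    private final KafkaProducer<String, String> producer;
    private final String outputTopic;

    public Processor(Properties consumerProps, Properties producerProps,
                     String inputTopic, String outputTopic) {
        this.consumer = new KafkaConsumer<>(consumerProps);
        this.producer = new KafkaProducer<>(producerProps);
        this.outputTopic = outputTopic;
        consumer.subscribe(Collections.singletonList(inputTopic));
    }

    // Placeholder ETL step; the real transformation would go here.
    protected String transform(String value) {
        return value.toUpperCase();
    }

    public void run() {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                String transformed = transform(record.value());
                producer.send(new ProducerRecord<>(outputTopic, record.key(), transformed));
            }
        }
    }
}
```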

Any suggestions/references are welcome.

This question is different from

Designing a component both producer and consumer in Kafka

as that question specifically mentions using Samza, which is not the case here.

Upvotes: 0

Views: 504

Answers (1)

OneCricketeer

Reputation: 192023

the intention is to read/consume data from the topic, perform some processing/ETL on it, and persist the results to the next topic in the chain

This is exactly the strength of Kafka Streams and/or KSQL. You could use the Processor API, but from what you describe, I think you will only need the Streams DSL API.
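As a minimal sketch, the first hop of your chain (K-topic1 -> processor1 -> K-topic2) could look like this with the Streams DSL; the topic names and the transformation are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class Processor1 {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "processor1");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("K-topic1");

        // ETL step: replace with the real transformation logic.
        input.mapValues(value -> value.toUpperCase())
             .to("K-topic2");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Each processor in the chain would be a small Streams application like this, so the consumer/producer plumbing, offset management, and scaling are handled by the framework rather than by your own wrapper class.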

persist the results to the next topic in the chain as well as to a persistent store such as S3.

From the output topic, you can use a Kafka Connect sink to get the topic data into these other external systems. There is no need to write your own consumer to do this.
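For example, a rough sketch of a Connect configuration, assuming the Confluent S3 sink connector (the bucket, region, topic, and flush size below are placeholders):

```properties
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=K-topic4
s3.bucket.name=my-bucket
s3.region=us-east-1
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
flush.size=1000
```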

Upvotes: 1
