alext

Reputation: 802

Kafka Consumer Producer API, Kafka Streams and Exactly Once Processing

I have been learning Kafka for the past month, and although there are many articles and videos on the topic, I still do not understand exactly-once processing well enough.
Let me sum up what I have understood so far:

  1. Kafka Streams can be used for ETL or analytics. Does this mean we can write Java code that does data enrichment and transformations, and that this code is plugged into Kafka itself?
  2. With the Kafka Consumer and Producer APIs, exactly-once processing is only possible if we read a message from one topic and then write to another topic in the same cluster. This works through Kafka transactions, which commit the consumer offset and make the message visible on the output topic together. As soon as we introduce a third-party system (like Redis, or a database such as Postgres or MySQL), this is no longer possible, because a failure can occur at any time (worst-case scenario). If we want to handle this, we would have to use the inbox pattern: we read the message from the topic and write it to the database, and then it gets processed by, say, a worker. We would also need to manage the offsets ourselves.
  3. Building on 2): how does Kafka Streams fit into the whole picture? I assume that if it fits into Kafka itself, processing in Kafka Streams would support transactions. But this still does not help with writing to a third-party system (like Redis, a database, etc.).
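
To make the inbox pattern from point 2 concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: `InboxSketch` and its methods are hypothetical names, and the in-memory map stands in for a database table with a unique constraint on (topic, partition, offset). The point it shows is that if Kafka redelivers a record after a failure, the duplicate is detected by its key and ignored, so at-least-once delivery plus an idempotent write behaves like exactly-once from the database's point of view.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for an inbox table. A real inbox would be a
// database table with a unique constraint on (topic, partition, offset).
class InboxSketch {
    // Maps "topic-partition-offset" to the message payload.
    private final Map<String, String> inbox = new HashMap<>();

    // Returns true if the record was newly stored, false if the same
    // topic/partition/offset was already present (a redelivery).
    boolean store(String topic, int partition, long offset, String payload) {
        String key = topic + "-" + partition + "-" + offset;
        return inbox.putIfAbsent(key, payload) == null;
    }

    // Number of distinct records currently in the inbox.
    int size() {
        return inbox.size();
    }

    public static void main(String[] args) {
        InboxSketch inbox = new InboxSketch();
        // First delivery is stored.
        System.out.println(inbox.store("orders", 0, 42L, "order-created")); // prints "true"
        // Redelivery of the same offset (e.g. after a crash before the offset commit) is ignored.
        System.out.println(inbox.store("orders", 0, 42L, "order-created")); // prints "false"
        System.out.println(inbox.size()); // prints "1"
    }
}
```

In a real setup the insert into the inbox table and any related business write would go into one database transaction, and the worker mentioned above would read from that table rather than from Kafka.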

Am I understanding things correctly?
I would be glad to understand this better so I can make better decisions in the future.
Best regards

Upvotes: 1

Views: 411

Answers (1)

OneCricketeer

Reputation: 191681

Kafka Streams wraps the consumer and producer APIs; it therefore cannot offer any stronger semantics than they do.
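
As a rough illustration of that relationship, the sketch below lists the settings involved, using only `java.util.Properties` so it runs without the Kafka client on the classpath. The config keys are the standard Kafka ones; the class name, broker address, group id, application id and transactional id are placeholders I made up, and `exactly_once_v2` assumes reasonably recent Kafka versions (Kafka Streams 3.0 or newer).

```java
import java.util.Properties;

// Hypothetical holder for the settings; all values other than the config
// keys and their standard options are placeholders.
class ExactlyOnceConfigSketch {
    static final String BROKERS = "localhost:9092"; // assumed broker address

    // Producer side of a transactional read-process-write loop. With the real
    // client, this producer would call initTransactions(), beginTransaction(),
    // sendOffsetsToTransaction(...) and commitTransaction().
    static Properties producerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", BROKERS);
        p.setProperty("transactional.id", "etl-worker-1"); // placeholder id
        p.setProperty("enable.idempotence", "true");
        return p;
    }

    // Consumer side: offsets are committed through the producer's transaction,
    // so auto-commit is off, and only committed records are read.
    static Properties consumerProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", BROKERS);
        p.setProperty("group.id", "etl-group"); // placeholder group
        p.setProperty("enable.auto.commit", "false");
        p.setProperty("isolation.level", "read_committed");
        return p;
    }

    // Kafka Streams: a single setting enables the same transactional
    // machinery for the consumers and producers it manages internally.
    static Properties streamsProps() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", BROKERS);
        p.setProperty("application.id", "etl-app"); // placeholder application id
        p.setProperty("processing.guarantee", "exactly_once_v2");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps().getProperty("processing.guarantee")); // prints "exactly_once_v2"
    }
}
```

In both cases the guarantee covers topic-to-topic processing within one cluster; none of these settings extend it to a write into Redis or a database.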

If you want to integrate with external systems, you'd ideally use Kafka Connect, Spark, Flink, etc., rather than wrap your Kafka consumer code in an external database client library's transaction. Or, as you mentioned, you can use the inbox/outbox design patterns.

Upvotes: 2
