sdgfsdh

Exactly once consumer / producer in Kafka .NET?

Suppose I have a Kafka consumer/producer that works as follows:

  1. Consume a message from input topic
  2. Compute a function of the message
  3. Write the result to output topic
  4. Commit the message from input
  5. Repeat (1)

The computation in step (2) might fail, but if it succeeds then it will always produce the same result.

Note that "auto-commit" is disabled.

Here is some code, in case that helps:

#r "nuget: Confluent.Kafka, 2.8.0"

open Confluent.Kafka
open System.Threading
open System.Threading.Tasks

let processor
  (expensiveComputation : string -> Task<int>)
  (consumer : IConsumer<string, string>)
  (producer : IProducer<string, int>)
  (outputTopic : string)
  (ct : CancellationToken)
  =
  task {
    while true do
      // Consume the next message
      let! consumeResult = Task.Run (fun () -> consumer.Consume(ct))

      printfn $"Consumed %A{consumeResult.TopicPartitionOffset}"

      // Compute something (might fail!)
      let! computed = expensiveComputation consumeResult.Message.Value

      // Write the result
      let message = Message<string, int>()

      message.Key <- consumeResult.Message.Key
      message.Value <- computed

      let! deliveryResult = producer.ProduceAsync(outputTopic, message)

      printfn $"Produced %A{deliveryResult.TopicPartitionOffset}"

      // Commit
      consumer.Commit(consumeResult)

      printfn $"Committed %A{consumeResult.TopicPartitionOffset}"
  }

If any of steps 2-4 fails, the process restarts and resumes from the last committed offset, so the same message may be consumed and processed again.

This gives "at least once" processing.

Is it possible to make this "exactly once" processing in Kafka, such that the consumer only commits when the result has been successfully written to the output topic?

Upvotes: 0

Answers (1)

Guru Stron

I have not tried this myself yet, but Kafka has transactional semantics and provides exactly-once guarantees in some cases.
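
Before the caveats below, here is a rough configuration sketch (not something I have run myself) of how the clients could be set up for transactions. The broker address, group id and transactional id are placeholders; TransactionalId, EnableAutoCommit and IsolationLevel are the relevant ProducerConfig/ConsumerConfig settings:

#r "nuget: Confluent.Kafka, 2.8.0"

open System
open Confluent.Kafka

// A producer with a stable TransactionalId can use the transactions API
// (setting it also implies idempotence).
let producerConfig =
  ProducerConfig(
    BootstrapServers = "localhost:9092",    // placeholder
    TransactionalId = "my-processor-tx-id"  // placeholder; should be stable per processor instance
  )

// The consumer must not auto-commit (offsets are committed through the
// producer's transaction) and should only read committed data.
let consumerConfig =
  ConsumerConfig(
    BootstrapServers = "localhost:9092",    // placeholder
    GroupId = "my-processor-group",         // placeholder
    EnableAutoCommit = Nullable false,
    IsolationLevel = Nullable IsolationLevel.ReadCommitted
  )

let producer = ProducerBuilder<string, int>(producerConfig).Build()
let consumer = ConsumerBuilder<string, string>(consumerConfig).Build()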

From the Exactly-Once Semantics Are Possible: Here’s How Kafka Does It blog post:

Note that exactly-once semantics is guaranteed within the scope of Kafka Streams’ internal processing only; for example, if the event streaming app written in Streams makes an RPC call to update some remote stores, or if it uses a customized client to directly read or write to a Kafka topic, the resulting side effects would not be guaranteed exactly once.

Stream processing systems that only rely on external data systems to materialize state support weaker guarantees for exactly-once stream processing. Even when they use Kafka as a source for stream processing and need to recover from a failure, they can only rewind their Kafka offset to reconsume and reprocess messages, but cannot rollback the associated state in an external system, leading to incorrect results when the state update is not idempotent.

There is also an ExactlyOnce example in the .NET Confluent Kafka repo (see the readme, the part about exactly-once processing). In short: you disable auto-commit and use IProducer's transactions API, roughly as sketched below.
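
Applied to the loop in the question, a minimal sketch of that approach could look something like this (untested; the timeouts are arbitrary and the configs from the sketch above are assumed). The key point is that the input offset is committed via SendOffsetsToTransaction inside the same transaction as the produced message, instead of calling consumer.Commit:

open System
open Confluent.Kafka
open System.Threading
open System.Threading.Tasks

let transactionalProcessor
  (expensiveComputation : string -> Task<int>)
  (consumer : IConsumer<string, string>)
  (producer : IProducer<string, int>)   // built from a config with TransactionalId set
  (outputTopic : string)
  (ct : CancellationToken)
  =
  task {
    // One-time setup: registers the transactional id with the broker
    producer.InitTransactions(TimeSpan.FromSeconds 30.)

    while true do
      // Consume the next message
      let! consumeResult = Task.Run (fun () -> consumer.Consume(ct))

      producer.BeginTransaction()

      try
        // Compute something (might fail!)
        let! computed = expensiveComputation consumeResult.Message.Value

        // Write the result as part of the transaction
        let message = Message<string, int>()
        message.Key <- consumeResult.Message.Key
        message.Value <- computed

        let! _ = producer.ProduceAsync(outputTopic, message)

        // Commit the input offset inside the same transaction instead of
        // calling consumer.Commit; the next offset to consume is offset + 1.
        let offsets =
          [ TopicPartitionOffset(consumeResult.TopicPartition, Offset(consumeResult.Offset.Value + 1L)) ]

        producer.SendOffsetsToTransaction(offsets, consumer.ConsumerGroupMetadata, TimeSpan.FromSeconds 30.)

        // Atomically makes both the output message and the offset commit visible
        producer.CommitTransaction()
      with ex ->
        // If anything fails, abort so that neither the output message nor the
        // offset commit becomes visible to read_committed consumers.
        producer.AbortTransaction()
        raise ex
  }

Note that after an abort the consumer's in-memory position has already moved past the failed message, so a real implementation would also need to seek back to the aborted offset (or rely on a restart/rebalance to re-deliver it), and the output topic must be read with read_committed isolation for the guarantee to hold end to end.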

Upvotes: 0
