Reputation: 9237
I am looking into the Transactional producer that Kafka supports and the Exactly-Once processing described in these 2 links: 1) https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/ 2) https://www.confluent.io/blog/transactions-apache-kafka/
It seems a very elegant solution for streaming scenarios where the consumer polls a record (from topic A, for example), processes it, and publishes to multiple output topics (say B and C) as well as a change-log topic. As described in the links, this can happen atomically if the producer's transaction API is used correctly.
Unfortunately, the definition of the Kafka producer includes the types of the key and value: IProducer<TKey, TValue> (https://docs.confluent.io/5.5.0/clients/confluent-kafka-dotnet/api/Confluent.Kafka.IProducer-2.html). If we have multiple schema definitions for the various records in our topics, we have a problem: we need to define one producer per schema. Given that choice, we can't have an atomic transaction that spans multiple output sink topics.
Unless I am missing something, Kafka's transactional support for the producer seems very restrictive. In production, it is very practical to define schemas in Schema Registry to deal with schema evolution and forward and backward compatibility.
I thought of using IProducer<byte[], byte[]> for the transactional producer and serializing/deserializing the messages myself, but I am not sure if this is the best way to go about it. Is this the only viable solution with the current state of Kafka transaction support?
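To make the IProducer<byte[], byte[]> idea concrete, here is a minimal sketch, assuming the Confluent.Kafka NuGet package and System.Text.Json. The class name, the ToBytes helper, and the topic names "topic-B" and "topic-C" are hypothetical; ToBytes only stands in for whatever Schema Registry aware serializer would be used in practice.

```csharp
using System;
using System.Text;
using System.Text.Json;
using Confluent.Kafka;

public static class TransactionalBytesSketch
{
    // Hypothetical stand-in for a schema-aware serializer:
    // JSON-encodes the value using its runtime type.
    public static byte[] ToBytes(object value) =>
        JsonSerializer.SerializeToUtf8Bytes(value, value.GetType());

    // Builds a byte-oriented transactional producer.
    // Both arguments are supplied by the caller (assumed values).
    public static IProducer<byte[], byte[]> CreateProducer(
        string bootstrapServers, string transactionalId)
    {
        var config = new ProducerConfig
        {
            BootstrapServers = bootstrapServers,
            TransactionalId = transactionalId
        };
        var producer = new ProducerBuilder<byte[], byte[]>(config).Build();
        producer.InitTransactions(TimeSpan.FromSeconds(30));
        return producer;
    }

    // Writes two differently typed records to two topics in one transaction.
    public static void PublishAtomically(
        IProducer<byte[], byte[]> producer, string key,
        object forTopicB, object forTopicC)
    {
        var keyBytes = Encoding.UTF8.GetBytes(key);
        producer.BeginTransaction();
        try
        {
            producer.Produce("topic-B",
                new Message<byte[], byte[]> { Key = keyBytes, Value = ToBytes(forTopicB) });
            producer.Produce("topic-C",
                new Message<byte[], byte[]> { Key = keyBytes, Value = ToBytes(forTopicC) });
            producer.CommitTransaction(TimeSpan.FromSeconds(30));
        }
        catch (KafkaException)
        {
            // Simplified error handling: abort and let the caller decide.
            producer.AbortTransaction(TimeSpan.FromSeconds(30));
            throw;
        }
    }
}
```

A full consume-process-produce loop would also need to commit the consumed offsets inside the same transaction (SendOffsetsToTransaction); that part is left out of this sketch.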
Thanks in advance
Upvotes: 2
Views: 550
Reputation: 9357
If you have different types of data, you can write a common serializer that automatically serializes each data item according to the topic it is being sent to.
ISerializer<T> has the following method:
byte[] Serialize(T data, SerializationContext context)
The SerializationContext contains the Topic property. (Reference)
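The topic-based approach could be sketched as follows, assuming the Confluent.Kafka NuGet package. TopicRoutingSerializer and the per-topic delegate map are hypothetical names introduced for illustration; the delegates are where schema-specific serializers would be plugged in.

```csharp
using System;
using System.Collections.Generic;
using Confluent.Kafka;

// Routes each value to a serializer chosen by the destination topic,
// so a single IProducer<TKey, object> can write to several topics.
public sealed class TopicRoutingSerializer : ISerializer<object>
{
    private readonly IReadOnlyDictionary<string, Func<object, byte[]>> _byTopic;

    public TopicRoutingSerializer(
        IReadOnlyDictionary<string, Func<object, byte[]>> byTopic)
    {
        _byTopic = byTopic ?? throw new ArgumentNullException(nameof(byTopic));
    }

    public byte[] Serialize(object data, SerializationContext context)
    {
        // A null value is passed through as a null payload.
        if (data == null) return null;

        if (!_byTopic.TryGetValue(context.Topic, out var serialize))
        {
            throw new InvalidOperationException(
                $"No serializer registered for topic '{context.Topic}'.");
        }
        return serialize(data);
    }
}
```

An instance can be attached with ProducerBuilder<string, object>.SetValueSerializer(...), which gives one producer, and therefore one transaction, for all output topics.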
Alternatively, you can use Headers to store the information needed to decide how each message should be serialized.
I don't know .NET, but I suppose you can write a wrapper class around the types you want to produce, with the actual object (the data) exposed as a property of that class. The serializer can then retrieve the actual object and serialize it according to its type, the topic name, or some information in the headers.
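A minimal sketch of the wrapper idea, again assuming the Confluent.Kafka NuGet package. Envelope, EnvelopeSerializer, and Register are hypothetical names; this variant dispatches on the runtime type of the wrapped object, and the topic or headers from SerializationContext could be consulted in the same place.

```csharp
using System;
using System.Collections.Generic;
using Confluent.Kafka;

// Wrapper carrying the actual object to produce.
public sealed class Envelope
{
    public Envelope(object payload)
    {
        Payload = payload ?? throw new ArgumentNullException(nameof(payload));
    }

    public object Payload { get; }
}

// Serializes an Envelope by looking up a serializer for the payload's type.
public sealed class EnvelopeSerializer : ISerializer<Envelope>
{
    private readonly Dictionary<Type, Func<object, byte[]>> _byType =
        new Dictionary<Type, Func<object, byte[]>>();

    // Registers the serializer to use for payloads of exactly type T.
    public EnvelopeSerializer Register<T>(Func<T, byte[]> serialize)
    {
        _byType[typeof(T)] = payload => serialize((T)payload);
        return this;
    }

    public byte[] Serialize(Envelope data, SerializationContext context)
    {
        if (data == null) return null;

        var type = data.Payload.GetType();
        if (!_byType.TryGetValue(type, out var serialize))
        {
            throw new InvalidOperationException(
                $"No serializer registered for type '{type.FullName}'.");
        }
        return serialize(data.Payload);
    }
}
```

With this in place, a single IProducer<string, Envelope> can carry every record type, at the cost of losing compile-time checking of which type goes to which topic.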
Upvotes: 1