Reputation: 21
There are several applications that have to be integrated, and they need to exchange issues. One of them receives an issue, works on it, and later changes the status of that issue. The other applications that may be involved in the issue should then receive the new information. This continues until the issue reaches the final status, Closed. The problem is that the issues have to be mapped, because the applications do not all support the same data format.
Thanks for your advice
Upvotes: 1
Views: 384
Reputation: 8335
You can do this either way. If you send the whole issue and then publish every subsequent update to the same issue as a Kafka message with a common message key (perhaps a unique issue ID), you can configure your Kafka topic as a compacted topic. The brokers will then automatically delete older copies of the data for each key to save disk space.
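To make the compaction idea concrete, here is a rough sketch of what a compacted topic (`cleanup.policy=compact`) ends up keeping. It is plain Python that only models the semantics, not Kafka client code; the `compact` function, the issue IDs, and the field names are made up for illustration. Real compaction runs in the background, so older copies can remain visible for a while.

```python
from typing import Dict, List, Tuple

# A log record is (key, value); the value is the full issue as a dict.
Record = Tuple[str, dict]


def compact(log: List[Record]) -> List[Record]:
    """Keep only the most recent record for each key, preserving log order."""
    latest_index: Dict[str, int] = {}
    for i, (key, _value) in enumerate(log):
        latest_index[key] = i  # a later record with the same key wins
    return [
        (key, value)
        for i, (key, value) in enumerate(log)
        if latest_index[key] == i
    ]


# Hypothetical issues; every update carries the whole issue.
log = [
    ("ISSUE-1", {"id": "ISSUE-1", "status": "Open", "title": "Login fails"}),
    ("ISSUE-2", {"id": "ISSUE-2", "status": "Open", "title": "Slow report"}),
    ("ISSUE-1", {"id": "ISSUE-1", "status": "In Progress", "title": "Login fails"}),
    ("ISSUE-1", {"id": "ISSUE-1", "status": "Closed", "title": "Login fails"}),
]

compacted = compact(log)
```

After compaction only one record per issue ID is left, and because each record holds the complete issue, a consumer that starts late still gets everything it needs.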
If you choose to send only deltas (changes), you need a retention period long enough that the initial complete record never expires while the issue is still open and receiving updates. The default retention period is 7 days.
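The retention concern is easier to see with a small model of how a consumer rebuilds an issue from deltas. This is only a sketch in plain Python under assumed field names (`title`, `status`, `assignee`); `replay` is a hypothetical helper, not a Kafka API.

```python
from typing import Dict, Iterable, Tuple

# Each event is (issue_id, changes). The first event for an issue carries the
# complete record; later events carry only the fields that changed.
Event = Tuple[str, dict]


def replay(events: Iterable[Event]) -> Dict[str, dict]:
    """Rebuild the current state of every issue by applying events in order."""
    state: Dict[str, dict] = {}
    for issue_id, changes in events:
        state.setdefault(issue_id, {}).update(changes)
    return state


events = [
    ("ISSUE-1", {"title": "Login fails", "status": "Open", "assignee": None}),
    ("ISSUE-1", {"status": "In Progress", "assignee": "alice"}),
    ("ISSUE-1", {"status": "Closed"}),
]

# A consumer that sees the whole history recovers the full issue.
full = replay(events)

# If the initial record has already expired, only the deltas remain and
# fields such as the title can no longer be recovered.
partial = replay(events[1:])
```

For the same reason, a delta-only topic should not be compacted: compaction keeps just the latest record per key, which would throw away the initial record and the earlier deltas.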
Yes. You can do this in Kafka Connect with Single Message Transforms (SMTs), or in Kafka Streams with native Streams code (in Java).
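Whichever tool hosts it, the mapping itself usually boils down to translating each application's format to and from one shared format. Here is a minimal sketch of that logic in plain Python; the two application formats, their field names, and the status codes are all invented for illustration. In Kafka Streams the same logic would be written in Java, for example inside a `mapValues` step.

```python
# Hypothetical status vocabularies for two applications; the real field names
# and values depend on the systems being integrated.
APP_A_TO_CANONICAL_STATUS = {"NEW": "Open", "WIP": "In Progress", "DONE": "Closed"}
CANONICAL_TO_APP_B_STATUS = {"Open": 1, "In Progress": 2, "Closed": 3}


def from_app_a(record: dict) -> dict:
    """Map an application A record to the shared (canonical) issue format."""
    return {
        "id": record["ticketNo"],
        "title": record["summary"],
        "status": APP_A_TO_CANONICAL_STATUS[record["state"]],
    }


def to_app_b(issue: dict) -> dict:
    """Map a canonical issue to the format application B expects."""
    return {
        "issue_id": issue["id"],
        "subject": issue["title"],
        "status_code": CANONICAL_TO_APP_B_STATUS[issue["status"]],
    }


app_a_record = {"ticketNo": "ISSUE-1", "summary": "Login fails", "state": "WIP"}
app_b_record = to_app_b(from_app_a(app_a_record))
```

Going through a canonical format means each application needs only one mapping in and one mapping out, rather than a separate mapping for every pair of applications.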
You can configure Kafka for large messages, but if they are much larger than 5 or 10 MB, it is usually better to follow the claim check pattern: store them outside Kafka and publish only a reference to the externally stored data, so the consumer can retrieve the attachment out of band.
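A claim check can be sketched in a few lines. The example below is plain Python with an in-memory dict standing in for the external store; `check_in`, `check_out`, `blob_store`, and the `MAX_INLINE_BYTES` threshold are assumptions for illustration, and a real setup would write to an object store or file share and publish the resulting message to Kafka.

```python
import hashlib
import json
from typing import Dict

MAX_INLINE_BYTES = 1_000_000  # assumed threshold; tune to your broker limits

# Stand-in for an external store such as an object store or file share.
blob_store: Dict[str, bytes] = {}


def check_in(issue_id: str, attachment: bytes) -> str:
    """Build the message payload, storing large attachments externally."""
    if len(attachment) <= MAX_INLINE_BYTES:
        body = {"id": issue_id, "attachment_hex": attachment.hex()}
    else:
        ref = hashlib.sha256(attachment).hexdigest()
        blob_store[ref] = attachment  # the message carries only the reference
        body = {"id": issue_id, "attachment_ref": ref}
    return json.dumps(body)


def check_out(message: str) -> bytes:
    """Resolve the attachment on the consumer side."""
    body = json.loads(message)
    if "attachment_ref" in body:
        return blob_store[body["attachment_ref"]]
    return bytes.fromhex(body["attachment_hex"])


small = check_in("ISSUE-1", b"log excerpt")
large_payload = b"x" * (MAX_INLINE_BYTES + 1)
large = check_in("ISSUE-2", large_payload)
```

The message for the large attachment stays tiny no matter how big the attachment is, and the consumer fetches the real bytes from the store only when it needs them.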
Upvotes: 0
Reputation: 32090
Yes, it does make sense.
Kafka can do transformations through both the Kafka Streams API and KSQL, which is a streaming SQL engine built on top of Kafka Streams.
Typically Kafka is used for smaller messages. One pattern to consider for larger content is to store it in an object store (e.g. S3, or similar depending on your chosen architecture) and include a pointer to it in your Kafka message.
Upvotes: 1