user2105282

Reputation: 804

Kafka use case to send data to external system

Studying the Kafka documentation, I found the following sentence:

Queuing is the standard messaging type that most people think of: messages are produced by one part of an application and consumed by another part of that same application. Other applications aren't interested in these messages, because they're for coordinating the actions or state of a single system. This type of message is used for sending out emails, distributing data sets that are computed by another online application, or coordinating with a backend component.

This suggests that Kafka topics aren't suitable for streaming data to external applications. However, in our application we use Kafka for exactly that purpose: we have consumers that read messages from Kafka topics and try to send them to an external system. With this approach we face a number of problems:

  1. We need a separate topic for each external application (with more than 300 external applications, this doesn't scale well).

  2. Sending a message to an external system can fail when the external application is unavailable, or for some other reason. It is incorrect to keep retrying the same message and never commit the offset. On the other hand, there is no well-structured log where I can see all failed messages and try to resend them.

What are best-practice approaches for streaming data to an external application? Or is Kafka even a good choice for this purpose?

Upvotes: 0

Views: 1724

Answers (1)

Katya Gorshkova

Reputation: 1561

Just sharing a piece of experience. We use Kafka extensively for integrating external applications in the enterprise landscape.

  1. We use a topic-per-event-type pattern. The current number of topics is about 500. Governance is difficult, but we have our own utility tool, so it is feasible.
  2. Where possible, we extend the external application to integrate with Kafka directly. The consumers then become part of the external application, and when the application is unavailable they simply don't pull data. If extending the external system is not possible, we use connectors, which are mostly implemented by us internally. We distinguish two types of errors: recoverable and non-recoverable. If the error is non-recoverable, for example the message is corrupted or invalid, we log the error and commit the offset. If the error is recoverable, for example the database the message is written to is unavailable, we do not commit the offset, suspend the consumers for some period of time, and try again after that period. In your case it probably makes sense to have more topics with different behavior (logging errors, rerouting failed messages to different topics, and so on).
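The commit-or-suspend logic in item 2 can be sketched in broker-free Python. This is a minimal illustration, not our actual connector code: `deliver`, `NonRecoverableError`, and `RecoverableError` are hypothetical names, the `committed` list stands in for committed offsets, and `dead_letter` stands in for a failed-messages log or topic. A real consumer would use the Kafka client's commit and pause/resume calls instead.

```python
import time

class NonRecoverableError(Exception):
    """E.g. a corrupted or invalid message -- retrying cannot help."""

class RecoverableError(Exception):
    """E.g. the target system is temporarily unavailable."""

def process_records(records, deliver, suspend_seconds=5, max_retries=3):
    """Deliver records to an external system, advancing past a record only
    when it is delivered or its failure is non-recoverable."""
    committed = []    # stands in for committed offsets
    dead_letter = []  # stands in for a failed-messages log / topic
    for record in records:
        attempts = 0
        while True:
            try:
                deliver(record)
                committed.append(record)
                break
            except NonRecoverableError:
                # Corrupt/invalid message: log it and commit, so the
                # consumer is not stuck retrying forever.
                dead_letter.append(record)
                committed.append(record)
                break
            except RecoverableError:
                attempts += 1
                if attempts > max_retries:
                    # Give up for now; do not commit, so the record is
                    # re-read on the next poll.
                    return committed, dead_letter
                # Suspend the consumer briefly before retrying.
                time.sleep(suspend_seconds)
    return committed, dead_letter
```

The key point is that a non-recoverable failure still advances the offset (after being recorded somewhere inspectable), while a recoverable failure pauses without committing, so nothing is lost when the external system comes back.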

Upvotes: 1

Related Questions