Reputation: 473
I have a Kafka cluster that receives messages from a source whenever data changes in that source. In some cases the messages are meant to be processed in the future, so I have two options:
Option 1 is easier to execute, but my question is: is Kafka a durable data store? Has anyone done this sort of eventing with Kafka? Are there any gaping holes in the design?
Upvotes: 3
Views: 4137
Reputation: 465
You can configure the amount of time your messages stay in Kafka (log.retention.hours).
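That setting (log.retention.hours in server.properties) is broker-wide; if you only need long retention on a single topic, you can set retention.ms on that topic instead. Here is a minimal sketch in Scala using the Kafka AdminClient; the topic name, partition/replication counts and the 30-day value are placeholder assumptions, not anything from your setup:

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object CreateDelayedTopic {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

    val admin = AdminClient.create(props)
    try {
      // Keep records on this topic for 30 days; the broker-wide default
      // is controlled by log.retention.hours in server.properties.
      val thirtyDaysMs = 30L * 24 * 60 * 60 * 1000
      val topic = new NewTopic("change-events", 3, 1.toShort)
        .configs(Collections.singletonMap("retention.ms", thirtyDaysMs.toString))

      admin.createTopics(Collections.singletonList(topic)).all().get()
    } finally admin.close()
  }
}
```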
But keep in mind that Kafka is meant to be used as a "real-time buffer" between your producers and your consumers, not as a durable data store. I don't think Kafka+Storm would be the appropriate tool for your use case. Why not just write your messages to a distributed file system and schedule a job (MapReduce, Spark...) to process those events?
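If you go that route, the batch side could be as simple as a periodically scheduled Spark job that picks out the events whose scheduled time has arrived. A rough sketch, assuming the events were dumped to HDFS as JSON with a processAt epoch-millisecond field (the paths and field name are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, unix_timestamp}

// Scheduled batch job: read raw events from HDFS and keep only those
// whose scheduled processing time has already passed.
object DueEventsJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("due-events").getOrCreate()

    val events = spark.read.json("hdfs:///events/raw")

    // unix_timestamp() is in seconds; processAt is assumed to be in milliseconds
    val due = events.filter(col("processAt") <= unix_timestamp() * 1000L)

    // Hand the due events to whatever does the actual processing
    due.write.mode("append").json("hdfs:///events/due")

    spark.stop()
  }
}
```

Your scheduler (cron, Oozie, etc.) decides how often "the future" is checked; you would also need some way to mark or move events that have already been processed so the next run doesn't pick them up again.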
Upvotes: 1