Reputation: 13
In the Kafka technical literature, it is often touted as a benefit that the event stream can be replayed and state restored. In a Kafka environment, this data stream is kept by default for one week, so I can only recover state within that time slice, and this should be considered during system design. If the business requires a long period of time, the service should explicitly store the state in a KTable. Strictly speaking, the stream is not event sourcing, or it is event sourcing only within a small slice of time, and that chunk of time needs to be communicated to the business (per use case). Am I understanding the idea correctly? Are there technical details I'm missing? Could you share some use cases where replaying a KStream could be a benefit?
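For illustration, this is roughly what I mean by explicitly storing the state in a KTable; it is only a minimal sketch, and the "orders" topic, store name, application id, and bootstrap server are placeholders:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

import java.util.Properties;

public class OrderCountExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-count-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Aggregate the event stream into a KTable backed by a named state store,
        // so the current state outlives the retention of the source topic.
        KTable<String, Long> ordersPerCustomer = builder
                .<String, String>stream("orders")
                .groupByKey()
                .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as(
                        "orders-per-customer-store"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}
```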
Upvotes: 0
Views: 227
Reputation: 19640
Event Sourcing, by definition, means that state is not persisted as such but is restored from events each time an operation is executed on an entity. There is no concept of per-entity streams in Kafka, and it cannot control the entity state version, as messages are simply produced to topics.
What you describe is a reporting model built from events, which is not Event Sourcing; it's just CQRS. Especially when you use Kafka Connect and typed topics with a schema registry, all your "events" would be snapshots of state.
Upvotes: 1
Reputation: 192023
If the business requires a long period of time, the service should explicitly store the state in a KTable
True, though a KTable isn't required. You can source events into Kafka, then use Kafka Connect to dump them into some final destination while old data is removed from the broker.
Alternatively, KTables are built on compacted topics, which retain data indefinitely (the latest record per key is kept, all the way back to the first event ever sent), so no "time window" is necessary.
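As an illustration (not a full recipe), a compacted topic can be created with the AdminClient; the topic name, partition count, replication factor, and bootstrap server below are just placeholders:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Collections;
import java.util.Properties;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // cleanup.policy=compact keeps the latest record per key indefinitely,
            // instead of deleting records once the retention period has passed.
            NewTopic compacted = new NewTopic("customer-state", 3, (short) 1)
                    .configs(Collections.singletonMap(
                            TopicConfig.CLEANUP_POLICY_CONFIG,
                            TopicConfig.CLEANUP_POLICY_COMPACT));

            admin.createTopics(Collections.singletonList(compacted)).all().get();
        }
    }
}
```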
One example of where replay might be beneficial is if the topology of the Streams application changes in any way, or if the schema of your records needs to be modified. See the sketch below.
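As a rough sketch of what such a replay could look like: KafkaStreams#cleanUp() wipes the application's local state stores so they are rebuilt by re-reading the input topics (the committed offsets on the broker also need to be reset, e.g. with the kafka-streams-application-reset tool shipped with Kafka). The topic names and application id are placeholders:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

public class ReplayAfterTopologyChange {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-count-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Placeholder topology; the changed topology or new record schema would go here.
        builder.stream("orders").to("orders-copy");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);

        // Delete the local state directory before starting, so state is rebuilt
        // from the input topics. Remember to reset the broker-side offsets first
        // (kafka-streams-application-reset), otherwise processing resumes from
        // the last committed position instead of replaying from the beginning.
        streams.cleanUp();
        streams.start();
    }
}
```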
Upvotes: 0