Natan
Natan

Reputation: 4958

How to ensure external projections are in sync when using CQRS and EventSourcing?

I'm starting a new application and I want to use cqrs and eventsourcing. I got the idea of replaying events to recreate aggregates and snapshotting to speedup if needed, using in memory models, caching, etc.

My question is regarding large read models I don't want to hold in memory. Suppose I have an application where I sell products, and I want to listen to a stream of events like "ProductRegistered" "ProductSold" and build a table in a relational database that will be used for reporting or integration with another system. Suppose there are lots of records and this table may take from a few seconds to minutes to truncate/rebuild, and the application exports dozens of these projections for multiple purposes.

How does one handle the consistency of the projections in this scenario?

With in-memory data, it's quite simple and fast to replay the events. But I feel that external projections that are kept in disk will be much slower to rebuild.

  1. Should I always start my application with a TRUNCATE TABLE + rebuild for every external projection? This seems impractical to me over time, but I may be worried about a problem I didn't have yet.

  2. Since the table is itself like a snapshot, I could keep a "control table" to tell which event was the last one I handled for that projection, so I can replay only what's needed. But I'm worried about inconsistencies if the application or database crashes. It seems that checking the consistency of the table and rebuilding would be the same, which points to the solution 1 again.

How would you handle that in a way that is maintainable over time? Are there better solutions?

Thank you very much.

Upvotes: 1

Views: 2585

Answers (1)

Phil Sandler
Phil Sandler

Reputation: 28036

One way to handle this is the concept of checkpointing. Essentially either your event stream or your whole system has a version number (checkpoint) that increments with each event.

For each projection, you store the last committed checkpoint that was applied. At startup, you pull events greater than the last checkpoint number that was applied to the projection, and continue building your projection from there. If you need to rebuild your projection, you delete the data AND the checkpoint and rerun the whole stream (or set of streams).

Caution: the last applied checkpoint and the projection's read models need to be persisted in a single transaction to ensure they do not get out of sync.

Upvotes: 6

Related Questions