Reputation: 2166
In an event sourced system, historic data in the form of events is never thrown away. Doing so could result in a corrupted state. Now imagine there is a court ruling, stating some data needs to be deleted (for example, search engines had to delete privacy specific data). How would you achieve this?
Upvotes: 2
Views: 211
Reputation: 57279
That's a really good question.
So far, I've learned of two possibilities.
Easy part first: if you are using event sourcing, then all of your views of your data should be derivable from the events in your event store. Therefore, all of the data that you have stored for reading (caches, screens, projections, reports) can be blown away and regenerated after you scrub the tainted data from the event store.
So you only need to figure out that part.
First, if the tainted data never gets into the store, you don't have to worry about scrubbing it out. For instance, sensitive information can be isolated in a key value store; references to that data in the event store are always by surrogate key. When you need to scrub, the data in the key value store is nuked, you have a bunch of events that point to something no longer readable, and you just need to ensure that your read models can continue to function if the referenced data is not available.
If the data does need to get into the event store -- because it's needed to maintain the integrity of the domain model -- then the idea of "aggregates" may be able to help.
Aggregates is an idea taken from ddd, the basic idea is that your domain can be decomposed into elements that don't need to share data directly. On aggregate never references data within another directly; instead you use indirect references by ID; the ID itself being another surrogate key.
Since these aggregates are isolated from each other, they can have their own event history. In which case you can scrub the tainted data by simply eliminating any aggregates that have been contaminated. You just delete the event streams.
A response like this doesn't put you in a corrupted state, just an inconsistent one. Everything still runs, there's just a bunch of data missing.
There's also the weapon of a "compensating event" available in the toolkit; you might be able to introduce a new stream of events that brings the system back to a consistent state. For example, if scrubbing a bunch of transactions takes the books out of balance, you may be able to publish an event that creates a charge against iCouldTellYouButThen....
Upvotes: 2