AaronHS

Reputation: 1352

Handling large amounts of denormalized read model updates in CQRS

I'm designing a CQRS event-sourced system (not my first) where my read models are denormalized and stored in a read-optimized document database (MongoDB). Nothing special. This particular read model is a document that contains a user id and a potentially large array of groups that the user is a member of:

{
  "userId": 1,
  "userName": "aaron",
  "groups": [
    {
      "groupId": 1,
      "name": "group 1"
    },
    {
      "groupId": 2,
      "name": "group 2"
    }
  ]
}

There could be tens of thousands of users who are members of a single group (just as one example: imagine a group that every staff member belongs to).

Keep in mind that the reason I'm using CQRS in the first place is that I need to scale my reads (or rather, handle my reads differently given the need to avoid lots of joins), and I'm expecting a significant volume of writes. This isn't the only reason I'm using CQRS and event-sourcing, but it is one major catalyst.

Now the problem I have is that when someone updates a group name (which I'm predicting will happen quite frequently), my read model needs updating. This means a single user modification to a single piece of data is going to cause tens of thousands of updates in my read store.
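To make the fan-out concrete, below is roughly what the denormalizer would have to issue against the read store on a group rename. This is just a sketch using the Node.js MongoDB driver; the event shape, collection names, and handler name are illustrative assumptions on my part, not something fixed in my design:

import { MongoClient } from "mongodb";

// Hypothetical event shape, for illustration only.
interface GroupRenamed {
  groupId: number;
  newName: string;
}

async function onGroupRenamed(client: MongoClient, event: GroupRenamed) {
  const users = client.db("readmodels").collection("users");

  // A single command, but the server still rewrites every matching
  // user document -- potentially tens of thousands for a popular group.
  await users.updateMany(
    { "groups.groupId": event.groupId },
    { $set: { "groups.$[g].name": event.newName } },
    { arrayFilters: [{ "g.groupId": event.groupId }] }
  );
}

It is one round trip, but the write amplification happens server-side all the same, which is exactly the load I'm worried about.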

I am well aware of the techniques I can apply to dispatch the update and avoid temporal coupling; however, I am concerned about the number of documents that will be updated per single user modification.

I've read several SO answers that ask this exact type of question, and most answers suggest that you either need to strike a balance or stop worrying about the mass updates. But IMO, neither is really an option here. There is no balance to be had in this type of read model (any remodelling of the document would still require the group name to appear just as many times, however it's remodelled), and simply accepting the mass updates is counter-productive to the idea of a super-fast read store, as it will now be under severe load from the constant stream of queued-up updates. Essentially, the denormalizing process is going to bottleneck, the queue is going to grow over time (until there is some respite from users updating group names), and reading will become slow as a side effect.

Before anyone jumps on me and asks whether I know this bottleneck will occur, the answer is "it should, but obviously I can't be sure". But based on how many changes are made in the existing system I'm replacing, and keeping in mind that this is not the only type of model in the document database that will require updating, I have pretty good cause for concern. As I said, there are several other read models that may not see the same number of updates, but they will nevertheless add to the write load on the read store. And the read store can only take so many writes.

I can think of two solutions (one dumb, one not so dumb):

  1. Store a version in each document, and don't update the read model when an event occurs. Then when a read occurs for a particular document, I check for staleness, and if the version is stale (due to a preceding command), I apply the latest change to that document before storing and returning it (see the sketch after this list). However, my instinct tells me that eventually every document is going to get updated regardless, and this just adds overhead to the read. I also have no idea how the versioning would actually work.

  2. Use a relational read model and have the single join. This seems the most sensible option, as I would just update the join table and all is good. But the reads wouldn't be as fast, and it feels inferior to a pure select * from tablename approach.
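To illustrate option 1, here is a minimal sketch of how I imagine the staleness check might work, assuming each group carries a monotonically increasing version and every embedded group entry records the version it was last denormalized from. The names and the separate groups collection are assumptions I'm making purely for the sake of the example:

import { Collection } from "mongodb";

interface GroupRef { groupId: number; name: string; version: number }
interface UserDoc { userId: number; userName: string; groups: GroupRef[] }
interface GroupDoc { groupId: number; name: string; version: number }

// On read: compare each embedded group against the current group version,
// repair any stale entries, persist the repair, then return the document.
async function readUser(
  users: Collection<UserDoc>,
  groups: Collection<GroupDoc>,
  userId: number
): Promise<UserDoc | null> {
  const user = await users.findOne({ userId });
  if (!user) return null;

  const current = await groups
    .find({ groupId: { $in: user.groups.map(g => g.groupId) } })
    .toArray();
  const byId = new Map<number, GroupDoc>();
  for (const g of current) byId.set(g.groupId, g);

  let dirty = false;
  for (const ref of user.groups) {
    const latest = byId.get(ref.groupId);
    if (latest && latest.version > ref.version) {
      ref.name = latest.name;
      ref.version = latest.version;
      dirty = true;
    }
  }
  if (dirty) {
    await users.updateOne({ userId }, { $set: { groups: user.groups } });
  }
  return user;
}

Which, as far as I can tell, confirms my instinct: the staleness check is effectively the join I was trying to avoid, just moved onto the read path.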

My question:

Are there any standard techniques for combating this type of issue? Is the second option I offered simply the best I can hope for?

I would honestly have thought that this type of problem would occur all the time in CQRS event-sourced systems, where denormalized data needs to be kept in sync, but there seems to be a lack of discussion about it in the community, which leads me to believe I'm missing an obvious solution or my read model needs improvement.

Upvotes: 3

Views: 758

Answers (1)

Alexey Zimarev

Reputation: 19630

I think when a single group change can fan out to tens of thousands of user documents, the model you have chosen is wrong. You need to remove the list of groups from the user document and stick to the relational model, keeping only group ids. Imagine your groups needing more attributes than a name: you would face the same issue again. And again.
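Concretely, that might look like this in MongoDB: the user document keeps only group ids, each group name lives once in its own collection, and reads resolve names with a $lookup. A rough sketch; the collection and field names are only an example:

import { Collection } from "mongodb";

// User documents now reference groups by id only:
//   { "userId": 1, "userName": "aaron", "groupIds": [1, 2] }
// Renaming a group becomes a single-document update in "groups".

async function readUserWithGroups(users: Collection, userId: number) {
  return users
    .aggregate([
      { $match: { userId } },
      {
        $lookup: {
          from: "groups",            // each group name is stored exactly once
          localField: "groupIds",    // $lookup matches each element of the array
          foreignField: "groupId",
          as: "groups",
        },
      },
    ])
    .toArray();
}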

Upvotes: 5
