Reputation: 13272

How to handle hard aggregate-wide constraints in DDD/CQRS?

I'm new to DDD and I'm trying to model and implement a simple CRM system based on DDD, CQRS and event sourcing to get a feel for the paradigm. I have, however, run into some difficulties that I'm not sure how to handle. I'm not sure whether my difficulties stem from not having modeled the domain properly or from missing something else.

For a basic illustration of my problems, consider that my CRM system has the aggregate CustomerAggregate (which seems reasonable to me). The purpose of this aggregate is to make sure each customer is consistent and that its invariants hold up (name is required, social security number must be in the correct format, etc.). So far, all is well.

When the system receives a command to create a new customer, however, it needs to make sure that the social security number of the new customer doesn't already exist (i.e. it must be unique across the system). This is, of course, not an invariant that can be enforced by the CustomerAggregate aggregate since customers don't have any information regarding other customers.

One suggestion I've seen is to handle this kind of constraint in its own aggregate, e.g. SocialSecurityNumberUniqueAggregate. If the social security number is not already registered in the system, the SocialSecurityNumberUniqueAggregate publishes an event (e.g. SocialSecurityNumberOfNewCustomerWasUniqueEvent) which the CustomerAggregate subscribes to and publishes its own event in response to this (e.g. CustomerCreatedEvent). Does this make sense? How would the CustomerAggregate respond to, for example, a missing name or another hard constraint when responding to the SocialSecurityNumberOfNewCustomerWasUniqueEvent?

Upvotes: 1

Answers (2)

Andreas Hütter

Reputation: 3919

Do you also need to verify that the social security number (SSN) is really valid? Or are you just interested in verifying that no other customer aggregate with the same SSN can be created in your CRM system?

If the latter is the case, I would suggest having a CustomerService domain service which performs the whole SSN check by looking up the database (e.g. via a repository) and then creates the new customer aggregate (which again checks its own invariants, as you already mentioned). This whole process - the lookup of the existing SSN and the customer creation - needs to happen within one transaction to ensure consistency. As I consider this domain logic, a domain service is the perfect place for it. It does not hold data by itself but orchestrates the workflow which relates to a business requirement: that no two customers with the same SSN may be created in our CRM.
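As a rough illustration of that suggestion, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names Customer, InMemoryCustomerRepository, DuplicateSsnError and create_customer are hypothetical, the SSN format check assumes a US-style 123-45-6789 layout, and the lock merely stands in for the database transaction that a real implementation would use.

```python
import threading
from dataclasses import dataclass


class DuplicateSsnError(Exception):
    """Raised when another customer already uses the SSN."""


@dataclass(frozen=True)
class Customer:
    name: str
    ssn: str

    def __post_init__(self):
        # Invariants the aggregate can check without knowing other customers.
        if not self.name.strip():
            raise ValueError("name is required")
        parts = self.ssn.split("-")
        # Assumed format for this sketch: 3 digits, 2 digits, 4 digits.
        if [len(p) for p in parts] != [3, 2, 4] or not all(p.isdigit() for p in parts):
            raise ValueError("SSN must look like 123-45-6789")


class InMemoryCustomerRepository:
    """Hypothetical repository; a real one would query the database."""

    def __init__(self):
        self._by_ssn = {}

    def exists_with_ssn(self, ssn):
        return ssn in self._by_ssn

    def add(self, customer):
        self._by_ssn[customer.ssn] = customer


class CustomerService:
    """Domain service: runs the set-wide check, then creates the aggregate."""

    def __init__(self, repository):
        self._repository = repository
        # Stand-in for a transaction: lookup and insert happen as one unit.
        self._lock = threading.Lock()

    def create_customer(self, name, ssn):
        with self._lock:
            if self._repository.exists_with_ssn(ssn):
                raise DuplicateSsnError(ssn)
            customer = Customer(name, ssn)  # aggregate checks its own invariants
            self._repository.add(customer)
            return customer
```

Note that a failed aggregate invariant (for example a missing name) aborts the whole operation before anything is stored, so the uniqueness check and the per-customer checks succeed or fail together.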

If you also need to verify that the social security number is real, you would also need to call another service, I guess, or keep some cached data of SSNs in your CRM. In this case you could additionally have a SocialSecurityNumberService domain service which is injected into the CustomerService. This would just be an interface in the domain layer, but the implementation of this SocialSecurityNumberService interface would then reside in the infrastructure layer, where the access to whatever resource is required is implemented (be it a local cache you build in the background or some API call to another service).
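The split between a domain-layer interface and an infrastructure-layer implementation could look roughly like the following Python sketch. The method name is_real, the class CachedSocialSecurityNumberService and the dictionary returned by create_customer are assumptions for illustration only; the cache is a plain in-memory set here.

```python
from abc import ABC, abstractmethod


class SocialSecurityNumberService(ABC):
    """Domain layer: the only thing the domain knows is this contract."""

    @abstractmethod
    def is_real(self, ssn: str) -> bool:
        """Return True if the SSN is known to the authoritative source."""


class CachedSocialSecurityNumberService(SocialSecurityNumberService):
    """Infrastructure layer: hypothetical implementation backed by a local cache."""

    def __init__(self, known_ssns):
        self._known = frozenset(known_ssns)

    def is_real(self, ssn: str) -> bool:
        return ssn in self._known


class CustomerService:
    """Domain service that receives the SSN service via constructor injection."""

    def __init__(self, ssn_service: SocialSecurityNumberService):
        self._ssn_service = ssn_service
        self._registered = set()  # stand-in for the repository lookup

    def create_customer(self, name: str, ssn: str) -> dict:
        if not self._ssn_service.is_real(ssn):
            raise ValueError("SSN is not known to the external source")
        if ssn in self._registered:
            raise ValueError("SSN is already registered")
        self._registered.add(ssn)
        return {"name": name, "ssn": ssn}
```

Swapping the cache for an API client would then only require another subclass of the interface; the CustomerService itself would stay unchanged.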

Either way all your logic of creating the new customer would be in one place, the CustomerService domain service. Additional checks that go beyond the Customer aggregate boundaries would also be placed in this CustomerService.

Update

To also adhere to the nature of eventual consistency:

I guess that, as you go with event sourcing, you and your business have already accepted its eventually consistent nature. This also means that entries with the same SSN could occur. I think you could have a background job which continually checks for duplicate entries; depending on the complexity of your business logic, you might either be able to correct the duplicates automatically or need human intervention to do it. It really depends on how often this could actually happen.
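The detection part of such a background job can be quite small. The following Python sketch assumes the job can read (customer id, SSN) pairs from some read model; the function name find_duplicate_ssns and the input shape are hypothetical, and what happens with the result (automatic correction or escalation to a person) is left open.

```python
from collections import defaultdict


def find_duplicate_ssns(customers):
    """Return {ssn: [customer_id, ...]} for every SSN used more than once.

    `customers` is an iterable of (customer_id, ssn) pairs, e.g. rows
    read from a projection of the customer events.
    """
    ids_by_ssn = defaultdict(list)
    for customer_id, ssn in customers:
        ids_by_ssn[ssn].append(customer_id)
    # Keep only the SSNs that are claimed by more than one customer.
    return {ssn: ids for ssn, ids in ids_by_ssn.items() if len(ids) > 1}
```

A scheduler could call this periodically and hand every non-empty result to whatever resolution process the business decides on.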

If a hard constraint is that this must NEVER happen maybe event sourcing is not the right way, at least for this part of your system...

Note: I also assume that command de-duplication is not the issue here but that you really have to deal with potentially different commands using the same SSN.

Upvotes: 0

VoiceOfUnreason

Reputation: 57279

The search term you are looking for is set-validation.

Relational databases are really good at domain agnostic set validation, if you can fit the entire set into a single database.

But, that comes with a cost; designing your model that way restricts your options on what sorts of data storage you can use as your book of record, and it splits your "domain logic" into two different pieces.
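To make the database-enforced variant concrete, here is a small sketch using Python's built-in sqlite3 module with a UNIQUE constraint on the SSN column. The table layout and the register function are assumptions for the example. It also shows the split mentioned above: the uniqueness rule lives in the schema rather than in the domain code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " ssn TEXT NOT NULL UNIQUE)"  # the database enforces set-wide uniqueness
)


def register(name, ssn):
    """Insert a customer; return False if the SSN is already taken."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO customers (name, ssn) VALUES (?, ?)", (name, ssn)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the check and the insert are a single statement, two concurrent writers cannot both succeed with the same SSN, which is exactly what a separate query-then-insert in application code cannot guarantee on its own.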

Another common choice is to ignore the conflicts when you are running your domain logic (after all, what is the business value of this constraint?) but to instead monitor the persisted data looking for potential conflicts and escalate to a human being if there seems to be a problem.

You can combine the two (ex: check for possible duplicates via query when running the domain logic, and monitor the results later to mitigate against data races).

But if you need to maintain an invariant over a set, and you need that to be part of your write model (rather than separated out into your persistence layer), then you need to lock the entire set when making changes.

That could mean having a "registry of SSN assignments" that is an aggregate unto itself, and you have to start thinking about how much other customer data needs to be part of this aggregate, vs how much lives in a different aggregate accessible via a common identifier, with all of the possible complications that arise when your data set is controlled via different locks.
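A registry aggregate of this kind might be sketched as follows in Python. All names (SsnRegistry, RegistryStore, ConcurrencyError, SsnAlreadyAssigned) are hypothetical, and the "lock" is implemented as an optimistic version check on save, which is one common choice rather than the only one.

```python
class SsnAlreadyAssigned(Exception):
    """The registry already holds an assignment for this SSN."""


class ConcurrencyError(Exception):
    """The registry changed since it was loaded; reload and retry."""


class SsnRegistry:
    """Aggregate owning the whole set of SSN -> customer id assignments."""

    def __init__(self, version=0, assignments=None):
        self.version = version
        self.assignments = dict(assignments or {})

    def assign(self, ssn, customer_id):
        # The invariant over the set is checked inside the aggregate.
        if ssn in self.assignments:
            raise SsnAlreadyAssigned(ssn)
        self.assignments[ssn] = customer_id


class RegistryStore:
    """In-memory store with an optimistic lock over the entire registry."""

    def __init__(self):
        self._version = 0
        self._assignments = {}

    def load(self):
        return SsnRegistry(self._version, self._assignments)

    def save(self, registry):
        # Reject writes that were based on a stale copy of the set.
        if registry.version != self._version:
            raise ConcurrencyError("registry was modified concurrently")
        self._assignments = dict(registry.assignments)
        self._version += 1
```

The sketch keeps only the customer id in the registry; the rest of the customer data would live in a separate aggregate reachable through that id, which is where the complications around multiple locks come in.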

There's no rule that says all of the customer data needs to belong to a single "aggregate"; see Mauro Servienti's talk All Our Aggregates are Wrong. Trade offs abound.

One thing you want to be very cautious about in your modeling is the risk of confusing data entry validation with domain logic. Unless you are writing domain models for the Social Security Administration, SSN assignments are not under your control. What your model has is a cached copy, and in this case potentially a corrupted copy.

Consider, for example, a data set that claims:

000-00-0000 is assigned to Alice
000-00-0000 is assigned to Bob

Clearly there's a conflict: both of those claims can't be true if the Social Security Administration is maintaining unique assignments. But all else being equal, you can't tell which of these claims is correct. In particular, the suggestion that "the claim you happened to write down first must be the correct one" doesn't have a lot of logical support.

In cases like these, it often makes sense to hold off on an automated judgment, and instead kick the problem to a human being to deal with.

Although they are mechanically similar in a lot of ways, there are important differences between "the set of our identifier assignments should have no conflicts" and "the set of known third party identifier assignments should have no conflicts".

Upvotes: 1
