twobiers

Reputation: 1298

How to deal with concurrent events in an event-driven architecture

Suppose I have an eCommerce application designed in an event-driven architecture. I would publish events like ProductCreated and ProductPriceUpdated. Typically both events are published on separate channels.

Now a consumer of those events comes into play and reacts to them, for example to generate a price chart for specific products.

In fact, this consumer is required to consume the ProductCreated event first in order to create a Product entity with the necessary information in its own bounded context. Only once a product has been created can price points be added to the chart. Depending on the consumer's performance, those events can easily arrive "out of order".

What are the possible strategies to fulfill this requirement?

The following came to my mind:

  1. Publish both events onto the same channel with ordering guarantees. For example, in Kafka both events would be published to the same partition. However, this would mean that a single topic/partition grows with all of these events, I would have to deal with different schemas, and the documentation would grow.
  2. Use documents instead of events. Simply publish every state change of the product entity as a single ProductUpdated event or similar. This way I would lose the semantics of the message and need to figure out on the consumer side what exactly changed (a rough sketch follows after this list).
  3. Defer event consumption. If my consumer consumes a ProductPriceUpdated event and no such product has been created yet, I postpone the consumption by storing the event in a database and coming back to it at a later point, or use retry topics in Kafka terms.
  4. Create a minimal entity. Once I receive a ProductPriceUpdated event I would probably have a correlation id or something similar to identify the entity, so I simply create an entity with just this id and fill in the missing information once the ProductCreated event arrives.
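
To make option 2 a bit more concrete, here is a rough sketch (the names are purely illustrative) of what a full-state ProductUpdated document could look like compared to the fine-grained events:

    // Fine-grained events carry only the delta plus its meaning:
    record ProductCreated(String productId, String name) {}
    record ProductPriceUpdated(String productId, double newPrice) {}

    // "Document" style: every change publishes the full current state, so the
    // consumer does not depend on ordering between different event types,
    // but it has to diff consecutive versions to learn what actually changed.
    record ProductUpdated(String productId, String name, double price, long version) {}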

Upvotes: 0

Views: 1100

Answers (1)

ChristDist

Reputation: 768

Just thought of giving you some inline comments based on my understanding of your requirements (#1, #3 and #4).

  1. Publish both events onto the same channel with ordering guarantees. For example, in Kafka both events would be published to the same partition. However, this would mean that a single topic/partition grows with all of these events, I would have to deal with different schemas, and the documentation would grow.

[Chris]: Apache Kafka preserves the order of messages within a partition. However, the mapping of keys to partitions is consistent only as long as the number of partitions in a topic does not change. So as long as the number of partitions is constant, you can be sure the order is guaranteed. When partitioning by key is important, the easiest solution is to create topics with sufficient partitions and never add partitions.
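
For illustration, a minimal producer sketch that keys both event types by the product id so they land in the same partition and keep their relative order (the topic name product-events and the JSON payloads are made up for this example):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProductEventPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String productId = "product-42";
                // Same key => same partition => ordering is preserved across both event types.
                producer.send(new ProducerRecord<>("product-events", productId,
                        "{\"type\":\"ProductCreated\",\"id\":\"product-42\",\"name\":\"Keyboard\"}"));
                producer.send(new ProducerRecord<>("product-events", productId,
                        "{\"type\":\"ProductPriceUpdated\",\"id\":\"product-42\",\"price\":49.99}"));
            }
        }
    }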

  3. Defer event consumption. If my consumer consumes a ProductPriceUpdated event and no such product has been created yet, I postpone the consumption by storing the event in a database and coming back to it at a later point, or use retry topics in Kafka terms.

[Chris]: If latency is not a concern, and if you are okay with the additional operational overhead of adding a new component to your solution, such as a storage layer, this pattern looks fine.
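
A minimal consumer-side sketch of this deferral (an in-memory queue is used here purely for illustration; in practice the parked events would live in a database table or a Kafka retry topic):

    import java.util.ArrayDeque;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Queue;

    // Price events for unknown products are parked and replayed once the
    // corresponding ProductCreated event has arrived.
    public class DeferredPriceConsumer {
        private final Map<String, String> products = new HashMap<>(); // productId -> product data
        private final Queue<String[]> parked = new ArrayDeque<>();    // deferred [productId, price] pairs

        public void onProductCreated(String productId, String productData) {
            products.put(productId, productData);
            // Replay any price updates that arrived before the product existed.
            parked.removeIf(event -> {
                if (event[0].equals(productId)) {
                    addPricePoint(event[0], event[1]);
                    return true;
                }
                return false;
            });
        }

        public void onPriceUpdated(String productId, String price) {
            if (!products.containsKey(productId)) {
                parked.add(new String[] {productId, price}); // defer: product not created yet
                return;
            }
            addPricePoint(productId, price);
        }

        private void addPricePoint(String productId, String price) {
            System.out.println("Adding price point " + price + " for product " + productId);
        }
    }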

  4. Create a minimal entity. Once I receive a ProductPriceUpdated event I would probably have a correlation id or something similar to identify the entity, so I simply create an entity with just this id and fill in the missing information once the ProductCreated event arrives.

[Chris]: This is kind of a usual integration pattern (Messaging Layer -> Backend REST API) we adopt; it works over a unique identifier, in this case a correlation id.
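
A minimal sketch of such a placeholder entity keyed by the correlation id (class and field names are only illustrative):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Creates a bare placeholder entity keyed by the product id (the correlation id)
    // and enriches it once the ProductCreated event arrives.
    public class PlaceholderProductConsumer {

        static class Product {
            final String id;
            String name;          // unknown until ProductCreated arrives
            Double latestPrice;

            Product(String id) { this.id = id; }
        }

        private final Map<String, Product> store = new ConcurrentHashMap<>();

        public void onPriceUpdated(String productId, double price) {
            // Create a minimal entity with just the id if it does not exist yet.
            store.computeIfAbsent(productId, Product::new).latestPrice = price;
        }

        public void onProductCreated(String productId, String name) {
            // Fill in the missing information on the (possibly pre-existing) placeholder.
            store.computeIfAbsent(productId, Product::new).name = name;
        }
    }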

This can easily be achieved if you have separate topics and consumers per event and the order of messages from the producer is guaranteed. Thus, option #1 becomes obsolete.

From my perspective, options #3 and #4 look one and the same, and #4 would be ideal.

On another note, if you are thinking of bringing Kafka Streams/KTable into your solution, just go for it, as there is a strong relationship between streams and tables called the stream-table duality.

The duality of streams and tables allows your application to support elastic, fault-tolerant stateful processing and to run interactive queries. KSQL adds even more flavour to it, because this use case is essentially data enrichment at the integration layer.
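
For illustration, a minimal Kafka Streams sketch of that enrichment, assuming both topics are keyed by the product id and carry String-serialized payloads (the topic names are made up):

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;

    public class PriceEnrichmentApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "price-chart-enricher");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();

            // Latest known state per product id, built from ProductCreated events.
            KTable<String, String> products = builder.table("product-created");

            // Stream of price updates, keyed by product id.
            KStream<String, String> priceUpdates = builder.stream("product-price-updated");

            // Enrich each price update with the product it belongs to (data enrichment).
            priceUpdates.join(products, (price, product) -> product + " -> " + price)
                        .to("product-price-chart");

            new KafkaStreams(builder.build(), props).start();
        }
    }

Note that a plain inner KStream-KTable join drops price updates for products that are not in the table yet, so the out-of-order problem still needs one of the strategies above on top of it.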

Upvotes: 1
