Reputation: 29
I've been reading up on event sourcing: keeping event logs and recreating the application state when the server starts up.
What happens when you have to horizontally scale your system? Won't there need to be a synchronisation service between all the servers to keep the data sane, since an event processed by one server should update the application state across the multiple instances?
How do we achieve consistency, or am I missing some concept here? Or can these systems only offer eventual consistency?
Can I build systems that require a high level of consistency, like a double-entry ledger application, using event sourcing?
Upvotes: 0
Views: 1210
Reputation: 7842
There are various design patterns: the Saga pattern, the CQRS pattern, Database per service, the Event Sourcing pattern, and Shared DB per service. You may need to choose one of these, or a combination, depending on your system architecture, use case, software components, and the requirements at hand.
In general, the Audit Logging microservice pattern is worth checking. With Event Sourcing, all modifications to application state (transactions) are stored as a sequence of events, with an event object recording each state change. The pattern lets you see both the current state of the application and the context for how it arrived at that state, and it thereby enables reconstruction of the application state at any point in time.
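To make the idea concrete, here's a minimal sketch of replay-based state reconstruction in Java; all the type names (Account, AccountOpened, and so on) are illustrative, chosen to echo the ledger example in the question rather than any particular framework:

```java
import java.util.List;

// Illustrative event types; in a real system these would live
// in an append-only event store.
sealed interface Event permits AccountOpened, MoneyDeposited, MoneyWithdrawn {}
record AccountOpened(String accountId) implements Event {}
record MoneyDeposited(long cents) implements Event {}
record MoneyWithdrawn(long cents) implements Event {}

class Account {
    private String id;
    private long balanceCents;

    // Current state is never stored directly; it is derived by
    // replaying the full event history.
    static Account replay(List<Event> history) {
        Account account = new Account();
        history.forEach(account::apply);
        return account;
    }

    private void apply(Event event) {
        if (event instanceof AccountOpened opened) id = opened.accountId();
        else if (event instanceof MoneyDeposited dep) balanceCents += dep.cents();
        else if (event instanceof MoneyWithdrawn wd) balanceCents -= wd.cents();
    }

    long balanceCents() { return balanceCents; }
}
```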
Depending on your use case, you may need the Saga pattern to maintain consistency across microservices. When a transaction spans multiple services, a saga publishes an event or message to trigger the next transaction step, so a sequence of local transactions updates each service in turn. A saga also acts as a fault handler: it carries compensation logic that sends rollback events when a step fails. A saga can be run with an orchestrator, where a separate service owns the business logic and sequences the local transactions by communicating with the other services; or via choreography, where there is no central coordination and each service listens for transaction updates from the other services, triggers its own local transaction, and notifies the others accordingly.
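Here's a bare-bones sketch of the orchestration variant, assuming each local transaction can be paired with a compensating action; the SagaStep and SagaOrchestrator names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Each local transaction in the saga pairs an action with its compensation.
interface SagaStep {
    void execute();     // e.g. send a command/event to the owning service
    void compensate();  // rollback logic used if a later step fails
}

class SagaOrchestrator {
    private final List<SagaStep> steps;

    SagaOrchestrator(List<SagaStep> steps) { this.steps = steps; }

    void run() {
        Deque<SagaStep> completed = new ArrayDeque<>();
        try {
            for (SagaStep step : steps) {
                step.execute();
                completed.push(step);
            }
        } catch (RuntimeException failure) {
            // A step failed: compensate the completed steps in reverse order.
            while (!completed.isEmpty()) completed.pop().compensate();
            throw failure;
        }
    }
}
```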
This in turn becomes more flexible, scalable, and easier to maintain when coupled with the CQRS pattern, which decouples read workloads from write workloads: data is stored as a series of events instead of as direct updates to data stores.
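Continuing the sketch above, the read side of CQRS is just a projection folded from the same events, kept entirely separate from the write model (again, all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Read-side projection (the CQRS "query" model). It subscribes to the
// event stream and maintains a view optimized for queries; it never
// writes events itself.
class BalanceView {
    private final Map<String, Long> balancesByAccount = new HashMap<>();

    // Invoked for each event delivered by the stream/message bus.
    void project(String accountId, Event event) {
        if (event instanceof MoneyDeposited dep)
            balancesByAccount.merge(accountId, dep.cents(), Long::sum);
        else if (event instanceof MoneyWithdrawn wd)
            balancesByAccount.merge(accountId, -wd.cents(), Long::sum);
    }

    // Reads never touch the write model or the event store.
    Long balanceOf(String accountId) { return balancesByAccount.get(accountId); }
}
```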
AWS supports this pattern via Amazon Kinesis Data Streams or Amazon EventBridge. In a Kinesis Data Streams based solution, the Kinesis data stream acts as the core of the event store: it captures application changes as events and persists them on Amazon S3.
In an EventBridge-based solution, EventBridge is used as the event store. It provides a default event bus for events published by AWS services, and you can also create custom event buses for domain-specific events.
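For illustration, publishing a domain event to a custom EventBridge bus with the AWS SDK for Java v2 looks roughly like this; the bus name, source, and payload are assumptions:

```java
import software.amazon.awssdk.services.eventbridge.EventBridgeClient;
import software.amazon.awssdk.services.eventbridge.model.PutEventsRequest;
import software.amazon.awssdk.services.eventbridge.model.PutEventsRequestEntry;

public class PublishExample {
    public static void main(String[] args) {
        try (EventBridgeClient client = EventBridgeClient.create()) {
            PutEventsRequestEntry entry = PutEventsRequestEntry.builder()
                    .eventBusName("orders-bus")            // hypothetical custom bus
                    .source("com.example.orders")          // hypothetical event source
                    .detailType("OrderPlaced")
                    .detail("{\"orderId\":\"42\",\"amountCents\":1000}")
                    .build();
            // The event is appended to the bus and fans out to matching rules.
            client.putEvents(PutEventsRequest.builder().entries(entry).build());
        }
    }
}
```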
In the case of GCP, Pub/Sub is used as the ledger: an append-only record of events published on an event bus or a message queue of some kind. An eventing system can then trigger Cloud Functions of interest (archiver, replayer, janitor, bqloader, or others) every time a record is written to the stream, and these Cloud Functions scale with the volume of requests without any intervention.
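A minimal Java sketch of one such function, using the Cloud Functions background-function API; the payload handling and the archiving behaviour are assumptions about what an archiver would do:

```java
import com.google.cloud.functions.BackgroundFunction;
import com.google.cloud.functions.Context;
import java.util.Base64;

// Triggered for every record written to the Pub/Sub "ledger" topic.
public class Archiver implements BackgroundFunction<Archiver.PubSubMessage> {

    // Pub/Sub delivers the event payload base64-encoded in `data`.
    public static class PubSubMessage {
        public String data;
    }

    @Override
    public void accept(PubSubMessage message, Context context) {
        String event = new String(Base64.getDecoder().decode(message.data));
        // Here the "archiver" role would append the event to long-term storage.
        System.out.println("archiving event: " + event);
    }
}
```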
Kafka supports the event sourcing pattern with topics and partitions. There are multiple approaches to how events can be stored:

- All entity types in one topic: store all events for all entity types in a single (partitioned) topic.
- "Topic-per-entity-type": create a separate topic for all events related to a particular module/entity type.
- "Topic-per-entity": create a separate topic for each individual product/user.

Kafka Streams applications run across a cluster of nodes that jointly consume some topics. Kafka Streams supports folding a stream into a local store, implemented either as a persistent storage engine (RocksDB, used by default) or as an in-memory store. Accordingly, you can fold the stream of events into the state store, keeping locally the "current state" of each entity from the partitions assigned to the node. If the RocksDB state store reaches its disk-space limit because of the number of entities, you can mitigate this by further partitioning the source topic. A Kafka Connect connector can handle all of the topic-consuming details and store the events (or data derived from them) in Elasticsearch or PostgreSQL. A Kafka Streams stage can also handle scenarios where one event triggers other events in your system. Note that Kafka was not originally designed for event sourcing, but its design as a data streaming engine with replicated topics, partitioning, state stores, and streaming APIs is very flexible and lends itself well to the task.
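A sketch of that fold with the Kafka Streams DSL, assuming a topic named account-events keyed by entity id whose values are signed amounts (the topic name, serdes, and store name are all assumptions):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class EntityStateFold {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "entity-state-folder");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // Fold each entity's events into its "current state" (here just a
        // running balance), kept in a local store (RocksDB by default) for
        // the partitions assigned to this node.
        builder.stream("account-events", Consumed.with(Serdes.String(), Serdes.Long()))
               .groupByKey()
               .aggregate(
                       () -> 0L,                                   // initial state per entity
                       (entityId, amount, balance) -> balance + amount,
                       Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("entity-state")
                               .withKeySerde(Serdes.String())
                               .withValueSerde(Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```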
Upvotes: 1
Reputation: 10215
I'm adding this as an answer because it's clearer than trying to squeeze it into a comment. Levi sounds like he has experience in this (event sourcing) - I don't.
The short answer: look for design patterns that solve your problem. (I'd tell you what they were if I knew them :) But you can bet that you're not the first person to have this challenge.
Another thing you can do, if you have selected a technology stack, is to see what specific tools and design patterns it offers. E.g. I have a Kafka book at work which talks about this sort of stuff, but in a Kafka-specific way; I'd bet any amount of money that AWS has similar guidance, etc.
Anyway, here's some stuff that might help:
Upvotes: 0
Reputation: 20561
For strong consistency in a horizontally scalable event-sourced system, you will generally shard (partition) the entities whose state is being event-sourced among the instances of the service. Requests that need to be strongly consistent within an entity (or an aggregate, if we're talking domain-driven design) are then routed to the correct instance; there's typically also a coordination component responsible for relocating shards in response to instance failures.
If you do this, it's useful to define which requests are strongly consistent (i.e. need to be routed based on the sharding) and which can be eventually consistent (can be handled by a view over the event stream), since the latter don't require any synchronization.
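As a toy illustration of the routing idea (the fixed shard count, static instance list, and absence of failure handling or shard rebalancing are simplifications a real system must address):

```java
import java.util.List;

// Routes each entity's commands to the single instance that owns its shard,
// so all strongly-consistent operations on that entity are serialized there.
class ShardRouter {
    private final int numShards;
    private final List<String> instances; // e.g. "host:port" per service instance

    ShardRouter(int numShards, List<String> instances) {
        this.numShards = numShards;
        this.instances = instances;
    }

    String instanceFor(String entityId) {
        // Stable hash: the same entity always maps to the same shard.
        int shard = Math.floorMod(entityId.hashCode(), numShards);
        return instances.get(shard % instances.size());
    }
}
```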
I've personally used Akka Persistence to define event-sourced actors that manage instances of aggregates in applications, and then used Akka Cluster Sharding to distribute those actors across a cluster and route requests, but there are other systems that implement similar functionality.
Upvotes: 1