Reputation: 43
Context
We have a distributed system. One of its subsystems emits events, and another subsystem reads them to generate reports.
Logical order is guaranteed: even though the emitter has N nodes, an underlying finite state machine makes it impossible for two events of the same aggregate to be emitted concurrently. Each event is marked with a timestamp, but the clocks of the N nodes may not always be in sync.
We care about the timestamp because the downstream system that generates reports almost always needs one: the "reporting people" rely on this kind of data to check that things are going the right way.
The problem
The fact that two nodes could have a small clock discrepancy has us concerned. Consider the following example.
The logical order of the events is this:
Event 1 => Event 2 => Event 3
But in the database we could have this situation:
---------------------------------------
| Name    | Timestamp | Logical Order |
---------------------------------------
| Event 1 | 2         | 1             |
| Event 2 | 1         | 2             |
| Event 3 | 3         | 3             |
---------------------------------------
As you can see, Event 2 logically happened after Event 1, but its timestamp is earlier because the clocks were not in sync.
This is not going to happen every 2 seconds, but it can happen because the timestamps come from different nodes. From a reporting point of view, this is an anomaly.
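To make the anomaly concrete, here is a minimal Python sketch that walks the events in logical order and flags every adjacent pair whose timestamps go backwards. The `Event` type and the function name `find_timestamp_inversions` are hypothetical names used only for illustration; the data is the table above.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    # Hypothetical record shape mirroring the table columns.
    name: str
    timestamp: int
    logical_order: int


def find_timestamp_inversions(events: List[Event]) -> List[Tuple[Event, Event]]:
    """Return adjacent pairs (in logical order) whose timestamps decrease."""
    ordered = sorted(events, key=lambda e: e.logical_order)
    return [
        (prev, cur)
        for prev, cur in zip(ordered, ordered[1:])
        if cur.timestamp < prev.timestamp
    ]


events = [
    Event("Event 1", timestamp=2, logical_order=1),
    Event("Event 2", timestamp=1, logical_order=2),
    Event("Event 3", timestamp=3, logical_order=3),
]

inversions = find_timestamp_inversions(events)
for prev, cur in inversions:
    print(f"{cur.name} (ts={cur.timestamp}) follows {prev.name} (ts={prev.timestamp})")
```

A check like this only detects the problem; it does not decide which of the two orderings the report should trust.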
Possible solutions
Do you have any experience with this topic?
Upvotes: 4
Views: 1241
Reputation: 1274
Based on your question, I assume the timestamp is generated before the event is read by the finite state machine. I'd suggest sorting your events by timestamp instead of using the logical order. When working on distributed systems, it's recommended to have one, and only one, way to sort events.
With regard to distributed, sequential ID generation, I recommend taking a look at this answer and at Snowflake, which is mentioned in that answer. The latter provides a distributed service that you can use as a centralized marker generator. The IDs generated by Snowflake are composed of a timestamp, a worker number and a sequence number.
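As a rough illustration of how such a composite ID can be packed into a single integer, here is a minimal Python sketch. The bit widths (41 for the timestamp, 10 for the worker, 12 for the sequence) follow the commonly cited Snowflake layout, but treat them as an assumption; the function names `compose_id` and `decompose_id` are hypothetical and are not Snowflake's API.

```python
# Assumed field widths, following the commonly described Snowflake layout.
TIMESTAMP_BITS = 41
WORKER_BITS = 10
SEQUENCE_BITS = 12


def compose_id(timestamp_ms: int, worker: int, sequence: int) -> int:
    """Pack the three fields into one integer, timestamp in the high bits."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp out of range")
    if not 0 <= worker < (1 << WORKER_BITS):
        raise ValueError("worker out of range")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence out of range")
    return (
        (timestamp_ms << (WORKER_BITS + SEQUENCE_BITS))
        | (worker << SEQUENCE_BITS)
        | sequence
    )


def decompose_id(value: int) -> tuple:
    """Split an ID back into (timestamp_ms, worker, sequence)."""
    sequence = value & ((1 << SEQUENCE_BITS) - 1)
    worker = (value >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = value >> (WORKER_BITS + SEQUENCE_BITS)
    return timestamp_ms, worker, sequence


example_id = compose_id(timestamp_ms=1, worker=2, sequence=3)
print(example_id, decompose_id(example_id))
```

Because the timestamp occupies the high bits, sorting by ID orders events by timestamp first, then by worker, then by sequence. That gives a single sort key, but the ordering is only as good as the clock behind the timestamp component.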
TL;DR
If the timestamp is reliable enough to guarantee event order, I'd suggest using it instead of the logical order, which I'm assuming is generated after the timestamp.
Hope this helps
Upvotes: 1
Reputation: 61
If you can ensure a causal relationship and have a partial order, I don't see many problems in presenting a "useful business representation" with a modified timestamp. I think the underlying distributed architecture is out of context for the business domain.
The business people probably understand the system as a whole, and forcing a shift in their mental model may cause some friction.
On the other hand, I would not normalize the timestamp in the log itself; you can use the raw values to track clock drift between subsystems.
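One possible way to do this, sketched in Python under assumptions of my own: keep the raw timestamp untouched, and derive a separate reporting timestamp that never goes backwards along the logical order. Clamping to the previous value is just one choice of "modified timestamp", and the field names `report_timestamp` and `skew` are hypothetical.

```python
from typing import Dict, List


def with_report_timestamps(events: List[Dict]) -> List[Dict]:
    """Add a non-decreasing report timestamp; the raw timestamp is kept as-is."""
    ordered = sorted(events, key=lambda e: e["logical_order"])
    result = []
    high_water = None  # largest report timestamp emitted so far
    for event in ordered:
        raw = event["timestamp"]
        # Clamp to the previous report timestamp if the raw clock went backwards.
        shown = raw if high_water is None or raw >= high_water else high_water
        high_water = shown
        result.append({**event, "report_timestamp": shown, "skew": shown - raw})
    return result


events = [
    {"name": "Event 1", "timestamp": 2, "logical_order": 1},
    {"name": "Event 2", "timestamp": 1, "logical_order": 2},
    {"name": "Event 3", "timestamp": 3, "logical_order": 3},
]

report_rows = with_report_timestamps(events)
for row in report_rows:
    print(row["name"], row["timestamp"], row["report_timestamp"], row["skew"])
```

The `skew` column is nonzero only for events whose clock lagged behind an earlier event, which gives a crude signal for tracking drift between nodes while reports see a consistent order.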
Upvotes: 2