Reputation: 1003
I am a newbie in Distributed systems and I am trying to get an insight on the concept of CRDT. I realize that it has three notations :
Conflict-free Replicated Data Type
Convergent Replicated Data Type
Commutative Replicated Data Type
Can anyone give an example where we use CRDT in distributed systems? Thanks a lot in advance.
Upvotes: 28
Views: 10243
Reputation: 4464
CRDTs use math to enforce consistency across a distributed cluster, without having to worry about consensus and associated latency/unavailability.
The set of values that a CRDT can take at any time come under the category of a semi-lattice (specifically a join semi-lattice), which is a POSET (partially-ordered set) with a least upper bound (LUB) function.
In simple terms, a POSET is a collection of items in which not all are comparable. For example, in an array of pairs: {(2,4), (4, 5), (2, 1), (6, 3)}
, (2,4)
is < (4,5)
, but can't be compared with (6,3)
(since one element is larger and the other smaller). Now, a semi-lattice is a POSET in which given 2 pairs, even if you can't compare the two, you can find an element greater than both (LUB).
Another condition is that updates to this datatype need to be increasing, CRDTs have monotonically increasing state, where clients never observe state rollback.
This excellent article uses the array I used above as an example. For a CRDT maintaining those values, if 2 replicas are trying to achieve consensus between (4,5)
and (6,3)
, they can pick a LUB = (6,5)
as consensus and assign both replicas to it. Since the values
are increasing, this is a good value to settle on.
There are 2 ways for CRDTs to keep in sync with each other across replicas, they can transfer state across periodically (convergent replicated data type), or they can transfer updates (deltas) across as they get them (commutative replicated data type). The former takes more bandwidth.
SoundCloud's Roshi is a good example (though no-longer in development it seems). They store data associated with a timestamp, where the timestamp is obviously incrementing. Any updates coming in with a timestamp lesser or equal than the one stored is discarded, which ensures idempotency (repeated writes are OK) and commutativity (out of order writes are ok). Commutativity is a=b means b=a
, which in this case means update1
followed by update2
is same as update2
followed by update1
.
Writes are sent to all clusters, and if certain nodes fail to respond due to an issue like slowness or partition, they're expected to catch up later via a read-repair
, which ensures that the values converge. The convergence can be achieved via 2 protocols as I mentioned above, propagate state or updates to other replicas. I believe Roshi does the former. As part of the read-repair
, replicas exchange state, and because data adheres to the semi-lattice property, they converge.
PS. Systems using CRDTs are eventually consistent, i.e they adopt AP (highly available and partition-tolerant) in the CAP theorem.
Another excellent read on the subject.
Upvotes: 16
Reputation: 554
CRDTs are inspired by the work of Marc Shapiro. In distributed computing, a conflict-free replicated data type (abbreviated CRDT) is a type of specially-designed data structure used to achieve strong eventual consistency (SEC) and monotonicity (absence of rollbacks). There are two alternative routes to ensuring SEC: operation-based CRDTs and state-based CRDTs.
CRDTs on different replicas can diverge from one another but at the end they can be safely merged providing an eventually consistent value. In other words, CRDTs have a merge method that is idempotent, commutative and associative.
The two alternatives are equivalent, as one can emulate the other, but operation-based CRDTs require additional guarantees from the communication middleware. CRDTs are used to replicate data across multiple computers in a network, executing updates without the need for remote synchronization. This would lead to merge conflicts in systems using conventional eventual consistency technology, but CRDTs are designed such that conflicts are mathematically impossible. Under the constraints of the CAP theorem they provide the strongest consistency guarantees for available/partition-tolerant (AP) settings.
Some examples where they are used
Riak is the most popular open source library of CRDT's and is used by Bet365 and League of Legends. Below are some useful links that supports Riak.
1- Bet365 (Uses Erlang and Riak) http://www.erlang-factory.com/static/upload/media/1434558446558020erlanguserconference2015bet365michaelowen.pdf
2- League of Legends uses the Riak CRDT implementation for its in-game chat system (which handles 7.5 million concurrent users and 11,000 messages per second)
3- Roshi implemented by SoundCloud that supports a LWW time-stamped Set: -Blog post: https://developers.soundcloud.com/blog/roshi-a-crdt-system-for-timestamped-events
Upvotes: 39
Reputation: 2606
Those three expansions of the acronym all mean basically the same thing.
A CRDT is convergent if the same operations applied in a different sequence produces (converges to) the same result. That is, the operations can be commutated - it's a commutative RDT. The reason that the operations can be applied in a different sequence and still get the same result is that the operations are conflict-free.
So CRDT means the same thing, whichever of the three expansions you use - though personally I prefer "Convergent".
Upvotes: 3