Cassandra write semantics

Question

In Cassandra's architecture, when we perform a write operation, data is first written to the commit log, then to the memtable, and when the memtable reaches its threshold, the data is flushed to an SSTable.

So at any given time we have 2 copies of the data on a given node: one copy is in the commit log and another copy is either in the memtable or flushed to an SSTable.

So why do we need to have 2 copies? Isn't the commit log enough for recovery purposes? Or do they serve totally different purposes? And how do these 3 differ from each other?

bereal · Accepted Answer

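When you write, Cassandra saves the data to both the commit log and the Memtable. Appending to the commit log is a sequential disk write and updating the Memtable happens in memory, which is what makes the operation very fast. If the node restarts before the data is flushed to a persistent SSTable, the data in memory is lost, but it can be recovered by replaying the commit log.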

So they serve different purposes: Cassandra uses Memtables and SSTables for lookups, while the commit log is only there so that a node can be restarted at any moment without losing data.
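
To make the division of labour concrete, here is a toy model of a single node's write path in Python. It is only a sketch, not Cassandra's actual implementation: the class `ToyNode`, its method names, and the `flush_threshold` parameter are all made up for illustration, and plain lists and dicts stand in for the on-disk files.

```python
class ToyNode:
    """Toy model of one node's write path (hypothetical, not Cassandra code)."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # assumed: flush after N keys
        self.commit_log = []  # stands in for the append-only log on disk
        self.memtable = {}    # in memory only, lost on a crash
        self.sstables = []    # stand-ins for immutable on-disk tables, newest last

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. append to the log
        self.memtable[key] = value            # 2. update the in-memory table
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as a sorted, immutable table.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed data is now safe elsewhere, so its log entries can go.
        self.commit_log = []

    def read(self, key):
        # Lookups never touch the commit log.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest table wins
            if key in sstable:
                return sstable[key]
        return None

    def crash_and_restart(self):
        self.memtable = {}  # everything in memory is gone
        for key, value in self.commit_log:  # replay the unflushed writes
            self.memtable[key] = value


node = ToyNode(flush_threshold=2)
node.write("a", 1)
node.write("b", 2)        # threshold reached, memtable flushed
node.write("c", 3)        # lives only in the commit log and memtable
node.crash_and_restart()  # "c" is rebuilt from the commit log
```

In this model the commit log is written in arrival order and is only ever read during replay, while the memtable and SSTables are organised by key for reads. That is why the log alone is not enough for lookups, and why the memtable alone is not enough for durability.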
