Corehacker

Reputation: 277

Concurrent writes to cassandra replicas - Is duplication possible?

I have a two-machine cluster running Cassandra 1.2.6, with a keyspace that has a replication factor of 2. My application requires me to write to both replicas in parallel, while still letting Cassandra do the replication, and I am hoping that Cassandra does not end up duplicating the key/value on the replica nodes.

For example:

Ideally this should store around 400MB on disk, with some overhead for storing keys, which should be marginal compared to the value sizes I am using.

Observations:

My question is: is this behavior (15% overhead) expected? Is there any configuration we need to tweak so that Cassandra properly handles concurrent writes to all the replicas?

Thanks!

Upvotes: 1

Views: 968

Answers (1)

Richard

Reputation: 11110

There are two possible causes of the 15% extra space that I can think of.

One is that a replica will sometimes temporarily store two copies of a column. If you write a column twice in Cassandra at slightly different times, the two copies may go into separate memtables and so end up in separate SSTables on disk. At some point later, when the SSTables get merged through the compaction process, the older value is discarded, freeing up the space. In your test you could run nodetool compact to force a compaction and see if the space usage goes down (sketched together with the repair check after the next paragraph).

Another possible cause depends on how you did the test when you didn't write to both nodes. If you wrote at consistency level ONE, it is possible that some of the writes were dropped by the other replica, so it doesn't have all the keys yet. You can make sure it does by running nodetool repair. So the space used in your first observation may not account for all the keys.
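To make the test conclusive you could run both checks together. Here is a minimal sketch, assuming a local node, nodetool on the PATH, the default data directory, and made-up keyspace/column family names (substitute your own):

    import subprocess

    # Hypothetical names for illustration; substitute your own.
    KEYSPACE, CF = "myks", "mycf"
    # Default Cassandra data directory; adjust if your cassandra.yaml differs.
    DATA_DIR = "/var/lib/cassandra/data/" + KEYSPACE

    def disk_usage_bytes():
        # du -sb reports the total bytes under the keyspace's data directory.
        out = subprocess.check_output(["du", "-sb", DATA_DIR])
        return int(out.decode().split()[0])

    # First make sure this replica holds every key: repair streams over
    # any writes it missed from the other replica.
    subprocess.check_call(["nodetool", "repair", KEYSPACE])

    before = disk_usage_bytes()
    # Then force a major compaction so superseded column versions are
    # merged away and their space reclaimed.
    subprocess.check_call(["nodetool", "compact", KEYSPACE, CF])
    after = disk_usage_bytes()
    print("space before compaction: %d bytes, after: %d" % (before, after))

If the "after" figure is close to your expected 400MB, the overhead was just duplicate column versions waiting for compaction.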

You should be aware that writing to all replicas at consistency level ONE does not guarantee that each replica holds a copy. The node receiving the data does not have to store it to return success for the write, even if it is a replica. It may be overloaded (in your workload, most likely because there is not enough I/O to write the data out) and drop the write locally, while succeeding in writing it to a different replica. This would cause less space to be used in your second observation, but it probably isn't happening in your test since the amount of data is relatively small.

If you need to guarantee that you have two copies, you should write once at consistency level ALL, rather than writing to each replica yourself.
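For example, with the DataStax Python driver (a sketch only; your client may differ, and the addresses, keyspace, and table names here are made up):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Hypothetical contact points and keyspace; substitute your own.
    cluster = Cluster(["10.0.0.1", "10.0.0.2"])
    session = cluster.connect("myks")

    # A single write at ALL: the coordinator reports success only after
    # both replicas have acknowledged the write, so there is no need for
    # a second client-side write to the other node.
    insert = SimpleStatement(
        "INSERT INTO kv (key, value) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.ALL,
    )
    session.execute(insert, ("some-key", "some-value"))

    cluster.shutdown()

This also avoids the duplicate-version overhead above, since each replica only ever sees the column once.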

Upvotes: 2
