Ryan

Reputation: 10101

What is the common practice for handling Cassandra write failures?

The docs [1] say:

if using a write consistency level of QUORUM with a replication factor
of 3, Cassandra will send the write to 2 replicas. If the write fails on
one of the replicas but succeeds on the other, Cassandra will report a 
write failure to the client. 

So assume only 2 replicas receive the update and the write fails. Because of eventual consistency, all the nodes will still receive the update in the end.

So, should I retry? Or just leave it as is?

Is there a recommended strategy?

[1] http://www.datastax.com/docs/1.0/dml/about_writes

Upvotes: 3

Views: 2514

Answers (2)

Maciej Miklas

Reputation: 3331

Retrying will not change much. The problem is that you cannot know whether the data was persisted at all, because Cassandra always throws the same exception.

You have a few options:

  • Enable hints and retry the request with cl=any. A successful response means that at least a hint was created, so you know the data is stored but not yet readable.
  • Disable hints and retry with cl=one. A successful response means that at least one node received the data. If the retry fails, execute a delete.
  • Use Astyanax and its retry strategy.
  • Update to Cassandra 1.2 and use the write-ahead log.

Upvotes: 0

Richard

Reputation: 11100

Those docs aren't quite correct. Regardless of the consistency level (CL), writes are sent to all available replicas. If some replicas aren't available, Cassandra won't send a request to the down nodes. If there aren't enough replicas available from the outset to satisfy the CL, an UnavailableException is thrown and no write is attempted on any node.
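To make the arithmetic concrete, here is a minimal sketch of that up-front check. It is a toy model, not Cassandra's code: `Unavailable`, `required_acks`, and `plan_write` are hypothetical names, and only the ONE, QUORUM, and ALL levels are modelled. The one fact it relies on is that QUORUM means a majority of replicas, floor(RF / 2) + 1.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for Cassandra's UnavailableException."""


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements the consistency level demands."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def plan_write(consistency_level: str, replication_factor: int, live_replicas: int):
    """Return (targets, needed) for a write, or raise Unavailable.

    targets: replicas the write is sent to (every live replica, whatever the CL).
    needed:  acknowledgements required before the client sees success.
    """
    needed = required_acks(consistency_level, replication_factor)
    if live_replicas < needed:
        # Too few replicas are up, so nothing is written anywhere.
        raise Unavailable(f"need {needed} replicas, only {live_replicas} alive")
    return live_replicas, needed
```

With RF = 3 and QUORUM, `plan_write("QUORUM", 3, 3)` gives `(3, 2)`: all three replicas get the write and two must acknowledge it. With one replica down it gives `(2, 2)`, which is the situation the docs describe. With two down it raises `Unavailable`.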

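However, a write can still succeed on some nodes while an error is returned to the client. In the example from [1], if one replica is already down before the write is attempted, the docs' description is accurate: the write goes to the 2 remaining replicas, and a failure on either of them fails the QUORUM write.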

So assume only 2 replicas receive the update and the write fails. Because of eventual consistency, all the nodes will still receive the update in the end.

Be careful though: a failed write doesn't tell you how many nodes the write reached. It could be none, in which case the write will never propagate.

So, should I retry? Or just leave it as is?

In general you should retry, because the data may not have been written at all. You should only regard your write as written once you get a successful return from the write.
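As a rough illustration, a retry loop for an ordinary (non-counter) write could look like the sketch below. It is driver-agnostic: `WriteFailed` and `write_with_retry` are hypothetical names, `do_write` stands for whatever call your client library uses to perform the write, and the exponential backoff is my own assumption rather than anything Cassandra requires.

```python
import time


class WriteFailed(Exception):
    """Hypothetical stand-in for a driver's timeout or unavailable error."""


def write_with_retry(do_write, attempts=3, base_delay=0.1):
    """Call do_write() until it returns; re-raise the error if every attempt fails.

    Only a normal return counts as "written". This is safe for writes that
    can be repeated without changing the result, which is not true of
    counter increments.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return do_write()
        except WriteFailed:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Back off before trying again: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** attempt)
```

If the last attempt also fails, the caller still does not know whether the data landed on some replica, so the failure has to be handled at the application level.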

If you're using counters, though, you should be careful with retries. Because you don't know whether the write was made, a retry can produce duplicate counts. For counters, you probably don't want to retry, since more often than not the write will have reached at least one node, at least at higher consistency levels.
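The difference is easy to see in a toy simulation. Everything here is hypothetical (`FakeReplica`, `AckLost`, `retry_once`); it models a single node that applies the first operation but loses the acknowledgement, so the client sees a failure and retries.

```python
class AckLost(Exception):
    """Hypothetical error: the write was applied, but the client saw a failure."""


class FakeReplica:
    """Toy in-memory node that loses the acknowledgement of its first operation."""

    def __init__(self):
        self.values = {}
        self.counters = {}
        self.lose_next_ack = True

    def _maybe_lose_ack(self):
        if self.lose_next_ack:
            self.lose_next_ack = False
            raise AckLost("applied, but the acknowledgement never arrived")

    def set_value(self, key, value):
        # Overwrite: applying it twice leaves the same state.
        self.values[key] = value
        self._maybe_lose_ack()

    def increment(self, key, delta=1):
        # Counter update: applying it twice changes the result.
        self.counters[key] = self.counters.get(key, 0) + delta
        self._maybe_lose_ack()


def retry_once(operation):
    """Run operation, and run it one more time if the ack was lost."""
    try:
        operation()
    except AckLost:
        operation()


replica = FakeReplica()
retry_once(lambda: replica.set_value("name", "ryan"))
print(replica.values)    # the retried overwrite is harmless

counter_replica = FakeReplica()
retry_once(lambda: counter_replica.increment("hits"))
print(counter_replica.counters)  # one logical increment was applied twice
```

The overwrite ends up with a single value, while the counter ends at 2 for one intended increment, which is the duplicate count described above.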

Upvotes: 3
