RavenMan

Reputation: 1933

Aerospike - Read (with consistency level ALL) when one replica is down

TL;DR
If a replica node goes down and the new partition map is not yet available, will a read with consistency level ALL fail?

Example:

Given this Aerospike cluster setup:
- 3 physical nodes: A, B, C
- Replicas = 2
- Read consistency level = ALL (reads consult both nodes holding the data)

And this sequence of events:
- A piece of data "DAT" is stored on two nodes, A and B.
- Node B goes down.
- Immediately after B goes down, a read request ("request 1") is performed with consistency ALL.
- After ~1 second, a new partition map is generated. The cluster is now aware that B is gone.
- "DAT" now becomes replicated at node C (to preserve replicas=2).
- Another read request ("request 2") is performed with consistency ALL.

It is reasonable to say "request 2" will succeed.

Will "request 1" succeed? Will it:
a) Succeed because two reads were attempted, even if one node was down?
b) Fail because one node was down, meaning only 1 copy of "DAT" was available?

Upvotes: 4

Views: 325

Answers (1)

kporter

Reputation: 2768

Both request 1 and request 2 will succeed. The behavior of the consistency level policies is described here: https://discuss.aerospike.com/t/understanding-consistency-level-overrides/711.

The gist for read/write consistency levels is that they only apply when there are multiple versions of a given partition within the cluster. If there is only one version of a given partition in the cluster then a read/write will only go to a single node regardless of the consistency level.

  1. Take an Aerospike cluster of A, B, C where A is master and B is replica for partition 1.
  2. Assume B fails and C is now replica for partition 1. Partition 1 receives a write and the partition key is changed.
  3. Now B is restarted and returns to the cluster. Partition 1 on B will now be different from A and C.
  4. A read with consistency ALL arrives at node A for a key in partition 1. There are now 2 versions of that partition in the cluster, so the record is read from nodes A and B and the latest version is returned (the read does not fail).

Time lapse

  1. Migrations are now complete: for partition 1, A is master, B is replica, and C no longer has the partition.
  2. A read with consistency ALL arrives at node A. Since there is only one version of partition 1, node A responds to the client without consulting node B.
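
To make the rule above concrete, here is a toy model in Python. It is only a sketch of the behavior described in this answer, not Aerospike's implementation or client API: `PartitionCopy`, its `version` and `generation` fields, and the `read` function are hypothetical names invented for illustration, and "latest version" is assumed to mean the copy with the highest generation.

```python
from dataclasses import dataclass


@dataclass
class PartitionCopy:
    node: str        # node holding this copy of the partition
    version: int     # hypothetical tag identifying the partition version
    generation: int  # hypothetical record generation; higher means newer
    value: str       # record value stored in this copy


def read(copies, master, consistency="ALL"):
    """Return (value, nodes_consulted) for a read sent to the master node."""
    by_node = {c.node: c for c in copies}
    local = by_node[master]
    # Copies whose partition version differs from the master's copy.
    differing = [c for c in copies if c.version != local.version]
    # Only one partition version in the cluster (or consistency is not ALL):
    # the master answers alone.
    if consistency != "ALL" or not differing:
        return local.value, [master]
    # Several versions exist: consult the differing copies as well and
    # return the newest record instead of failing the read.
    consulted = [local] + differing
    newest = max(consulted, key=lambda c: c.generation)
    return newest.value, sorted(c.node for c in consulted)


# B has just rejoined with a stale copy of partition 1.
during_migration = [
    PartitionCopy("A", version=2, generation=2, value="new"),
    PartitionCopy("B", version=1, generation=1, value="old"),
    PartitionCopy("C", version=2, generation=2, value="new"),
]
print(read(during_migration, "A"))

# Migrations are complete: A and B hold the same partition version.
after_migration = [
    PartitionCopy("A", version=3, generation=2, value="new"),
    PartitionCopy("B", version=3, generation=2, value="new"),
]
print(read(after_migration, "A"))
```

In this model the first read consults A and B and returns the newer record, while the second is answered by A alone. Request 1 from the question falls under the single-version case: with B gone, A's copy is the only version reachable, so A answers and the read succeeds.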

Upvotes: 4
