del bao

Reputation: 1174

How can HBase be strongly consistent without a consensus algorithm like Paxos?

Many articles state that HBase is a strongly consistent system because reads and writes only go to the primary region server.

But I can think of a scenario in which the consistency does not hold:

(1) A write fails to be replicated to some HDFS replicas (AFAIK, HBase replication relies on HDFS) but succeeds on some others, and the primary responds with a failure to the client.

(2) Then the primary fails and a new leader is elected, which happens to have the successful write from step (1).

The client will get uncommitted data, which breaks the strong consistency guarantee.
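For illustration, here is a minimal sketch of how a client could observe that anomaly through the standard HBase Java client. The table name "t", column family "cf", qualifier "q", and class name are hypothetical, and the failover itself happens server-side between the two calls; the point is only that the client sees a "failed" put followed by a successful read of the same value.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AmbiguousWriteDemo {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("t"))) {

                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v1"));
                try {
                    table.put(put);                    // step (1): primary reports a failure...
                } catch (IOException e) {
                    // ...but the edit may already sit in the WAL on some HDFS replicas.
                    System.out.println("write 'failed': " + e.getMessage());
                }

                // step (2): after a region failover, a read may still return "v1",
                // even though the client was told the write failed.
                Result r = table.get(new Get(Bytes.toBytes("row1")));
                byte[] value = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"));
                System.out.println("read back: " + (value == null ? "null" : Bytes.toString(value)));
            }
        }
    }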

Upvotes: 2

Views: 1168

Answers (2)

Justin Lin

Reputation: 761

You are right that HBase does not provide true "strong consistency". It is claimed to be "strongly consistent" because of its master/slave replication, where a client can only read from and write to the master, which is guaranteed to have the latest write, so the HBase folks categorize it as strong. In reality, this type of consistency is still weak consistency (one of its failure cases is exactly the scenario you described).

Ryan Barrett from Google gave a talk back in 2009 explaining the difference; master/slave (M/S) replication is an eventual consistency model. See the diagram from that talk, and more details in this book chapter.

Upvotes: 1

Ashu Pachauri

Reputation: 1403

You are mistaken about the HBase write consistency guarantees. First of all, the scenario you pose is a durability concern, not a consistency issue. Even if the supposedly uncommitted write becomes visible, it becomes visible to all clients at the same time; thus, there is no consistency issue here.

In other words, a write acknowledgement is only best effort, and the real consistency guarantee is read-after-write consistency: after a write has been successfully written to the WAL by HBase, any client trying to read will see the same state of the data.
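As a sketch of that read-after-write guarantee (the column family "cf", qualifier "q", and helper name below are illustrative, and the Table handle is assumed to be already open): once the put returns without an exception, a get for the same row, from this or any other client, observes the written cell.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReadAfterWrite {
        static final byte[] CF = Bytes.toBytes("cf");
        static final byte[] Q  = Bytes.toBytes("q");

        // Once put() returns normally, the edit is in the WAL and MemStore of the
        // single region server hosting this row, so every subsequent get() --
        // from this client or any other -- sees it.
        static void writeThenRead(Table table, String row, String value) throws IOException {
            Put put = new Put(Bytes.toBytes(row));
            put.addColumn(CF, Q, Bytes.toBytes(value));
            table.put(put);                                   // acknowledged write

            byte[] readBack = table.get(new Get(Bytes.toBytes(row))).getValue(CF, Q);
            assert Bytes.toString(readBack).equals(value);    // guaranteed to match
        }
    }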

To address the durability concerns, let me pose a different scenario from the one you suggested. Consider the following sequence of steps for any database (not just HBase); a code sketch follows the list:

  1. The client sends a write op.
  2. The server receives the op, successfully applies it, and sends an acknowledgement to the client.
  3. A network failure occurs and the acknowledgement never reaches the client.
  4. Due to the network failure, the client thinks the request timed out. It also thinks the server connection is unhealthy, so it tries to create a new connection.
  5. The network recovers, but the acknowledgement from the server is lost because the client is now connected using a different socket.
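Here is a rough sketch of how a client typically has to treat that lost acknowledgement (the retry helper and its policy are illustrative, not something HBase prescribes): on an exception the write is in an unknown state, and the usual remedy is to retry the same put rather than assume it failed, since re-applying the same cell value leaves the row in the state the client intended anyway.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;

    public class RetryOnLostAck {
        // On an IOException (which includes timeouts), the client cannot tell
        // whether the server applied the write before the connection broke.
        static void putWithRetry(Table table, Put put, int maxAttempts) throws IOException {
            IOException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    table.put(put);          // may have succeeded even if this throws
                    return;
                } catch (IOException e) {
                    last = e;                // outcome unknown: retry, do not assume failure
                }
            }
            throw last;                      // still ambiguous after all retries
        }
    }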

Such a scenario can happen because of the nature of most distributed systems: in a client-server setup, the server is always considered the source of truth. Supposedly failed writes can appear later because those writes may look failed to the client for reasons outside the server's control.

Most databases only guarantee the durability of writes successfully acknowledged to the client (i.e. once acknowledged, those writes cannot be lost unless a disaster scenario occurs), not the non-durability of writes that might have failed from the client's perspective.

The only way to ensure that writes not successfully acknowledged to the client are never written to the database is to wait for the client to acknowledge the write acknowledgement from the server. That is a deadly dependency for a server to have: it could potentially block writes from all clients because of one misbehaving, slow, or dead client.

Upvotes: 0
