Achraf Oussidi

Reputation: 109

Does HBase have a replication policy of its own, or is it inherited from HDFS?

Since HBase is built on top of HDFS, which has a replication policy for fault tolerance, does this mean HBase is inherently fault tolerant and data stored in HBase will always be accessible thanks to the underlying HDFS? Or does HBase implement a replication policy of its own (e.g., table replication over regions)?

Upvotes: 2

Views: 677

Answers (2)

Gyanendra Dwivedi

Reputation: 5557

Replication in HBase is a different concept from replication in HDFS; the two operate in different contexts. HDFS is the file system, and it replicates data to provide fault tolerance and high availability at the data-file level. HBase replication, by contrast, is mainly about fault tolerance, high availability, and data integrity from a database-system perspective.

Of course, HBase relies on HDFS replication for file-level replication. In addition, HBase maintains copies of its metadata on backup nodes (which are, again, replicated by HDFS by default).
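For reference, the file-level replication described above is governed on the HDFS side by the dfs.replication property, which defaults to 3. A minimal sketch of the corresponding hdfs-site.xml entry follows; the value shown is simply the stock default, not a recommendation for any particular cluster:

```xml
<!-- hdfs-site.xml: number of copies HDFS keeps of each block.
     3 is the HDFS default; it is shown here only as an illustration. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```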

HBase also has backup processes to monitor and recover from failure, such as primary and secondary region servers. However, protection against data loss on a region server comes from HDFS replication only.

Hence, HBase replication is mainly about recovering from failure and maintaining data integrity as a database engine, just as in any other robust database system such as Oracle.

Upvotes: 2

Saurabh

Reputation: 73669

Yes, you can create replicas of regions in HBase, as mentioned here. However, note that this high availability applies to reads only, not to writes. If a region server goes down, you will not be able to write to its regions until they are assigned to a new region server.

To enable read replicas, you need to enable async WAL replication by setting hbase.region.replica.replication.enabled to true. You also need to enable high availability for the table at creation time by specifying a REGION_REPLICATION value greater than 1, as in the docs:

create 't1', 'f1', {REGION_REPLICATION => 2}

More details can be found here.
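As a rough sketch of the configuration side, the property named above would normally be set in hbase-site.xml along these lines. This assumes an HBase version that still uses this property name, so check the reference guide for the release you are running:

```xml
<!-- hbase-site.xml: enables async WAL replication to region replicas.
     Assumes your HBase version uses this property name. -->
<property>
  <name>hbase.region.replica.replication.enabled</name>
  <value>true</value>
</property>
```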

Upvotes: 3
