Reputation: 109
Since HBase is built on top of HDFS, which has a replication policy for fault tolerance, does this mean HBase is inherently fault tolerant and that data stored in HBase will always be accessible thanks to the underlying HDFS? Or does HBase implement a replication policy of its own (e.g. table replication across regions)?
Upvotes: 2
Views: 677
Reputation: 5557
Replication in HBase is a different concept from HDFS replication; the two operate in different contexts. HDFS is the file system and replicates data to provide fault tolerance and high availability at the data-file level, while HBase replication is mainly about fault tolerance, high availability, and data integrity from a database-system perspective.
Of course, HBase relies on HDFS replication for file-level replication. In addition, HBase maintains copies of its metadata on backup nodes (which are, again, replicated by HDFS by default).
HBase also has backup processes to monitor and recover from failures, such as primary and secondary region servers. However, protection against data loss on a region server comes from HDFS replication only.
Hence, HBase replication is mainly about failure recovery and maintaining data integrity as a database engine, just as in any other robust database system such as Oracle.
Upvotes: 2
Reputation: 73669
Yes, you can create replicas of regions in HBase, as mentioned here. However, note that HBase high availability applies to reads only; it is not highly available for writes. If a region server goes down, you will not be able to write to its regions until they are assigned to a new region server.
To enable read replicas, you need to enable async WAL replication by setting hbase.region.replica.replication.enabled to true. You will also need to enable high availability for the table at creation time by specifying a REGION_REPLICATION value greater than 1, as in the docs:
create 't1', 'f1', {REGION_REPLICATION => 2}
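For the server-side setting mentioned above, a minimal sketch of the corresponding hbase-site.xml entry might look like the following. It assumes the property is set in the cluster's hbase-site.xml and that the region servers are restarted afterwards; check the documentation for your HBase version, since other replica-related settings may also be needed.

```xml
<!-- hbase-site.xml fragment (sketch): enables async WAL replication
     to secondary region replicas, as described in the answer above. -->
<configuration>
  <property>
    <name>hbase.region.replica.replication.enabled</name>
    <value>true</value>
  </property>
</configuration>
```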
More details can be found here.
Upvotes: 3