lixinso

Reputation: 793

What are the pros and cons of installing HBase and Hadoop together vs. installing them separately?

I mean there are two options: 1. Install HBase on the Hadoop cluster that also does offline computing, so there is only one Hadoop cluster. 2. Install one Hadoop cluster for offline computing, then install a second Hadoop cluster that exists only so HBase can use its HDFS.

So the two options are: one integrated cluster, or two separate clusters.

What are the pros and cons of these two options?

Upvotes: 1

Views: 512

Answers (1)

zsxwing

Reputation: 20826

Option 1: An integrated cluster.

Pros: MapReduce jobs that read or write HBase will be more efficient because of data locality.

Cons: The HBase region server needs its own share of CPU and memory, so it reduces the performance of the other services on the same machine (DataNode and TaskTracker). HBase latency may rise to seconds when many MapReduce jobs are running. So if you need HBase to respond in time, you will need extra work (for example, using memcached to improve read performance).
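To illustrate the caching idea, here is a minimal read-through cache sketch in Python. It is only an assumption about how such a layer might look: the `ReadThroughCache` class, the `fetch` callback (standing in for an HBase read) and the `ttl_seconds` value are hypothetical names, and the in-process dict stands in for a real memcached server.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ReadThroughCache:
    """In-process stand-in for a memcached-style cache in front of HBase."""

    def __init__(
        self,
        fetch: Callable[[str], Optional[bytes]],  # hypothetical HBase read function
        ttl_seconds: float = 30.0,                # assumed freshness window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # row key -> (expiry time, cached value)
        self._entries: Dict[str, Tuple[float, Optional[bytes]]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, row_key: str) -> Optional[bytes]:
        now = self._clock()
        entry = self._entries.get(row_key)
        if entry is not None and entry[0] > now:
            # Fresh entry: serve it without touching the backing store.
            self.hits += 1
            return entry[1]
        # Missing or expired: read from the backing store and cache the result.
        self.misses += 1
        value = self._fetch(row_key)
        self._entries[row_key] = (now + self._ttl, value)
        return value
```

Repeated reads of a hot row within the TTL are then served from the cache, so they do not compete with MapReduce jobs for the region server. The trade-off is that readers may see data up to `ttl_seconds` old.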

Option 2: Two clusters.

Pros: The HBase region server will not affect the performance of the DataNode and TaskTracker on the offline computing cluster.

Cons: MapReduce jobs that access HBase have to read and write the data remotely, over the network. This option also needs more machines.

Upvotes: 1
