Reputation: 791
We have an hbase-0.94 cluster with hadoop-1.0.1. We don't want any downtime for this cluster while upgrading to hbase-0.98 with hadoop-2.5.1.
I have provisioned another hbase-0.98 cluster with hadoop-2.5.1 and want to copy the hbase-0.94 tables to hbase-0.98. HBase CopyTable does not seem to work for this purpose.
Please suggest a way to perform the above task.
Upvotes: 2
Views: 1567
Reputation: 881
Run the command below on the source cluster, and make sure you have cross-cluster authentication enabled.
/usr/bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable -Ddfs.nameservices=nameservice1,devnameservice -Ddfs.ha.namenodes.devnameservice=devnn1,devnn2 -Ddfs.namenode.rpc-address.devnameservice.devnn1=<destination_namenode01_host>:<destination_namenode01_port> -Ddfs.namenode.rpc-address.devnameservice.devnn2=<destination_namenode02_host>:<destination_namenode02_port> -Ddfs.client.failover.proxy.provider.devnameservice=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider -Dmapred.map.tasks.speculative.execution=false --peer.adr=<destination_zookeeper host>:<port>:/hbase --versions=<n> <table_name>
Upvotes: 0
Reputation: 29237
These are the available options; you can choose whichever fits.
The first option is to use the org.apache.hadoop.hbase.mapreduce.Export tool to export tables to HDFS, and then use hadoop distcp to move the data to the other cluster. Once the data is in place on the second cluster, you can use the org.apache.hadoop.hbase.mapreduce.Import tool to import the tables.
Please look at http://hbase.apache.org/book.html#export
The second option is to use the CopyTable tool, please look at:
http://hbase.apache.org/book.html#copytable
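For the Export, distcp, Import route described above, here is a rough dry-run sketch of the three steps. The table name, NameNode hosts, ports and export directory are placeholders I made up, not values from the question. I am also assuming the source HDFS is read over hftp, since Hadoop 1 and Hadoop 2 do not share an RPC-compatible hdfs:// protocol, and that the import sets hbase.import.version=0.94, which the HBase book describes for reading 0.94 export files on newer versions. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Dry-run sketch: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
TABLE="${TABLE:-my_table}"                 # hypothetical table name
SRC_NN="${SRC_NN:-src-namenode:50070}"     # hypothetical source NameNode HTTP address
DST_NN="${DST_NN:-dst-namenode:8020}"      # hypothetical destination NameNode RPC address
EXPORT_DIR="${EXPORT_DIR:-/tmp/hbase-export/$TABLE}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 (source cluster): dump the table to sequence files on HDFS.
export_table() {
  run hbase org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$EXPORT_DIR"
}

# Step 2 (destination cluster): hftp is read-only, so distcp runs where it writes.
copy_export() {
  run hadoop distcp "hftp://$SRC_NN$EXPORT_DIR" "hdfs://$DST_NN$EXPORT_DIR"
}

# Step 3 (destination cluster): the target table must already exist with
# matching column families before Import is started.
import_table() {
  run hbase -Dhbase.import.version=0.94 \
    org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$EXPORT_DIR"
}

export_table
copy_export
import_table
```

Rows written to the source table after the Export job starts are not picked up, so this route on its own does not give a zero-downtime cutover.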
Have a look at pivotal
The third option is to enable HBase snapshots, create table snapshots, and then use the ExportSnapshot tool to move them to the second cluster. Once the snapshots are on the second cluster, you can clone tables from them. Please look at: http://hbase.apache.org/book.html#ops.snapshots
HBase snapshots allow you to take a snapshot of a table without too much impact on the Region Servers. Snapshot, clone and restore operations don't involve copying data. Also, exporting a snapshot to another cluster has no impact on the Region Servers.
I have used options 1 and 3 for moving data between clusters, and in my case option 3 was the better solution.
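The snapshot route could look roughly like the dry-run sketch below. The table name, snapshot name, destination URI and mapper count are placeholders of mine. The hdfs:// destination is an assumption too: between Hadoop 1 and Hadoop 2 you may need a different scheme such as webhdfs. Whether a 0.94 snapshot clones cleanly on 0.98 is something to test on a small table first. The script prints the commands by default; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Dry-run sketch: prints each command line unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
TABLE="${TABLE:-my_table}"                          # hypothetical table name
SNAPSHOT="${SNAPSHOT:-${TABLE}_snap}"               # hypothetical snapshot name
DST_ROOT="${DST_ROOT:-hdfs://dst-namenode:8020/hbase}"  # hypothetical HBase root dir on destination
MAPPERS="${MAPPERS:-16}"

# Takes one command line as a single string so that pipes survive dry-run.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$1"
  else
    sh -c "$1"
  fi
}

# Step 1 (source cluster): take the snapshot from the HBase shell.
take_snapshot() {
  run "echo \"snapshot '$TABLE', '$SNAPSHOT'\" | hbase shell"
}

# Step 2 (source cluster): copy snapshot metadata and HFiles with a MapReduce job.
export_snapshot() {
  run "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot $SNAPSHOT -copy-to $DST_ROOT -mappers $MAPPERS"
}

# Step 3 (destination cluster): materialize a table from the exported snapshot.
clone_table() {
  run "echo \"clone_snapshot '$SNAPSHOT', '$TABLE'\" | hbase shell"
}

take_snapshot
export_snapshot
clone_table
```

On 0.94 the snapshot feature has to be switched on first (hbase.snapshot.enabled in hbase-site.xml), which is the "enable snapshots" step mentioned above.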
Also, have a look at my answer posted
Upvotes: 1