nxverma

Reputation: 3

Spark HBase to Google Dataproc and Bigtable migration

I have an HBase Spark job running on an AWS EMR cluster. We recently moved to GCP, and I transferred all the HBase data to Bigtable. Now I am running the same Spark (Java/Scala) job on Dataproc. The job fails because it is looking for the spark.hbase.zookeeper.quorum setting.

Please let me know how I can make my Spark job run successfully against Bigtable without code changes.

Regards, Neeraj Verma

Upvotes: 0

Views: 1001

Answers (1)

chemikadze

Reputation: 815

Although Bigtable follows the same principles as HBase and exposes the same Java API, it does not share HBase's wire protocol, so the standard HBase client won't work. The ZooKeeper error suggests you are trying to connect to Bigtable with the HBase client. Instead, you need to modify your program to use the Bigtable-specific client. It implements the same Java interfaces as HBase, but requires Google's client jars on the classpath and a few property overrides to enable it.
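To illustrate the property overrides, here is a minimal sketch that only assembles the key/value pairs, using nothing beyond the JDK. It assumes the 1.x flavour of the Bigtable HBase client (the connection class lives in a different package for the 2.x client), and `BigtableOverrides`, `my-project` and `my-instance` are placeholder names invented for this example, not anything from your job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the HBase configuration overrides that
// point the HBase API at Bigtable instead of a ZooKeeper quorum.
final class BigtableOverrides {

    static Map<String, String> build(String projectId, String instanceId) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Assumption: bigtable-hbase 1.x client; adjust the class name
        // to match the client version that is on your classpath.
        conf.put("hbase.client.connection.impl",
                 "com.google.cloud.bigtable.hbase1_x.BigtableConnection");
        conf.put("google.bigtable.project.id", projectId);
        conf.put("google.bigtable.instance.id", instanceId);
        return conf;
    }

    public static void main(String[] args) {
        // Print the overrides as key=value lines (placeholder IDs).
        build("my-project", "my-instance")
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In the real job you would copy each entry onto the HBase `Configuration` (for example with `conf.set(key, value)`) before creating the connection, or put the same properties into the `hbase-site.xml` that the job picks up. The Bigtable client jars still have to be on the driver and executor classpath.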

Upvotes: 1
