Amit khandelwal

Reputation: 534

Spark HiveContext vs HBaseContext?

I have a dataset of 10 petabytes. The data currently lives in HBase, and I access it from Spark through HBaseContext, but it is not performing well.

Would it help to move the data from HBase to Hive and query it through HiveContext in Spark instead?

Upvotes: 0

Views: 156

Answers (2)

kulssaka

Reputation: 226

In my use case, I use mapPartitions with an HBase connection opened inside each partition. The key is knowing how to split the data.

For scans, you can create your own scanner with a row-key prefix and so on. For gets it's even easier. For puts, you can build a list of puts and insert them as a batch.

I don't use HBaseContext at all, and I get quite good performance on a database of 1.2 billion rows.
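
As a rough illustration of the per-partition pattern described above (one connection per partition, puts collected and written in batches), here is a minimal Python sketch. It is not the code I actually run: `open_connection` is a hypothetical factory, and the `put_batch` and `close` methods on the object it returns are assumed names that you would map onto whatever HBase client you use.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# A row is (row key, {column: value}); this shape is an assumption for the sketch.
Row = Tuple[bytes, Dict[bytes, bytes]]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows from an iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_partition(rows: Iterable[Row],
                    open_connection: Callable[[], object],
                    batch_size: int = 1000) -> int:
    """Write one partition with a single connection and batched puts.

    `open_connection` is a hypothetical factory; the returned object is
    assumed to expose `put_batch(list_of_rows)` and `close()`.
    """
    conn = open_connection()  # opened once per partition, not once per row
    written = 0
    try:
        for chunk in batched(rows, batch_size):
            conn.put_batch(chunk)  # one round trip per batch
            written += len(chunk)
    finally:
        conn.close()  # always release the connection
    return written
```

With PySpark, a function like this could be passed to `rdd.foreachPartition(lambda part: write_partition(part, open_connection))`, assuming the factory can be serialized to the executors. The batch size of 1000 is arbitrary and worth tuning.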

Upvotes: 0

Prashant

Reputation: 772

HiveContext is used to read data from Hive, so if you switch to HiveContext, the data has to be in Hive. I don't think what you are trying will work.

Upvotes: 0
