Masterbuilder

Reputation: 509

HBase batch get with spark scala

I am trying to fetch data from HBase for a list of row keys. The API documentation has a method get(List gets), but when I try to use it the compiler complains with the error below. Has anyone had this experience?

overloaded method value get with alternatives: (x$1: java.util.List[org.apache.hadoop.hbase.client.Get])Array[org.apache.hadoop.hbase.client.Result] <and> (x$1: org.apache.hadoop.hbase.client.Get)org.apache.hadoop.hbase.client.Result cannot be applied to (List[org.apache.hadoop.hbase.client.Get])

The code I tried:

val keys: List[String] = df.select("id").rdd.map(r => r.getString(0)).collect.toList
val gets: List[Get] = keys.map(x => new Get(Bytes.toBytes(x)))
val results = hTable.get(gets)

Upvotes: 0

Views: 1172

Answers (2)

Masterbuilder

Reputation: 509

I ended up using JavaConverters to convert it to a java.util.List, and then it worked:

val gets: List[Get] = keys.map(x => new Get(Bytes.toBytes(x)))
import scala.collection.JavaConverters._
val getJ = gets.asJava
val results = hTable.get(getJ).toList
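The core of the fix is that Scala's immutable List and java.util.List are unrelated types, so a Java API will not accept the former directly. A minimal sketch of the asJava/asScala round-trip, using only the standard library (no HBase dependency, so the values here are illustrative):

```scala
import scala.collection.JavaConverters._

object ConvertersDemo extends App {
  // A Scala List is scala.collection.immutable.List, not java.util.List,
  // so a Java method such as Table.get(java.util.List[Get]) rejects it.
  val scalaList: List[String] = List("row1", "row2")

  // asJava wraps the Scala list in a java.util.List view (no copying).
  val javaList: java.util.List[String] = scalaList.asJava

  // asScala converts back when a Java API returns a Java collection,
  // as hTable.get's Array[Result] example does with .toList above.
  val roundTrip: List[String] = javaList.asScala.toList

  println(javaList.size) // 2
  println(roundTrip == scalaList) // true
}
```

Note that on Scala 2.13+, scala.collection.JavaConverters is deprecated in favor of scala.jdk.CollectionConverters, which provides the same asJava/asScala extension methods.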

Upvotes: 1

Deepak Janyavula

Reputation: 368

Your gets is of type List[Get], where List is the Scala type. However, the HBase get request expects a Java List. You can use Seq[Get] instead of List[Get], since Scala's Seq is closer to a Java List. So you can try the code below:

val keys: List[String] = df.select("id").rdd.map(r => r.getString(0)).collect.toList
val gets: Seq[Get] = keys.map(x => new Get(Bytes.toBytes(x)))
val results = hTable.get(gets)

Upvotes: 0

Related Questions