Augustine

Reputation: 106

Spark dataframe returning only structure when connected to Phoenix query server

I am connecting to HBase (version 1.2) via the Phoenix (4.11) Query Server from Spark 2.2.0, but the DataFrame comes back with only the table structure and no rows, even though data is present in the table. Here is the code I am using to connect to the Query Server:

    // Jar: phoenix-4.11.0-HBase-1.2-thin-client.jar
    val prop = new java.util.Properties
    prop.setProperty("driver", "org.apache.phoenix.queryserver.client.Driver")
    val url = "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF"
    val d1 = spark.sqlContext.read.jdbc(url, "TABLE1", prop)
    d1.show()

Can anyone please help me solve this issue? Thanks in advance.

Upvotes: 1

Views: 591

Answers (2)

fylb

Reputation: 699

Well, it's an old question, but I just stumbled over the same problem. I had to set the "fetchsize" property to get results:

    prop.put("fetchsize", "1000")
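Combined with the code from the question, this might look roughly like the sketch below. The object and helper names (`PhoenixThinClient`, `thinClientProps`) are made up for illustration, and the fetch size of 1000 is just the value that worked for me, not a recommended setting. The URL and driver class are copied from the question.

```scala
import java.util.Properties

object PhoenixThinClient {
  // Thin-client URL as given in the question; adjust host and port for your Query Server.
  val url = "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF"

  // Hypothetical helper: builds the JDBC properties, including "fetchsize".
  def thinClientProps(fetchSize: Int = 1000): Properties = {
    val prop = new Properties()
    prop.setProperty("driver", "org.apache.phoenix.queryserver.client.Driver")
    prop.setProperty("fetchsize", fetchSize.toString)
    prop
  }
}

// Usage in spark-shell, assuming the thin-client jar is on the classpath:
// val d1 = spark.read.jdbc(PhoenixThinClient.url, "TABLE1", PhoenixThinClient.thinClientProps())
// d1.show()
```

Spark passes the entries of the `Properties` object through as JDBC options, so setting `fetchsize` there should have the same effect as `.option("fetchsize", "1000")` on the reader.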

Upvotes: 0

WonderChild

Reputation: 70

If you are using Spark 2.2, a better approach would be to load the table directly via Phoenix as a DataFrame. This way you only need to provide the ZooKeeper URL, and you can supply a predicate so that only the required data is loaded rather than the entire table.

    import org.apache.phoenix.spark._
    import org.apache.hadoop.conf.Configuration
    import org.apache.spark.sql.SparkSession

    val configuration = new Configuration()
    configuration.set("hbase.zookeeper.quorum", "localhost:2181")
    val spark = SparkSession.builder().master("local").enableHiveSupport().getOrCreate()
    val df = spark.sqlContext.phoenixTableAsDataFrame(
      "TABLE1",
      Seq("COL1", "COL2"),
      predicate = Some("\"COL1\" = 1"),
      conf = configuration
    )

Read this for more info on getting a table as an RDD and on saving DataFrames and RDDs.

Upvotes: 0
