pkscoder

Reputation: 41

Connection to spark and accessing hive table without thrift server

I am writing a Java Spark application that needs to connect to Hive, get some basic table info, and query that table for data. I create a Spark session and read the table as shown below, but this goes through the Thrift server. Can I do the same without the Thrift server, and if so, how? I am trying to write a JDBC client that connects to Spark via Spark SQL to access Hive tables, but without using the Thrift server. Please share your thoughts and suggestions on how to approach this. Thank you.

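import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession
        .builder()
        .appName("Hive example")
        .enableHiveSupport()
        .getOrCreate();

// Reads the table over JDBC, i.e. through the Thrift server at host:port
Dataset<Row> df = spark.read()
        .format("jdbc")
        .option("driver", "org.apache.hive.jdbc.HiveDriver")
        .option("url", "jdbc:hive2://host:port")
        .option("dbtable", "mytable")
        .option("fetchsize", "20")
        .load();
df.show();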

Upvotes: 1

Views: 1550

Answers (1)

Chitral Verma

Reputation: 2855

With Spark 2 you can try something like this:

SparkSession ss = SparkSession
        .builder()
        .appName("Hive example")
        .config("hive.metastore.uris", "thrift://localhost:9083")
        .enableHiveSupport()
        .getOrCreate();

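Note the hive.metastore.uris property; change localhost to point to your sandbox or cluster. This URI addresses the Hive metastore service, not the Thrift JDBC server (HiveServer2), so Spark gets the table metadata from the metastore and reads the data itself.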
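
A mistyped value is easy to make here, for example pasting a jdbc:hive2:// URL instead of the metastore address. Below is a minimal sketch in plain Java that checks the string before it is handed to the builder. MetastoreUris and parse are hypothetical names, not part of Spark or Hive, and the sketch assumes the property holds one or more comma-separated thrift://host:port entries.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark or Hive): validates a value
// intended for the hive.metastore.uris property.
final class MetastoreUris {
    private MetastoreUris() {}

    // Splits a comma-separated list and checks each entry is thrift://host:port.
    static List<URI> parse(String value) {
        List<URI> uris = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            // URI.create throws IllegalArgumentException on malformed input
            URI uri = URI.create(trimmed);
            boolean ok = "thrift".equals(uri.getScheme())
                    && uri.getHost() != null
                    && uri.getPort() >= 0;
            if (!ok) {
                throw new IllegalArgumentException(
                        "Expected thrift://host:port but got: " + trimmed);
            }
            uris.add(uri);
        }
        if (uris.isEmpty()) {
            throw new IllegalArgumentException("No metastore URI given");
        }
        return uris;
    }
}
```

If MetastoreUris.parse(value) returns without throwing, the same string can be passed unchanged to .config("hive.metastore.uris", value).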

Once ss is initialised, you can read tables like below:

Dataset<Row> df = ss.read().table("db_name.table_name");

JDBC way (note that a jdbc:hive2:// URL does go through the Thrift server):

Dataset<Row> jdbcDf = ss.read()
        .format("jdbc")
        .option("url", "jdbc:hive2://localhost:10000/default")
        .option("dbtable", "clicks_json")
        .load();

Hope this helps. Cheers.

Upvotes: 1
