Bruno

Reputation: 53

Why can't I connect to Hive metastore?

I'm using Google Cloud Dataproc, Hive and Spark in my project, but apparently I can't connect to the Hive metastore.

The tables are populated correctly and all the data is there. For example, the table I'm trying to access is the one shown in the image below, and as you can see the Parquet file is there (the table is stored as Parquet). sparktp2-m is the master node of the Dataproc cluster.

enter image description here

I also have an IntelliJ project that will run some queries, but first I need to access this Hive data, and that's not going well. I'm trying to access it like this:

SparkSession spark = SparkSession
        .builder()
        .appName("Check")
        .config("hive.metastore.uris", "thrift://hive-metastore:9083")
        .enableHiveSupport()
        .getOrCreate();

JavaPairRDD<Tuple2<Object, String>, Integer> mr = spark.table("title_basics_parquet").toJavaRDD()...

Next, I build the jar and submit it as a job like this:

gcloud dataproc jobs submit spark --jars target/GGCD_Spark-1.0-SNAPSHOT.jar --class parte1.Queries --cluster sparktp2 --region europe-west1

And the error is:

enter image description here

Am I missing something, or is it the wrong URI?

Upvotes: 2

Views: 1799

Answers (1)

Dagang Wei

Reputation: 26458

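The default Hive Metastore URI is thrift://<master-node-hostname>:9083. In your case the master node is sparktp2-m, so the URI would be thrift://sparktp2-m:9083 rather than thrift://hive-metastore:9083.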
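
To make that pattern concrete, here is a minimal sketch in Java that builds the URI from the master hostname. It assumes the master node is sparktp2-m, as stated in the question. The class and method names (MetastoreUri, forMaster) are hypothetical and only for illustration; they are not part of Spark or Hive.

```java
import java.net.URI;

class MetastoreUri {
    // Default Hive Metastore Thrift port, per the URI pattern above.
    static final int DEFAULT_PORT = 9083;

    // Builds thrift://<master-node-hostname>:9083 for the given master hostname.
    // URI.create throws IllegalArgumentException if the result is not a valid URI.
    static String forMaster(String masterHost) {
        return URI.create("thrift://" + masterHost + ":" + DEFAULT_PORT).toString();
    }

    public static void main(String[] args) {
        // Assumption: "sparktp2-m" is the master node named in the question.
        String uri = forMaster("sparktp2-m");
        System.out.println(uri);
    }
}
```

In the job from the question, the resulting string would be passed as the value in .config("hive.metastore.uris", uri) in place of the hard-coded thrift://hive-metastore:9083.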

Upvotes: 1
