Daniella

Reputation: 483

How to run Hive on Google Cloud Dataproc from within the master node?

I've just created a Google Cloud Dataproc cluster. A few basic things are not working for me:

  1. I'm trying to run the Hive console from the master node, but it fails to load for any user other than root (there seems to be a lock; the console just hangs).

  2. Even as root, I see some odd behaviour:

    • "show tables;" shows a table named "input"
    • querying that table raises an exception saying the table was not found.

  3. It is not clear which user creates the tables when I go through the web UI. I create a job and execute it, but then I don't see the results in the console.

I couldn't find any good documentation on this. Does anybody have an idea?

Upvotes: 1

Views: 4277

Answers (2)

ashish

Reputation: 11

This thread is a bit old, but it is the result that comes up when someone searches for Google Cloud Platform and Hive, so I'm adding some information that may be useful.

Currently, I think there are three ways to submit a job to Google Dataproc, as with all other products:

  1. from the UI

  2. from a console, using the command line: gcloud dataproc jobs submit hive --cluster=CLUSTER (--execute=QUERY, -e QUERY | --file=FILE, -f FILE) [--async] [--bucket=BUCKET] [--continue-on-failure] [--jars=[JAR,…]] [--labels=[KEY=VALUE,…]] [--params=[PARAM=VALUE,…]] [--properties=[PROPERTY=VALUE,…]] [GLOBAL-FLAG …]

  3. through a REST API call, see: https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/submit
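As a rough illustration of option 2, here is a minimal sketch of a small wrapper around the gcloud command. The function name submit_hive_query, the DRY_RUN switch, and the cluster and region values in the example call are my own placeholders, not anything from the docs. I'm also assuming a gcloud version that accepts --region, which is not in the synopsis quoted above.

```shell
# Sketch only: wraps "gcloud dataproc jobs submit hive" in a helper function.
# submit_hive_query and DRY_RUN are hypothetical names chosen for this example.
submit_hive_query() {
  # $1 = cluster name, $2 = region (assumed flag), $3 = HiveQL statement
  set -- gcloud dataproc jobs submit hive \
    --cluster="$1" --region="$2" --execute="$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it, to check it before submitting.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Example call (placeholder cluster and region):
# submit_hive_query my-cluster us-central1 'SHOW TABLES;'
```

Running it with DRY_RUN=1 only prints the command, which is handy for checking the flags before anything is submitted to the cluster.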

Hope this will be useful to someone.

Upvotes: 0

Patrick Clay

Reputation: 1349

Running the hive command at present is somewhat broken due to the default metastore configuration.

I recommend you use the beeline client instead, which talks to the same Hive Server 2 as Dataproc Hive Jobs. You can use it via ssh by running beeline -u jdbc:hive2://localhost:10000 on the master.

YARN applications are submitted by the Hive Server 2 as the user "nobody". You can specify a different user by passing the -n flag to beeline, but it shouldn't matter with the default permissions.
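To make the beeline suggestion concrete, here is a minimal sketch of a helper you could define in a shell on the master node. The function name hive_query, the DRY_RUN switch, and the user name in the example are placeholders I made up. The -u, -n and -e options are standard beeline flags (connection URL, user name, and a query to execute).

```shell
# Sketch only: run one HiveQL statement through beeline on the master node.
# hive_query and DRY_RUN are hypothetical names chosen for this example.
hive_query() {
  # $1 = HiveQL statement, $2 = optional user name passed to beeline via -n
  if [ -n "${2:-}" ]; then
    set -- beeline -u jdbc:hive2://localhost:10000 -n "$2" -e "$1"
  else
    set -- beeline -u jdbc:hive2://localhost:10000 -e "$1"
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Example calls (placeholder user name):
# hive_query 'SHOW TABLES;'
# hive_query 'SHOW TABLES;' alice
```

Leaving out the second argument connects without -n, which matches the default setup described above.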

Upvotes: 3
