Reputation: 253
I am very new to Google Cloud Platform, and I am doing a POC for moving a Hive application (tables and jobs) to Google Dataproc. The data has already been moved to Google Cloud Storage.
Is there a built-in way to create all the Hive tables in Dataproc in bulk, instead of creating them one by one at the Hive prompt?
Upvotes: 2
Views: 410
Reputation: 26458
Dataproc supports the Hive job type, so you can submit all the CREATE TABLE statements as a single job with the gcloud command:
gcloud dataproc jobs submit hive --cluster=CLUSTER \
-e 'create table t1 (id int, name string); create table t2 ...;'
or, with the statements saved in a script file:
gcloud dataproc jobs submit hive --cluster=CLUSTER -f create_tables.hql
You can also SSH into the master node, then use beeline to execute the script:
beeline -u jdbc:hive2://localhost:10000 -f create_tables.hql
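Both routes take a single script, so the bulk part mostly comes down to building create_tables.hql. A minimal sketch, assuming you have exported one DDL file per table into a directory (for example from SHOW CREATE TABLE output on the old cluster, with a terminating semicolon added). The function name build_create_script and the ddl/ directory are hypothetical, not part of Dataproc or Hive:

```shell
#!/bin/sh
# Sketch: merge per-table DDL files into one script for a single Hive job.
# Assumption: each *.hql file in the directory holds one complete statement
# that already ends with a semicolon.

# build_create_script DDL_DIR OUT_FILE
build_create_script() {
  : > "$2"                       # start with an empty output file
  for ddl_file in "$1"/*.hql; do
    [ -e "$ddl_file" ] || continue   # skip if the glob matched nothing
    cat "$ddl_file" >> "$2"
    printf '\n' >> "$2"          # keep statements on separate lines
  done
}

# Example usage (hypothetical paths; write the output outside ddl/ so it
# is not picked up by the glob on a second run):
#   build_create_script ddl create_tables.hql
#   gcloud dataproc jobs submit hive --cluster=CLUSTER -f create_tables.hql
```

The files are appended in the shell's glob order (alphabetical by file name), so if one table's DDL depends on another object existing first, prefix the file names to control the order.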
Upvotes: 1