Reputation: 1518
I have created two Dataproc clusters. The requirement is that both clusters share a single Hive metastore. The first is an ETL cluster created with --scopes=sql-admin, and the second is for ML users, created with --scopes=cloud-platform. The databases and tables created from the ETL cluster are not visible from the ML cluster. Can anyone tell me whether I have to add --scopes=sql-admin to each cluster?
ETL Cluster Create Command:
gcloud dataproc clusters create amlgcbuatbi-report \
--project=${PROJECT} \
--master-machine-type n1-standard-1 --worker-machine-type n1-standard-1 --master-boot-disk-size 50 --worker-boot-disk-size 50 \
--zone=${ZONE} \
--num-workers=${WORKERS} \
--scopes=sql-admin \
--image-version=1.3 \
--initialization-actions=gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
--properties=hive:hive.metastore.warehouse.dir=gs://gftat/data \
--metadata="hive-metastore-instance=$PROJECT:$REGION:metaore-dev001"
Output:
0: jdbc:hive2://localhost:10000/default> show databases;
+------------------+
|  database_name   |
+------------------+
| default          |
| gcb_dw           |
| l1_gcb_trxn_raw  |
+------------------+
ML Cluster create command:
gcloud dataproc clusters create amlgcbuatbi-ml \
--project=${PROJECT} \
--master-machine-type n1-standard-1 --worker-machine-type n1-standard-1 --master-boot-disk-size 50 --worker-boot-disk-size 50 \
--zone=${ZONE} \
--num-workers=${WORKERS} \
--scopes=cloud-platform \
--image-version=1.3 \
--optional-components=PRESTO \
--initialization-actions=gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
--initialization-actions=gs://dataproc-initialization-actions/presto/presto.sh \
--metadata="hive-metastore-instance=$PROJECT:$REGION:metaore-dev001"
Output (here I cannot see the databases and tables):
0: jdbc:hive2://localhost:10000/default> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
Upvotes: 1
Views: 1271
Reputation: 10677
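The --initialization-actions flag takes a comma-separated list; repeating the flag does not append further initialization actions to the list. With the flag given twice, most likely only the last value (presto.sh) takes effect, so the Cloud SQL proxy action never runs on the ML cluster and Hive there is not pointed at the shared metastore. Try
--initialization-actions=gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh,gs://dataproc-initialization-actions/presto/presto.sh
instead of two separate --initialization-actions flags.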
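If the list of initialization actions grows, it may be easier to build the comma-separated value in a script than to type it by hand. The following is only a minimal sketch: join_by_comma and INIT_ACTIONS are hypothetical names introduced for illustration, not part of gcloud, and the two URIs are the ones from the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): join all arguments with commas.
join_by_comma() {
  local IFS=,
  echo "$*"
}

# The two initialization actions used by the ML cluster in the question.
INIT_ACTIONS=$(join_by_comma \
  gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
  gs://dataproc-initialization-actions/presto/presto.sh)

# Print the single flag that replaces the two repeated flags.
echo "--initialization-actions=${INIT_ACTIONS}"
```

The printed flag can then be passed once to gcloud dataproc clusters create, alongside the other flags from the ML cluster command, which are left unchanged.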
Upvotes: 2