Reputation: 2331
I want to run Presto on a Dataproc cluster, or on Google Cloud Platform in general. How can I easily install and set up Presto, especially for use with Hive?
Upvotes: 0
Views: 1441
Reputation: 26538
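There is now an official tutorial, "Use Presto with Google Cloud Dataproc". Essentially, you create a cluster with the Presto initialization action, open an SSH tunnel with a SOCKS proxy to the master node, and then connect with the Presto CLI through that proxy: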
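# 1. Create a cluster with the Presto initialization action
gcloud dataproc clusters create presto-cluster \
    --project=${PROJECT} \
    --zone=${ZONE} \
    --num-workers=${WORKERS} \
    --scopes=cloud-platform \
    --initialization-actions=gs://dataproc-initialization-actions/presto/presto.sh

# 2. Open an SSH tunnel to the master node with a SOCKS proxy on local port 1080
#    (-N runs no remote command, so this stays in the foreground)
gcloud compute ssh presto-cluster-m \
    --project=${PROJECT} \
    --zone=${ZONE} \
    -- -D 1080 -N

# 3. In another terminal, connect to the hive catalog through the proxy
./presto-cli \
    --server presto-cluster-m:8080 \
    --socks-proxy localhost:1080 \
    --catalog hive \
    --schema default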
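As a quick check that the Hive catalog is reachable, the CLI can also run a single statement with its --execute option instead of opening an interactive prompt. The following is only a sketch: the run and presto_exec helpers and the DRY_RUN switch are hypothetical names introduced here, and it assumes the SSH tunnel from the second command is still open in another terminal.

```shell
#!/usr/bin/env bash
# Sketch: run one statement against the hive catalog without opening a prompt.
# Assumes the SSH tunnel (gcloud compute ssh ... -D 1080 -N) is still running.

MASTER="presto-cluster-m"   # master node name used in the commands above
PROXY="localhost:1080"      # SOCKS proxy opened by -D 1080
DRY_RUN="${DRY_RUN:-1}"     # hypothetical switch: 1 prints the command, 0 runs it

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: send one SQL statement through the proxy.
presto_exec() {
  run ./presto-cli \
    --server "${MASTER}:8080" \
    --socks-proxy "${PROXY}" \
    --catalog hive \
    --schema default \
    --execute "$1"
}

presto_exec "SHOW TABLES"
```

With the default DRY_RUN=1 the script only prints the command it would run; setting DRY_RUN=0 executes it, which requires presto-cli in the current directory.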
Upvotes: 0
Reputation: 2331
You can use an initialization action with a Cloud Dataproc cluster to quickly install and configure Presto. The GitHub repository of Dataproc initialization actions includes one for Presto.
If you want to use the Presto web UI, once the cluster is online you can follow these directions to create an SSH tunnel and SOCKS proxy to the cluster. From there, you can access Presto on the master node, on port 8080 unless you have changed the default.
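Once the tunnel is up, a browser has to be told to send its traffic through the SOCKS proxy. The sketch below only prints a Chrome command for that. The cluster name presto-cluster, the local proxy port 1080, and the helper names are assumptions made for illustration, and the Chrome binary name varies by operating system.

```shell
#!/usr/bin/env bash
# Sketch: build the browser command for the Presto web UI behind the SOCKS proxy.
# Assumes a tunnel is already open, for example:
#   gcloud compute ssh presto-cluster-m --project=${PROJECT} --zone=${ZONE} -- -D 1080 -N

MASTER="presto-cluster-m"   # assumed: cluster name plus the "-m" master suffix
PROXY_PORT=1080             # assumed: local port passed to -D
PRESTO_PORT=8080            # Presto default port

# Hypothetical helper: web UI address, resolved on the cluster side of the proxy.
presto_ui_url() {
  echo "http://${MASTER}:${PRESTO_PORT}"
}

# Hypothetical helper: Chrome invocation that routes traffic through the proxy.
# A separate --user-data-dir keeps the proxied session apart from the normal profile.
browser_cmd() {
  echo "google-chrome --proxy-server=socks5://localhost:${PROXY_PORT} --user-data-dir=/tmp/${MASTER} $(presto_ui_url)"
}

# Print the command to copy and run, rather than launching a browser from the script.
browser_cmd
```

The master hostname only resolves inside the cluster's network, which is why the browser goes through the proxy instead of connecting directly.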
Upvotes: 1