Reputation: 2331
I use both Hive and MySQL (via Google Cloud SQL) and I want to use Presto to connect to both easily. I have seen there is a Presto initialization action for Cloud Dataproc but it does not work with Cloud SQL out of the box. How can I get that initialization action to work with Cloud SQL so I can use both Hive/Spark and Cloud SQL with Presto?
Upvotes: 0
Views: 522
Reputation: 2331
The easiest way to do this is to edit the initialization action that installs Presto on the Cloud Dataproc cluster.
Cloud SQL setup
Before you do this, however, make sure to configure Cloud SQL so it will work with Presto. You will need to:
Changing the initialization action
In the Presto initialization action there is a section which sets up the Hive configuration and looks like this:
cat > presto-server-${PRESTO_VERSION}/etc/catalog/hive.properties <<EOF
connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
EOF
You can add a similar section below it to set up the MySQL properties, for example:
cat > presto-server-${PRESTO_VERSION}/etc/catalog/mysql.properties <<EOF
connector.name=mysql
connection-url=jdbc:mysql://<ip_address>:3306
connection-user=<username>
connection-password=<password>
EOF
Replace <ip_address>, <username>, and <password> with your own values. If you have multiple Cloud SQL instances to connect to, you can add one such section per instance and give each file a different name, so long as the filename ends in .properties.
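For the multiple-instance case, here is a minimal sketch of what the extra sections could look like if you wrap them in a small helper. The function name write_mysql_catalog, the instance names sales and inventory, the IP addresses, and the credentials are all placeholders I made up for illustration; they are not part of the initialization action. Presto takes the catalog name from the file name, so sales.properties becomes the catalog sales.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder default so this sketch runs on its own; the real
# initialization action already defines PRESTO_VERSION.
PRESTO_VERSION="${PRESTO_VERSION:-0.0}"
CATALOG_DIR="${CATALOG_DIR:-presto-server-${PRESTO_VERSION}/etc/catalog}"

# Hypothetical helper: write_mysql_catalog NAME IP USER PASSWORD
# Writes ${CATALOG_DIR}/NAME.properties using the MySQL connector.
write_mysql_catalog() {
  local name="$1" ip="$2" user="$3" password="$4"
  mkdir -p "${CATALOG_DIR}"
  cat > "${CATALOG_DIR}/${name}.properties" <<EOF
connector.name=mysql
connection-url=jdbc:mysql://${ip}:3306
connection-user=${user}
connection-password=${password}
EOF
}

# One call per Cloud SQL instance; every value here is a placeholder.
write_mysql_catalog sales 10.0.0.5 presto_user 'change-me'
write_mysql_catalog inventory 10.0.0.6 presto_user 'change-me'
```

With files like these in place, tables on each instance should be addressable as <catalog>.<database>.<table>, for example sales.<database>.<table>, alongside the existing hive catalog.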
Upvotes: 2