Sebastian Auberger
Sebastian Auberger

Reputation: 480

Can't connect Cloud Data Fusion with Google Cloud SQL for PostgreSQL

My goal is to read data from Cloud SQL Postgres to BigQuery via a Cloud Data Fusion pipeline.

For this, I set up a Cloud Data Fusion instance and assigned the following two permissions to the service account: (see https://cloud.google.com/data-fusion/docs/how-to/create-instance#setting_up_permissions)

As a next step I connected myself to the Cloud Data Fusion Instance, and navigated to Wrangler -> Add Connection -> Database -> Google Cloud SQL for PostgreSQL.

As the driver I uploaded the postgres-socket-factory-1.0.13-jar-with-dependencies.jar which I downloaded here: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory

For the driver configuration, I set:

For the database connection, I set:

After clicking on Test Connection, I receive the org.postgresql.Driver error message.

enter image description here

Upvotes: 7

Views: 4463

Answers (3)

xgMz
xgMz

Reputation: 3354

On a related note, if the Data Fusion is private, then to reach the Cloud SQL instance through a private interface (no public IP traffic), you'll need a "Proxy VM", similar to what's described in this doc, but with a different startup-script, this blog post has a good overview on how to achieve this.

Upvotes: 0

Rafael Aguilar
Rafael Aguilar

Reputation: 3279

@Sebastian Auberger,

We had the same issue, basically Data Fusion creates a Service Account which tries to create a IAP Tunnel to the Cloud SQL Instance. We could get through this issue by giving the role "Cloud SQL Client".

Give it a shot and let us know!.

Upvotes: 1

kurtisvg
kurtisvg

Reputation: 3565

Most likely you need a driver and the JDBC Socket Factory to get it to work properly. You should bundle the postgres connector with the socket factory into a uber-jar and see if that solves the problem.

Here is a quick (untested) pom that should accomplish this:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <groupId>com.example</groupId>
   <artifactId>postgres-with-cloud-sql-socket-factory</artifactId>
   <packaging>pom</packaging>
   <version>0.0.1</version>

   <dependencies>
      <dependency>
         <groupId>org.postgresql</groupId>
         <artifactId>postgresql</artifactId>
         <version>42.2.5</version>
      </dependency>
      <dependency>
         <groupId>com.google.cloud.sql</groupId>
         <artifactId>postgres-socket-factory</artifactId>
         <version>1.0.13</version>
      </dependency>
   </dependencies>

   <build>
      <plugins>
         <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4.1</version>
            <configuration>
               <!-- get all project dependencies -->
               <descriptorRefs>
                  <descriptorRef>jar-with-dependencies</descriptorRef>
               </descriptorRefs>
            </configuration>
            <executions>
               <execution>
                  <id>make-assembly</id>
                  <!-- bind to the packaging phase -->
                  <phase>package</phase>
                  <goals>
                     <goal>single</goal>
                  </goals>
               </execution>
            </executions>
         </plugin>
      </plugins>
   </build>
</project>

Upvotes: 4

Related Questions