Xavier D

Reputation: 3821

Can a private Cloud Data Fusion instance connect to the internet?

Our application consists of a Spring Boot app server deployed on Cloud Run and a Cloud SQL for PostgreSQL database.

The database is private and connected to a private VPC.
The app server reaches the database through a gateway to this private VPC, provided by the Cloud Run configuration.

We'd like to feed this database periodically with Cloud Data Fusion (CDF). CDF should fetch data from AWS S3 and push it into our database.

We've designed and validated a pipeline for that purpose, but we're facing a network paradox:

How can CDF both write to the private database and read data from the internet?
I'm surprised that a CDF instance, even a private one, can't establish an egress connection to an internet resource.

Upvotes: 0

Views: 798

Answers (1)

guillaume blaquiere

Reputation: 75715

Cloud Data Fusion is a tool that helps you build pipelines (based on CDAP). If you set the Data Fusion instance to private, it's the access to the tool that is private, not the runtime! On Google Cloud, the pipeline runs on a Dataproc cluster.

So now, the question is: can your Dataproc cluster reach both the internet and your database?

  1. If your cluster runs in the same VPC as your Cloud SQL private IP connection, and no firewall rule blocks the communication, the database side is OK.
  2. If the Compute Engine VMs that make up your cluster have public IPs, no problem, they can reach public URLs. Otherwise, as John Hanley said, you can create a Cloud NAT to let the VMs initiate calls to external URLs.
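
To check both points from a machine in the cluster's subnet, a plain TCP probe is usually enough. This is only a minimal sketch using the Python standard library; the helper names `can_connect` and `check_targets` are made up for illustration, and the addresses in the usage comment (`10.0.0.5` for the Cloud SQL private IP, `s3.amazonaws.com` for S3) are assumptions you should replace with your own values.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_targets(targets: dict, timeout: float = 3.0) -> dict:
    """Probe each (host, port) pair and map its label to True/False."""
    return {
        label: can_connect(host, port, timeout)
        for label, (host, port) in targets.items()
    }


# Example usage (hypothetical addresses, adjust to your setup):
# results = check_targets({
#     "cloud sql private ip": ("10.0.0.5", 5432),
#     "aws s3 over internet": ("s3.amazonaws.com", 443),
# })
# print(results)
```

If the database probe fails, look at the VPC and firewall rules (point 1). If only the S3 probe fails, the VMs most likely have no public IP and no Cloud NAT (point 2). Note that an open TCP port only proves network reachability, not that credentials or TLS are set up correctly.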

Upvotes: 1
