Reputation: 57
Background: We have working pipelines in a Data Fusion instance (version 6.4.1, running on a Dataproc cluster) that connect to a PostgreSQL Cloud SQL instance. Everything works in this instance.
Issue: We created a new Data Fusion instance (version 6.7.1) on a newly created Dataproc cluster. We installed the necessary artifacts (CloudSQL PostgreSQL JDBC Driver, CloudSQL PostgreSQL Plugins), added the database connection in the namespace admin page (CDF GUI), and successfully tested the connection, so the instance itself can apparently reach Cloud SQL. But when we deploy a pipeline (with the same credentials as those used in the connection) and run it, we encounter the following error:
Exception while trying to validate schema of database table "<table_name>" for connection jdbc:postgresql:///<db_name>?cloudSqlInstance=<cloud-sql-instance-name>&socketFactory=com.google.cloud.sql.postgres.SocketFactory.
This is the error from the raw logs:
PSQLException : “Connection to :5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.”
The older (6.4.1) and the new (6.7.1) Data Fusion instances are almost identical: they use the same service account and the same Dataproc service account, and they point to the same Cloud SQL instance.
Any suggestions are appreciated.
Upvotes: 0
Views: 320
Reputation: 46
It may be worth checking a few things.
For the setup of the new cluster, please make sure this guide has been followed: https://datafusion.atlassian.net/wiki/spaces/KB/pages/32276578/Configurations+for+a+static+Dataproc+cluster
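Another quick thing to rule out is the connection string the pipeline actually uses at runtime. The sketch below is only a sanity check, not part of Data Fusion: `check_cloudsql_jdbc_url` is a hypothetical helper name, and the project/region/instance values in the usage note are made up. It parses a JDBC URL of the shape shown in the error and reports whether the database name, `cloudSqlInstance` and `socketFactory` are present. It does not prove that the socket factory is available on the Dataproc workers, only that the URL asks for it.

```python
from urllib.parse import urlsplit, parse_qs

# Query parameters that appear in the connection string from the error message.
REQUIRED_PARAMS = ("cloudSqlInstance", "socketFactory")


def check_cloudsql_jdbc_url(jdbc_url):
    """Return a list of problems found in a Cloud SQL PostgreSQL JDBC URL.

    Hypothetical helper for illustration; an empty list means the URL has
    the expected shape, not that the connection will succeed.
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        return ["URL does not start with 'jdbc:'"]

    # Strip the 'jdbc:' prefix so the rest parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len(prefix):])
    problems = []

    if parts.scheme != "postgresql":
        problems.append("unexpected scheme: " + parts.scheme)

    # The database name is the path component after the leading slash.
    if not parts.path.lstrip("/"):
        problems.append("missing database name")

    # parse_qs drops parameters with blank values, so a blank counts as missing.
    params = parse_qs(parts.query)
    for name in REQUIRED_PARAMS:
        if name not in params:
            problems.append("missing query parameter: " + name)

    return problems
```

For example, `check_cloudsql_jdbc_url("jdbc:postgresql:///mydb?cloudSqlInstance=my-project:us-central1:my-instance&socketFactory=com.google.cloud.sql.postgres.SocketFactory")` returns an empty list, while a URL without the two query parameters returns one message per missing parameter. If the URL checks out, the difference is more likely in the cluster setup than in the pipeline configuration.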
Upvotes: 2