Reputation: 762
I am trying to write a pyspark dataframe to Azure Postgres Citus (Hyperscale). I am using the latest Postgres JDBC driver, and I have tried writing on Databricks Runtimes 7, 6, and 5.
df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://<HOST>:5432/citus?user=citus&password=<PWD>&sslmode=require") \
    .option("dbTable", table_name) \
    .mode(method) \
    .save()
This is what I get after running the above command
org.postgresql.util.PSQLException: SSL error: Received fatal alert: handshake_failure
I have already tried different parameters in the URL and under the options as well, but no luck so far. However, I am able to connect to this instance from my local machine, and from the Databricks driver/notebook using psycopg2. Both the Azure Postgres Citus instance and the Databricks workspace are in the same region, and the Citus instance is publicly accessible.
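For reference, the same write can be sketched with the connection details passed as separate JDBC options instead of packed into the URL query string; the Postgres JDBC driver accepts `user`, `password`, and `sslmode` as connection properties, and Spark forwards extra options to the driver. Host, database, and credentials below are placeholders, and the helper names are mine, not from the original post:

```python
# Hypothetical connection details -- replace with your own.
host = "<HOST>"
database = "citus"

def make_jdbc_url(host: str, database: str, port: int = 5432) -> str:
    """Build a Postgres JDBC URL without embedding credentials in it."""
    return f"jdbc:postgresql://{host}:{port}/{database}"

def write_table(df, table_name: str, mode: str = "append") -> None:
    """Write a DataFrame over JDBC; credentials go in options, not the URL."""
    (df.write.format("jdbc")
       .option("url", make_jdbc_url(host, database))
       .option("dbtable", table_name)
       .option("user", "citus")
       .option("password", "<PWD>")
       .option("sslmode", "require")
       .mode(mode)
       .save())
```

Keeping the password out of the URL also keeps it out of any logs that echo the connection string.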
Upvotes: 3
Views: 7377
Reputation: 762
It worked after overriding the Java security properties for the driver and executor:
spark.driver.extraJavaOptions -Djava.security.properties=
spark.executor.extraJavaOptions -Djava.security.properties=
Explanation:
What is happening in reality is that the JVM's security configuration reads the file /databricks/spark/dbconf/java/extra.security by default, and this file disables some TLS algorithms. That means that if I edit this file and remove the TLS ciphers that Postgres Citus needs from the disabled list, that should also work.
Setting this option only for the executors (spark.executor.extraJavaOptions) does not override the JVM defaults. For the driver it does override them, and that is why the write starts to work.
Note: We need to edit this file before the JVM reads it, so an init script is the only way of accomplishing that.
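A minimal sketch of generating such an init script from a notebook. The assumption here (not confirmed by the answer) is that extra.security disables ciphers via a `jdk.tls.disabledAlgorithms=` line; inspect the file on your runtime and adjust the sed pattern before using it:

```python
# Path of the extra security file mentioned in the answer.
SECURITY_FILE = "/databricks/spark/dbconf/java/extra.security"

# Init-script body: blank out the disabled-TLS-algorithms override so the
# JVM falls back to its built-in defaults. The property name is an assumption.
INIT_SCRIPT = f"""#!/bin/bash
sed -i 's/^jdk\\.tls\\.disabledAlgorithms=.*/jdk.tls.disabledAlgorithms=/' {SECURITY_FILE}
"""

# On Databricks you would then upload it, e.g.:
#   dbutils.fs.put("dbfs:/databricks/init/clear-tls.sh", INIT_SCRIPT, True)
# and attach it as a cluster-scoped init script so it runs before the JVM starts.
```

This edits the file at cluster start, before the security properties are read, which is what the note above requires.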
Upvotes: 8