Developer Rajinikanth
Developer Rajinikanth

Reputation: 354

DataBricks Pyspark To Connect Azure DataBase using JDBC option

I am trying to connect my azure database from databricks pyspark below following code.

jdbcHostname = "xxxxxx.database.windows.net"
jdbcPort= 1433
jdbcDatabase = "DatabaseName"
jdbcUrl = "jdbc:sqlserver://{0}:{1}; database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties={
  "authentication":"ActiveDirectoryManagedIdentity",
  "driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
print(jdbcUrl)
query ="(SELECT * FROM DatabaseName.schema.tablename)"
domains = spark.read.jdbc(url = jdbcUrl, table = query, properties = connectionProperties)
display(domains)

Note : Assume that, I have downloaded mssql connector sqljdbc_12.2.0.0_enu version jar files and put it in databricks cluster.

Also, in databricks trying spark with python based notebook but we are getting error as below like and even if we use any other authentication, we are unable to connect azure sqldb.

Error:

com.microsoft.sqlserver.jdbc.SQLServerException: The authentication value "ActiveDirectoryManagedIdentity" is not valid.

Upvotes: 1

Views: 529

Answers (1)

Alex Ott
Alex Ott

Reputation: 87259

The problem is that Databricks clusters don't have Managed Identity attached so you can't use it for authentication. And you really don't need to install JDBC driver explicitly - it should be included into Databricks runtime by default - you can check its availability with the following code in the notebook:

%scala

val cls = Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

Upvotes: 0

Related Questions