Reputation: 354
I am trying to connect to my Azure SQL database from Databricks PySpark using the code below.
jdbcHostname = "xxxxxx.database.windows.net"
jdbcPort = 1433
jdbcDatabase = "DatabaseName"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)

connectionProperties = {
    "authentication": "ActiveDirectoryManagedIdentity",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

print(jdbcUrl)

# Push the query down as a subquery; SQL Server requires an alias on the derived table.
query = "(SELECT * FROM DatabaseName.schema.tablename) AS src"

domains = spark.read.jdbc(url=jdbcUrl, table=query, properties=connectionProperties)
display(domains)
Note: assume that I have already downloaded the Microsoft SQL Server JDBC connector (sqljdbc_12.2.0.0_enu) and placed the JAR files on the Databricks cluster.
I am running this from a Python notebook in Databricks, but I get the error below, and even when I try other authentication methods I am unable to connect to Azure SQL DB.
Error:
com.microsoft.sqlserver.jdbc.SQLServerException: The authentication value "ActiveDirectoryManagedIdentity" is not valid.
Upvotes: 1
Views: 529
Reputation: 87259
The problem is that Databricks clusters don't have a managed identity attached, so you can't use it for authentication. Also, you don't need to install the JDBC driver explicitly - it is included in the Databricks Runtime by default. You can check its availability with the following code in a notebook cell:
%scala
// Throws ClassNotFoundException if the driver is not available on the classpath
val cls = Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")
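Since a managed identity isn't available, you will need to fall back to an authentication method the driver can actually use. Below is a minimal sketch using plain SQL authentication, with the password read from a Databricks secret scope; the login name and the secret scope/key names ("jdbc-secrets", "sql-password") are placeholders, not values from the original post:

# Sketch: connect with SQL authentication instead of a managed identity.
# The user name and secret scope/key names below are placeholders.
jdbcUrl = "jdbc:sqlserver://xxxxxx.database.windows.net:1433;database=DatabaseName"

connectionProperties = {
    "user": "sqladminuser",  # placeholder SQL login
    "password": dbutils.secrets.get(scope="jdbc-secrets", key="sql-password"),
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# Subquery pushdown; SQL Server requires an alias on the derived table.
query = "(SELECT * FROM DatabaseName.schema.tablename) AS src"

domains = spark.read.jdbc(url=jdbcUrl, table=query, properties=connectionProperties)
display(domains)

If you need Azure AD rather than SQL logins, the driver also supports values such as ActiveDirectoryPassword and ActiveDirectoryServicePrincipal, but those require the corresponding identity to be set up in Azure AD and the MSAL dependencies to be available on the cluster.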
Upvotes: 0