Reputation: 395
I want to retrieve the data from a MS SQL database (not hosted on Azure) to Microsoft Azure Databricks notebook. Here are the steps of what I have done:
msSqlServer = "jdbc:sqlserver://xxx.xxx.xxx.xxx:1433;ApplicationIntent=readonly;databaseName=" + msSqlDatabase
query = """(select * from mytable)foo"""
df = (
spark.read.format("jdbc")
.option("url", msSqlServer)
.option("dbtable", query)
.option("user", msSqlUser)
.option("password", msSqlPassword)
.load()
)
and I get this error:
com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host xxx.xxx.xxx.xxx, port 1433 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Before asking on StackoverFlow, I have contacted my company network and DBA teams. The DBA said the connection is okay but then disconnects immediatly"
For your information, I have followed this tutorial https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/sql-databases
Maybe there is something to configure, but I am not in networking at all (I am just a little data scientist who wants to play with notebook on azure databricks and access his company databases). For example how can I Make sure that TCP connections to the port are not blocked by a firewall
?
If you have some idea or you have already met this issue, feel free to share. :)
If you need more information, please tell me.
Upvotes: 0
Views: 1272
Reputation: 12788
If you have already configured your Azure SQL database to listen on TCP/IP traffic on port 1433 then it could be any of following three reasons:
From Azure Portal get the Azure SQL Database JDBC connection string.
SQL Databases using JDBC using Python:
jdbcHostname = "chepra.database.windows.net"
jdbcDatabase = "chepra"
jdbcPort = "1433"
username = "chepra"
password = "XXXXXXXXXX"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
"user" : username,
"password" : password,
"driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
pushdown_query = "(Select * from customers where CustomerID = 2) CustomerID"
df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)
display(df)
SQL Databases using JDBC using Scala:
val jdbcHostname = "chepra.database.windows.net"
val jdbcPort = 1433
val jdbcDatabase = "chepra"
// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"
// Create a Properties() object to hold the parameters.
import java.util.Properties
val connectionProperties = new Properties()
connectionProperties.put("user", s"chepra")
connectionProperties.put("password", s"XXXXXXXXXX")
val employees_table = spark.read.jdbc(jdbcUrl, "customers", connectionProperties)
employees_table.show()
Upvotes: 1