Reputation: 71
I'm trying to run some insert queries against an Azure SQL Server database from a Databricks script using PySpark. I know there is the Spark native connector, but, to my understanding, the only options it offers are to append to or overwrite the table. I need to update the table, and not every row of it is affected by the update.
I tried using the Python pyodbc library, but when making the connection with
cnx = pyodbc.connect('DRIVER={};SERVER={};DATABASE={};Trusted_Connection=yes;user={},password={} '.format(driver,jdbcHostname_dev,jdbcDatabase_dev,sql_user,sql_password))
I receive the following error:
('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'SQL Server' : file not found (0) (SQLDriverConnect)")
I tried the answer marked as the solution in "using pyodbc in azure databrick for connecting with SQL server", but it did not resolve the problem for me.
How can I use pyodbc in databricks?
Upvotes: 1
Views: 1509
Reputation: 2764
('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'SQL Server' : file not found (0) (SQLDriverConnect)")
The above error mainly happens because the ODBC driver named in the connection string is not installed on the cluster.
I tried to reproduce the same scenario in my environment and got the results below.
To resolve the above error, please follow this approach:
Run the following commands to install the Microsoft ODBC driver on the Azure Databricks cluster.
%sh
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17
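If you want to confirm the driver is actually visible to pyodbc before connecting, a quick sanity check (not part of the original install steps) is to list the installed ODBC drivers:
import pyodbc
# Should include 'ODBC Driver 17 for SQL Server' once the install above succeeds
print(pyodbc.drivers())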
Now you can check that Azure Databricks can connect to the SQL server.
Code:
import pyodbc

# Connection details for the Azure SQL database
server = '<Your_server_name>'
database = '<database_name>'
username = '<username>'
password = '<password>'

# Use the driver installed above ('ODBC Driver 17 for SQL Server'), not 'SQL Server'
conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=' + server + ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
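Since the question is about running UPDATE statements that touch only some rows, here is a minimal sketch of how that could look with this connection; the table and column names are just placeholders:
# Hypothetical table/columns, shown only to illustrate a parameterized UPDATE
update_sql = "UPDATE dbo.my_table SET status = ? WHERE id = ?"
cursor = conn.cursor()
cursor.execute(update_sql, ('processed', 42))
conn.commit()   # pyodbc autocommit is off by default, so commit explicitly
cursor.close()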
Upvotes: 3