Reputation: 375
How can I connect to SQL Server using Windows Authentication with the pyspark library? I can connect with Microsoft SQL Server Management Studio, but not from Python code using Spark. Here's what I've tried so far.
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config("spark.driver.extraClassPath","mssql-jdbc-6.4.0.jre8.jar") \
    .getOrCreate()

mssql_df = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlserver://localhost:1433;databaseName=DATABASE-NAME") \
    .option("dbtable", "database-table-name") \
    .option("user", "Windows-Username") \
    .option("password", "Windows-Pass") \
    .option("driver", 'com.mysql.jdbc.Driver').load()

mssql_df.printSchema()
mssql_df.show()
Upvotes: 3
Views: 11382
Reputation: 7336
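You can set integratedSecurity=true in the connection URL to connect to SQL Server via JDBC with Windows Authentication. The connection then uses the credentials of the current Windows user, so the user and password options are dropped, and the driver class must be the SQL Server one rather than the MySQL one.
The Spark configuration should then look like this: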
mssql_df = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlserver://localhost:1433;databaseName=DATABASE-NAME;integratedSecurity=true") \
    .option("dbtable", "database-table-name") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .load()
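To make the URL format explicit, here is a minimal sketch of a helper that assembles the same connection string. The function name build_sqlserver_url and its parameters are hypothetical and not part of pyspark or the JDBC driver; the only thing it relies on is that the driver expects properties appended to the URL as ;key=value pairs.

```python
def build_sqlserver_url(host, port, database, integrated_security=True):
    """Assemble a SQL Server JDBC URL (hypothetical helper, not a library API).

    Connection properties are appended as ';key=value' pairs.
    """
    props = {"databaseName": database}
    if integrated_security:
        # Ask the driver to use the current Windows user's credentials.
        props["integratedSecurity"] = "true"
    suffix = "".join(f";{key}={value}" for key, value in props.items())
    return f"jdbc:sqlserver://{host}:{port}{suffix}"


url = build_sqlserver_url("localhost", 1433, "DATABASE-NAME")
print(url)
```

The resulting string can be passed as .option("url", url) in the read call above, in place of the hard-coded URL.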
UPDATE:
As discussed in the comments, you should place sqljdbc_auth.dll in the same folder as mssql-jdbc-7.4.1.jre12.jar, or set spark.driver.extraClassPath to include both files, separated by the platform's path separator (: on Linux and macOS, ; on Windows), as shown below:
.config("spark.driver.extraClassPath","/path/to/mssql-jdbc-6.4.0.jre8.jar:/path/to/sqljdbc_auth.dll")
sqljdbc_auth.dll is part of the Microsoft JDBC Driver 6.0 for SQL Server, which you can download from Microsoft. Alternatively, you can install the JDBC driver on your system and specify the path where the DLL is stored.
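Since the separator in spark.driver.extraClassPath depends on the operating system, a small sketch like the following can build the value portably. Both helper names are hypothetical, and the paths are placeholders. The second helper reflects an assumption that goes beyond this answer: the JVM loads native libraries such as sqljdbc_auth.dll from java.library.path, so pointing that property at the DLL's folder through spark.driver.extraJavaOptions may be worth trying if the classpath approach does not pick up the DLL.

```python
import os


def build_classpath(*entries, sep=os.pathsep):
    """Join classpath entries with the OS separator (';' on Windows, ':' elsewhere)."""
    return sep.join(entries)


def library_path_option(dll_dir):
    """Build a JVM flag that points java.library.path at the DLL's folder."""
    return f"-Djava.library.path={dll_dir}"


# Placeholder paths; replace them with the real locations on your machine.
classpath = build_classpath("/path/to/mssql-jdbc-6.4.0.jre8.jar",
                            "/path/to/sqljdbc_auth.dll")
java_opts = library_path_option("/path/to")

# Intended use (requires pyspark, so it is left as a comment here):
#   SparkSession.builder \
#       .config("spark.driver.extraClassPath", classpath) \
#       .config("spark.driver.extraJavaOptions", java_opts) \
#       .getOrCreate()
print(classpath)
print(java_opts)
```

Both settings have to be applied before the driver JVM starts, so they belong on the builder (or in spark-defaults.conf), not on an already running session.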
Upvotes: 3