Reputation: 2016
As of October 2017, is it possible to write Sparklyr Spark Data Frames into SQL Server?
I got this error:
> DBI::dbWriteTable(con, "DZ_test", for_test)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘dbWriteTable’ for signature
‘"Microsoft SQL Server", "character", "tbl_spark"’
I'm currently running:
There is something similar for Cassandra in the wonderful crassy
package, but not sure if there is a connector or method for SQL Server. Thanks!
SOLVED!!!
I used the jdbc driver for SQL server. I had jdk 1.8, so that meant I used the driver for Java 8 on all nodes:
spark.jars.packages com.microsoft.sqlserver:mssql-jdbc:6.2.2.jre8
I put this in $SPARK_HOME/conf/spark-defaults.conf
.
I typically log into the SQL server using kerberos authentication, but since that was not set up on all nodes of my test cluster I made a SQL login for the database of interest (required an administrator) and connected via that username/pass.
spark_write_jdbc(my_sdf, "my_sdf",
options = list(
url=paste0("jdbc:sqlserver://cwjensql10.cwjwin.local;",
"databaseName=HER_NILM;",
"user=HER;",
"password=Test1;"),
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver")
)
If you want to use Kerberos authentication (and are running on linux) it would look as such:
spark_write_jdbc(my_sdf, "my_sdf",
options = list(
url=paste0("jdbc:sqlserver://cwjensql10.cwjwin.local;",
"databaseName=HER_NILM;",
"integratedSecurity=true;",
"authenticationScheme=JavaKerberos;"),
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver")
)
For Windows, just don't include the authenticationScheme
option.
Big thanks to @user6910411!!!!
Upvotes: 3
Views: 925
Reputation: 330203
You can use sparklyr::spark_write_jdbc
function to write data directly from Spark without collecting data in R. To make it work, you'll need an appropriate JDBC driver available on the driver and worker nodes.
Assuming driver is already included using spark.jars.packages
, you'll need something similar to:
spark_write_jdbc(
df, "some_name",
options=list(
url="jdbc:sqlserver://...",
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"))
Upvotes: 3