Reputation: 357
I am trying to write data to an Azure Synapse table that has an identity column, using the following code on Databricks:
def get_jdbc_connection(host, sqlDatabase, user, password):
    jdbcHostname = "{}.database.windows.net".format(host)
    jdbc_url = "jdbc:sqlserver://{}:1433;database={};user={}@{};password={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase, user, host, password)
    url = "jdbc:sqlserver://{}:1433;database={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase)
    return (jdbc_url, url)
def write_adw(spark, df_target_adw, jdbc_url, table, tempDir, option_mode, pre_Actions):
    df_target_adw.write.format("com.databricks.spark.sqldw") \
        .option("url", jdbc_url) \
        .option("useAzureMSI", "true") \
        .option("preActions", pre_Actions) \
        .option("dbTable", table) \
        .option("tempDir", tempDir) \
        .mode(option_mode) \
        .save()
dftraffic = spark.sql('SELECT distinct SourceName \
    ,1 AS IsActiveRow \
    ,"Pipe-123" as pipelineId \
    ,current_timestamp as ADFCreatedDateTime \
    ,current_timestamp as ADFModifiedDateTime \
    from deltaTable')
# write to ADW
(jdbc_url, url_adw) = get_jdbc_connection(host, sqlDatawarehouse, user, password)
target_table = 'TargetTable_name'
option_mode = "append"
pre_Actions = "SELECT GETDATE()"
write_adw(spark, dftraffic, jdbc_url, target_table, tempDir, option_mode, pre_Actions)
Schema of the target table on ADW:

| Column Name | Data Type |
|---|---|
| SourceSID | INT IDENTITY(1,1) NOT NULL |
| Source Name | VARCHAR(20) NOT NULL |
| IsRowActive | BIT NOT NULL |
| PipelineId | VARCHAR(20) NOT NULL |
| ADFCreatedDateTime | DATETIME NOT NULL |
| ADFModifiedDateTime | DATETIME NOT NULL |
Configuration details on Databricks
Databricks runtime 7.4 (includes Apache Spark 3.0.1, Scala 2.12)
Error Message
Py4JJavaError: An error occurred while calling o457.save. : com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector. Underlying SQLException(s): - com.microsoft.sqlserver.jdbc.SQLServerException: An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON
The code was working fine on Databricks runtime 6.4 (Spark 2.4.5), and I started facing this error as soon as I tried to upgrade the runtime. How can I get this to work?
Upvotes: 1
Views: 816
Reputation: 1806
Aren't you adding an extra column, `1 AS IsActiveRow`? I don't see that column in the schema.
dftraffic = spark.sql('SELECT distinct SourceName \
    ,1 AS IsActiveRow \
    ,"Pipe-123" as pipelineId \
    ,current_timestamp as ADFCreatedDateTime \
    ,current_timestamp as ADFModifiedDateTime \
    from deltaTable')
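The mismatch can be checked mechanically before writing. A minimal sketch (plain Python, no Spark needed; column names copied from the question, with the identity column `SourceSID` excluded since the connector should not supply a value for it):

```python
def missing_columns(df_columns, table_columns):
    """Return the table columns the DataFrame does not provide (case-insensitive)."""
    df_lower = {c.lower() for c in df_columns}
    return [c for c in table_columns if c.lower() not in df_lower]

# Columns produced by the spark.sql query above
df_cols = ["SourceName", "IsActiveRow", "pipelineId",
           "ADFCreatedDateTime", "ADFModifiedDateTime"]

# Insertable (non-identity) columns of the target table
table_cols = ["SourceName", "IsRowActive", "PipelineId",
              "ADFCreatedDateTime", "ADFModifiedDateTime"]

print(missing_columns(df_cols, table_cols))  # ['IsRowActive']
```

If the check reports a missing column, aliasing it in the query (e.g. `1 AS IsRowActive`) or renaming it with `withColumnRenamed` so the DataFrame matches the table's non-identity columns may let the connector build a proper column list for the insert.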
Upvotes: 0