Ayan

Reputation: 411

Databricks/Spark data write to SQL DW is dropping the table and recreating it

In Azure SQL DW, I have an empty table (say table T1).

Suppose T1 has four columns, C1, C2, C3, and C4 (C4 is NOT NULL). I have a DataFrame in Databricks (say df1) which has data for C1, C2, and C3 only.

I am performing the write operation on the DataFrame using a code snippet like the following:

df1.write
     .format("com.databricks.spark.sqldw")
     .option("url", jdbcURL)
     .option("dbtable", "T1")
     .option("forward_spark_azure_storage_credentials", "True")
     .option("tempDir", tempDir)
     .mode("overwrite")
     .save()

What I see is that, instead of any error being raised, the table T1 gets dropped and a new T1 gets created with only three columns: C1, C2, and C3. Is that expected behavior, or should an exception have been thrown during the insert, since data corresponding to the NOT NULL column C4 was missing?

Upvotes: 0

Views: 906

Answers (1)

Stephen Darlington

Reputation: 52565

You've set the mode to overwrite, and in my experience dropping and recreating the table in question is exactly what that does. Maybe try append instead?
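A sketch of that suggested change, reusing the question's snippet (it assumes the same jdbcURL and tempDir variables and an existing SparkSession; this would need a real Databricks cluster and SQL DW connection to run). With append mode the connector inserts into the existing T1 rather than dropping and recreating it, so the missing value for the NOT NULL column C4 should surface as an insert error instead of a silent schema change:

```python
# Append into the existing table T1 instead of replacing it.
# jdbcURL and tempDir are assumed to be defined as in the question.
(df1.write
     .format("com.databricks.spark.sqldw")
     .option("url", jdbcURL)
     .option("dbtable", "T1")
     .option("forward_spark_azure_storage_credentials", "True")
     .option("tempDir", tempDir)
     .mode("append")  # insert rows; the load should fail if C4 (NOT NULL) has no data
     .save())
```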

Upvotes: 2
