trigonocephalus

Reputation: 23

Truncate tables on Databricks

I'm working with two environments in Azure: Databricks and SQL Database. I have a function that generates a dataframe which is then used to overwrite a table stored in the SQL Database. I'm running into problems because df.write.jdbc(mode = 'overwrite') only drops the table and, I'm guessing, my user doesn't have the right permissions to create it again (I've already checked the DML and DDL permissions needed for that). In short, my function only drops the table without recreating it.

We discussed what the problem could be and concluded that the best thing I can do is probably truncate the table and re-insert the new data. I'm trying to find out how to truncate the table; I tried these two approaches but can't find more information about them:

df.write.jdbc()

&

spark.read.jdbc()

Can you help me with this? The overwrite doesn't work (maybe I don't have adequate permissions) and I can't figure out how to truncate the table over JDBC.
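Roughly, this is what my write currently looks like (the server, database, table name, and credentials below are just placeholders for my real values):

# Current write: drops the table but then fails to recreate it
connection_properties = {
    "user": "<user>",
    "password": "<password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}
df.write.jdbc(
    url="jdbc:sqlserver://<server>.database.windows.net:1433;database=<database>",
    table="dbo.my_table",
    mode="overwrite",
    properties=connection_properties,
)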

Upvotes: 2

Views: 5928

Answers (2)

Alex Ott

Reputation: 87069

It's in the Spark documentation - you need to add the truncate option when writing:

df.write.mode("overwrite").option("truncate", "true")....save()
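A fuller sketch of that write through the generic JDBC source follows; the server, database, table name, and credentials are placeholders you'd replace with your own:

# Overwrite by truncating the existing table instead of dropping and recreating it.
# All connection details below are placeholders.
(df.write
    .format("jdbc")
    .mode("overwrite")
    .option("truncate", "true")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<database>")
    .option("dbtable", "dbo.my_table")
    .option("user", "<user>")
    .option("password", "<password>")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .save())

With truncate set to true, Spark issues a TRUNCATE TABLE instead of a DROP TABLE on overwrite, so the existing table definition (and your permissions problem around recreating it) is left untouched.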

Also, if you have a lot of data, then maybe it's better to use Microsoft's Spark connector for SQL Server - it has some performance optimizations that should allow you to write faster.
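An untested sketch of the same write through that connector (the format name is the one used by the Apache Spark connector for SQL Server; the connection details are placeholders and the available options are worth checking against the connector's documentation):

# Same overwrite-with-truncate, but via the SQL Server Spark connector.
# Connection details are placeholders.
(df.write
    .format("com.microsoft.sqlserver.jdbc.spark")
    .mode("overwrite")
    .option("truncate", "true")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<database>")
    .option("dbtable", "dbo.my_table")
    .option("user", "<user>")
    .option("password", "<password>")
    .save())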

Upvotes: 2

nachi_rajput

Reputation: 11

You can create a stored procedure for truncating or dropping the table in SQL Server and call that stored procedure from Databricks using an ODBC connection.
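For example, a rough sketch of calling such a procedure with pyodbc (this assumes pyodbc and the Microsoft ODBC driver are installed on the cluster; the server, database, credentials, and procedure name are all placeholders):

import pyodbc

# Placeholder connection details and a hypothetical stored procedure name.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server>.database.windows.net,1433;"
    "DATABASE=<database>;"
    "UID=<user>;PWD=<password>"
)
conn.autocommit = True  # run the TRUNCATE outside an explicit transaction
cursor = conn.cursor()
cursor.execute("EXEC dbo.usp_truncate_my_table")  # hypothetical procedure name
cursor.close()
conn.close()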

Upvotes: 1
