Lokesh

Reputation: 87

Delta writeStream .option("mergeSchema", "true") issue

I have a Delta table with 3 columns that already contains data. The incoming data now has 4 columns, so df.writeStream should automatically update the data location to the 4-column schema, so that we can recreate the table on top of the data location. The old records would then have nulls in the newly added column, and the new records would have all 4 columns populated.

E.g.:
id  name  addr             id  name  addr  phone
1   lok   UK     ---->     1   lok   UK    null
                           2   ram   US    +1234

But when I use the following command, as per the Databricks website, it fails with the error shown after it:

 df.writeStream
      .option("mergeSchema", "true")
      .format("delta")
      .outputMode("append")
      .option("path","/data/")
      .option("checkpointLocation","/checkpoint/")
      .start()
      .awaitTermination()

ERROR: A schema mismatch detected when writing to the Delta table
To enable schema migration, please set:
'.option("mergeSchema", "true")'.

But I am already using mergeSchema in the options. Please advise. NOTE: the .saveAsTable and .table functions are also not allowed in writeStream.

Upvotes: 0

Views: 6002

Answers (2)

atom

Reputation: 21

You probably need to change the checkpoint location.

For details, see the document here.

Upvotes: 2

rishabh srivastava

Reputation: 45

You can try using foreachBatch and then writing each batch in Delta format. This works for me.
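
A minimal sketch of that idea, assuming the Delta Lake library is on the classpath and reusing the /data/ and /checkpoint/ paths from the question. The object name ForeachBatchDeltaWriter and the helper names writeBatch and startQuery are hypothetical, chosen only for illustration. Each micro-batch is written with the ordinary batch writer, where mergeSchema is set on the batch write itself.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical helper object; the names are illustrative, not from the question.
object ForeachBatchDeltaWriter {

  // Append one micro-batch to the Delta path using the batch writer.
  // mergeSchema asks Delta to add columns that are missing from the table.
  def writeBatch(batchDf: DataFrame, batchId: Long): Unit = {
    batchDf.write
      .format("delta")
      .mode("append")
      .option("mergeSchema", "true")
      .save("/data/") // path taken from the question
  }

  // Start the streaming query; df is the streaming DataFrame from the question.
  def startQuery(df: DataFrame): StreamingQuery = {
    df.writeStream
      // Passing a named method avoids the overload ambiguity that an inline
      // lambda can hit with foreachBatch on Scala 2.12.
      .foreachBatch(writeBatch _)
      .option("checkpointLocation", "/checkpoint/") // path taken from the question
      .start()
  }
}
```

Usage would look like ForeachBatchDeltaWriter.startQuery(df).awaitTermination(). Note that a batch may be replayed after a failure, so a plain append inside foreachBatch can write duplicate rows unless the write is made idempotent.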

Upvotes: 0
