Reputation: 149
Say I created an external Delta table with the following schema.
I inserted some data and then, for whatever reason, I decided to drop the metadata whilst retaining the data.
When I try to recreate the table, I get a schema mismatch error saying that product_dsc is a STRING.
This has been racking my brain for days.
I've read a bit about schema enforcement and evolution in Delta Lake, but I'm more confused now than when I started.
Another strange thing is that I'm unable to cast a Spark DataFrame column to VARCHAR. I tried that while troubleshooting the problem above, to no avail. It seems Spark does not want me to use the VARCHAR data type at all. This bugs me because I've used VARCHAR(max length) data types my whole professional life, and now I can't anymore. It feels unnatural.
Thanks in advance.
Upvotes: 0
Views: 5955
Reputation: 2729
I tried to reproduce the same thing in my environment and got the same error. Sometimes the columns in your table are different from the columns in your DataFrame, and that is why you get the schema mismatch error.
Note: string | char(n) | varchar(n) -> StringType
I created a Delta table with the name emp_dem.
Table 1:
Then I created a DataFrame with the same schema as the table, plus one additional column.
Now, if I insert this newly created DataFrame into the Delta table, I get the same error. This is the cause: if the columns in your table are different from the columns in the DataFrame you are inserting, you get this error.
To resolve this error, you need to enable schema evolution with the mergeSchema option.
df.write.option("mergeSchema","true").format("delta").mode("append").saveAsTable("emp_dem")
For more information, refer to this link.
Upvotes: 1