Reputation: 333
I'm trying to reduce unnecessary data writes and only write to the Delta Lake table under a specific condition. Why do these statements always rewrite the data?
%sql
MERGE INTO tblTest as target
USING temp_Source as source
ON target.ID = source.ID
WHEN MATCHED AND 1 = 0
THEN UPDATE SET *
or this
from pyspark.sql.functions import expr

deltaTable.alias("target").merge(
    source=dfSource.alias("source"),
    condition=expr("source.ID = target.ID")) \
  .whenMatchedUpdateAll('1 = 0') \
  .execute()
I'm expecting that only table metadata would be updated and no data from the source would be written to the target.
Upvotes: 2
Views: 2549
Reputation: 87144
That's known Delta behavior: it rewrites every file that has a matching record in the ON clause, regardless of the condition on WHEN MATCHED / WHEN NOT MATCHED. If you want to avoid this, move your condition into the ON clause.
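As a rough sketch of what moving the condition into the ON clause could look like in PySpark: the helper names build_on_clause and merge_with_condition_in_on are made up for illustration, and delta_table / df_source are assumed to be the DeltaTable and source DataFrame from the question. The condition is passed as a SQL string, which merge() accepts in place of a Column.

```python
# Sketch only. `delta_table` is assumed to be a delta.tables.DeltaTable and
# `df_source` a Spark DataFrame, as in the question. Both helper names are
# hypothetical and not part of the Delta Lake API.

def build_on_clause(key_condition: str, extra_condition: str) -> str:
    """Fold the former WHEN MATCHED predicate into the join condition."""
    return f"({key_condition}) AND ({extra_condition})"


def merge_with_condition_in_on(delta_table, df_source, extra_condition: str) -> str:
    """Run the merge with the extra predicate in ON instead of WHEN MATCHED."""
    on_clause = build_on_clause("source.ID = target.ID", extra_condition)
    (
        delta_table.alias("target")
        .merge(source=df_source.alias("source"), condition=on_clause)
        .whenMatchedUpdateAll()  # no condition here; it now lives in ON
        .execute()
    )
    return on_clause
```

For the question's case this would be called as merge_with_condition_in_on(deltaTable, dfSource, "1 = 0"). One thing to watch: if the merge also has a WHEN NOT MATCHED clause, source rows that fail the extra predicate now count as not matched and would be inserted, so the rewrite is only a drop-in replacement when there is no insert branch or when the insert branch is guarded as well.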
Upvotes: 2