Reputation: 142
Is it possible to create an Slowly Changing Dimension mechanism using Delta Live Tables? I would like to implement something like this https://docs.databricks.com/_static/notebooks/merge-in-scd-type-2.html
But in the DLT docs i found "Processing updates from source tables, for example, merges and deletes, is not supported. To process updates, see the APPLY CHANGES INTO command" - but https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-cdc.html it's only for CDC, do you have any idea for SCD?
Upvotes: 2
Views: 2605
Reputation: 279
Merge is supported in DLT , please see the sample code below
mergeDF1
.as("merge_tbl1")
.merge(mergeDF2.as("merge_tbl2"), "merge_tbl1.key = merge_tbl2.key")
.whenMatched("merge_tbl1.isCurrent = true AND (merge_tbl2.country <> merge_tbl1.country OR merge_tbl2.region <> merge_tbl1.region)")
.updateExpr(Map(
"isCurrent" -> "false",
"endDate" -> "<date>",
.whenNotMatched()
.insertExpr(Map(
"personId" -> "<personId>",
"personName" -> "<personName>",
"country" -> "<country>",
"region" -> "<region>,
"isCurrent" -> "true",
"effectiveDate" -> "<date>",
"endDate" -> "null"))
.execute()
Upvotes: 0
Reputation: 87119
As you noticed right now DLT supports only SCD Type 1 (CDC). Support for SCD Type 2 is currently in the private preview, and should be available in near future - refer to the Databricks Q2 public roadmap for more details on it. If you have solutions architect or customer success engineer in your account, ask them to include you into private preview.
Update, September 2022: SCD type 2 is generally available since June 2022: https://www.databricks.com/blog/2022/06/29/delta-live-tables-announces-new-capabilities-and-performance-optimizations.html
Upvotes: 4
Reputation: 355
DLT is now supporting SCD Type 2 in public preview. You can learn about it and see examples in this document
Upvotes: 2