Reputation: 107
I have received a requirement: data is incrementally copied into a Bronze-layer live table. Once the data is in the Bronze layer, I need to apply data quality checks and load the final data into a Silver live table. I have no idea how to approach this.
Could anyone please help me write the code using PySpark in Databricks?
Upvotes: 1
Views: 984
Reputation: 87279
You need to follow the DLT Python tutorial. A minimal pipeline looks like this:

import dlt

@dlt.table
def bronze():
    # Incremental ingestion with Auto Loader (fill in your options)
    df = spark.readStream.format("cloudFiles")...load(input_path)
    return df

@dlt.table
@dlt.expect_or_drop("col1_not_null", "col1 is not null")
def silver():
    # Rows that fail the expectation are dropped before loading into silver
    return dlt.read_stream("bronze")
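To see what the expectation does, note that @dlt.expect_or_drop keeps only rows satisfying the predicate and silently drops the rest. Outside a DLT pipeline you can approximate the same drop semantics with a plain filter; this is a minimal illustration with hypothetical sample rows, not DLT itself:

```python
# Approximates DLT's expect_or_drop("col1_not_null", "col1 is not null"):
# rows satisfying the predicate are kept, failing rows are dropped.
def expect_or_drop(rows, predicate):
    """Keep only rows that satisfy the data-quality predicate."""
    return [row for row in rows if predicate(row)]

# Hypothetical bronze data for illustration
bronze_rows = [
    {"col1": "a", "col2": 1},
    {"col1": None, "col2": 2},  # fails "col1 is not null" -> dropped
    {"col1": "c", "col2": 3},
]

silver_rows = expect_or_drop(bronze_rows, lambda r: r["col1"] is not None)
print(silver_rows)
```

In a real pipeline DLT also records how many rows each expectation dropped in the pipeline event log, which is why expectations are preferable to a hand-rolled filter.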
Upvotes: 1
Reputation: 46
You can refer to the Databricks documentation, as the task seems fairly basic:
For ingestion into the Bronze layer - Auto Loader
For Bronze layer to Silver layer (applying constraints) - https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/expectations
Upvotes: 0