Reputation: 35
I have a table in my database called products and has prouductId, ProductName, BrandId and BrandName. I need to create delta tables for each brands by passing brand id as parameter and the table name should be corresponding .delta. Every time when new data is inserted into products (master table) the data in brand tables need to be truncated and reloaded into brand.delta tables. Could you please let me know if this is possible within databricks using spark or dynamic SQL?
Upvotes: 1
Views: 683
Reputation: 87259
It's easy to do, really there are few variants:
.saveAsTable
in the overwrite
mode:df = spark.read.table("products")
... transform df
brand_table_name = "brand1"
df.write.mode("overwrite").saveAsTable(brand_table_name)
spark.sql
to substitute variables in this text):CREATE OR REPLACE TABLE brand1
USING delta
AS SELECT * FROM products where .... filter condition
for list of brands you just need to use spark.sql
with loop:
for brand in brands:
spark.sql(f"""CREATE OR REPLACE TABLE {brand}
USING delta
AS SELECT * FROM products where .... filter condition""")
P.S. Really, I think that you just need to define views (doc) over the products
table, that will have corresponding condition - in this case you avoid data duplication, and don't incur computing costs for that writes.
Upvotes: 1