How to create many tables programatically?

Question

I have a table in my database called products and has prouductId, ProductName, BrandId and BrandName. I need to create delta tables for each brands by passing brand id as parameter and the table name should be corresponding .delta. Every time when new data is inserted into products (master table) the data in brand tables need to be truncated and reloaded into brand.delta tables. Could you please let me know if this is possible within databricks using spark or dynamic SQL?

Alex Ott · Accepted Answer

It's easy to do, really there are few variants:

in Spark - read data from source table, filter out, etc., and use .saveAsTable in the overwrite mode:

df = spark.read.table("products")
... transform df
brand_table_name = "brand1"
df.write.mode("overwrite").saveAsTable(brand_table_name)

in SQL by using CREATE OR REPLACE TABLE (You can use spark.sql to substitute variables in this text):

CREATE OR REPLACE TABLE brand1 
USING delta
AS SELECT * FROM products where .... filter condition

for list of brands you just need to use spark.sql with loop:

for brand in brands:
  spark.sql(f"""CREATE OR REPLACE TABLE {brand} 
USING delta
AS SELECT * FROM products where .... filter condition""")

P.S. Really, I think that you just need to define views (doc) over the products table, that will have corresponding condition - in this case you avoid data duplication, and don't incur computing costs for that writes.

How to create many tables programatically?

Answers (1)

Related Questions