Reputation: 115
I am working on a Data Warehouse (DWH) project that integrates data from three different sources, each providing four tables (12 tables in total). These data sources are unrelated except for some common fields such as Market_ID, Category_ID, Product_ID, Cost, and Sales.
Each data source has around 20 rules/transformations that need to be applied. Here are a couple of simple examples to illustrate:

- Geography_Targeted = Market_Country
- Brand Material = Product_Core_Material

In reality the rules are much more complex. Additionally, a product can be compliant with one rule and non-compliant with another on the same day. The rules are independent of each other.
Here is an abstract example of the data:
| Market_ID | Category_ID | Product_ID | Date | Geography | Market_Country | Material | Product_Core |
|---|---|---|---|---|---|---|---|
| ABC | XYZ | 12345 | 1-1-2020 | Japan | JPN | Steel;Aluminium | Steel |
| ABC | XYZ | 12345 | 2-1-2020 | Japan |  | Steel;Aluminium | Steel |
| ABC | XYZ | 12345 | 3-1-2020 | Japan | JPN | Steel;Aluminium |  |
| ABC | XYZ | 12345 | 4-1-2020 | Japan | JPN | Steel;Aluminium | Steel |
| ABC | XYZ | 12345 | 5-1-2020 | Japan | JPN |  |  |
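To make the rule logic concrete, here is a minimal pandas sketch of how two such compliance flags could be computed against this sample. The column names come from the table above; the country lookup and the semicolon-split check are simplified assumptions for illustration, not the real rules:

```python
import pandas as pd

# Hypothetical sample mirroring the abstract table above (first three days only).
df = pd.DataFrame(
    {
        "Market_ID": ["ABC", "ABC", "ABC"],
        "Category_ID": ["XYZ", "XYZ", "XYZ"],
        "Product_ID": [12345, 12345, 12345],
        "Date": ["1-1-2020", "2-1-2020", "3-1-2020"],
        "Geography": ["Japan", "Japan", "Japan"],
        "Market_Country": ["JPN", None, "JPN"],
        "Material": ["Steel;Aluminium", "Steel;Aluminium", "Steel;Aluminium"],
        "Product_Core": ["Steel", "Steel", None],
    }
)

# Rule 1 (assumed logic): the targeted geography must resolve to the market country.
geography_map = {"Japan": "JPN"}  # assumption: a lookup from Geography to country code
df["Rule_Geography_Flag"] = df["Geography"].map(geography_map).eq(df["Market_Country"])

# Rule 2 (assumed logic): the product's core material must appear in the
# semicolon-separated Material list.
df["Rule_Material_Flag"] = df.apply(
    lambda r: bool(r["Product_Core"]) and r["Product_Core"] in str(r["Material"]).split(";"),
    axis=1,
)

print(df[["Date", "Rule_Geography_Flag", "Rule_Material_Flag"]])
```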
The data can reach up to 15 million records per month.
My goal is to design an efficient semantic layer so this data can be used in Power BI without hitting refresh limits. My proposed solution is to create one table per rule with the following structure: Market_ID, Category_ID, Product_ID, Cost, Sales, Rule_Compliance_Flag.
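As a rough illustration of that proposed structure, here is a pandas sketch that builds one such table for a single rule. The Cost/Sales values, the aggregation grain (dropping Date), and the "compliant only if every day passes" logic are assumptions made for the example, not the actual requirements:

```python
import pandas as pd

# Hypothetical daily data with a compliance flag already computed for one rule.
daily = pd.DataFrame(
    {
        "Market_ID": ["ABC", "ABC", "ABC"],
        "Category_ID": ["XYZ", "XYZ", "XYZ"],
        "Product_ID": [12345, 12345, 12345],
        "Cost": [10.0, 12.0, 11.0],
        "Sales": [25.0, 30.0, 27.0],
        "Rule_Geography_Flag": [True, False, True],
    }
)

# One output table per rule (repeated for each of the ~20 rules per source).
rule_table = (
    daily.groupby(["Market_ID", "Category_ID", "Product_ID"], as_index=False)
         .agg(
             Cost=("Cost", "sum"),
             Sales=("Sales", "sum"),
             # assumption: the flag is true only if every day in the period complies
             Rule_Compliance_Flag=("Rule_Geography_Flag", "min"),
         )
)
print(rule_table)
```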
Is there a better approach to handle this scenario? Any suggestions on optimizing the semantic layer for Power BI to efficiently manage and utilize this data?
Thank you for your help!
Upvotes: 0
Views: 48