EnterPassword
EnterPassword

Reputation: 598

Does a data warehouse have one data model or many?

How do you design a data warehouse? Assuming you have raw data in S3, you copy all the data into a data warehouse, and then start modelling the data (So ELT right?). Do you have a single data model that shows all the relationships, or do you create a data model per pipeline?

Eg. If I were to model all data related to customer purchases for an e-commerce store, would that be one data model, and then separately another data model for all the suppliers and related information?

I've been scouring the internet and have not found a clear answer. I'd really appreciate the help!

So far I have tried only a single data model - but I'm not sure how that lines up with ELT allowing for data being modelled as needed?

Upvotes: 0

Views: 209

Answers (1)

Lemonina
Lemonina

Reputation: 905

Designing a data warehouse generally involves the following steps:

  1. Identify business requirements:

identifying the data that is needed to support the business processes, the data sources, and the data quality requirements

  1. Plan the architecture:

decide on the architecture of the data warehouse(star or snowflake schema) + the hardware/software required + data integration strategy

  1. Data modeling:

defining the structure of the data warehouse(dimensions, facts + relationships between them)

  1. Data integration:

extracting data from the various data sources and loading it into the data warehouse(using ETL or ELT tool)

  1. Testing and validation:

ensure that the data is accurate, consistent, and reliable.

  1. Deployment and maintenance:

deploy and maintain(monitoring the performance of the data warehouse + making any necessary changes to the data model or data integration processes)

Regarding the part about data modeling:

  • it is generally recommended to create a separate data model for each subject area or pipeline(allows for better organization of the data and easier maintenance of the data model)( exp: you could create a data model for customer purchases and another data model for supplier information)

  • it is important to ensure that the data models are consistent and can be integrated as needed

  • ELT allows for greater flexibility in data modeling, as the data can be transformed and modeled as needed in the data warehouse -- it is still important to have a clear understanding of the business requirements and a well-designed data model to ensure that the data warehouse meets the needs of the organization.

Upvotes: 0

Related Questions