Reputation: 598
How do you design a data warehouse? Assuming you have raw data in S3, you copy all the data into a data warehouse, and then start modelling the data (So ELT right?). Do you have a single data model that shows all the relationships, or do you create a data model per pipeline?
Eg. If I were to model all data related to customer purchases for an e-commerce store, would that be one data model, and then separately another data model for all the suppliers and related information?
I've been scouring the internet and have not found a clear answer. I'd really appreciate the help!
So far I have tried only a single data model - but I'm not sure how that lines up with ELT allowing for data being modelled as needed?
Upvotes: 0
Views: 209
Reputation: 905
Designing a data warehouse generally involves the following steps:
identifying the data that is needed to support the business processes, the data sources, and the data quality requirements
decide on the architecture of the data warehouse(star or snowflake schema) + the hardware/software required + data integration strategy
defining the structure of the data warehouse(dimensions, facts + relationships between them)
extracting data from the various data sources and loading it into the data warehouse(using ETL or ELT tool)
ensure that the data is accurate, consistent, and reliable.
deploy and maintain(monitoring the performance of the data warehouse + making any necessary changes to the data model or data integration processes)
Regarding the part about data modeling:
it is generally recommended to create a separate data model for each subject area or pipeline(allows for better organization of the data and easier maintenance of the data model)( exp: you could create a data model for customer purchases and another data model for supplier information)
it is important to ensure that the data models are consistent and can be integrated as needed
ELT allows for greater flexibility in data modeling, as the data can be transformed and modeled as needed in the data warehouse -- it is still important to have a clear understanding of the business requirements and a well-designed data model to ensure that the data warehouse meets the needs of the organization.
Upvotes: 0