Reputation: 161
Firstly, I am going to explain my problem by using example from real life. Let’s say that we are company and we are selling different means of transport, e.g. cars, buses, trucks, trains, planes, etc. Let's say that we have around 10.000.000 different items with daily changes.
For each item we have an unique name (e.g. car Audi A8 X or plane Boing 747-200B Y) where X and Y are unique values. Don’t worry about naming because it works just fine.
For each item we also have some special data. Data depends on type, e.g. for car: dimensions (length, width, height …), powertrain, etc. For planes we have e.g. length, interior width, wingspan, wing area, wing sweep, etc.
And now the problem … I would like to put all this data from different Excel files and paper to database.
Question 1: Which database model is better?
Idea #1: I am going to create one table, called items where I am going to store only name of product which we are selling (e.g. car Audi A8 X, plane Boing 747-200B Y, etc.). And than in other tables (car, plane, train …) I will store extra data for cars / planes / trains. So if I would like to get all data of e.g. car than I will have to check table car. If I would like to get all data of e.g. train than I will have to check table train.
Idea #2: Should I create one table where I am going to store all item’s names (just like in Idea #1, items). And than additional pivot table (e.g. data with fields: item, key, value) where I will be able to find all informations?
Question 2: I need history of all data. In first case I will have to duplicate row from e.g. table car just because one fields is different. But for Idea #2 ... for all rows in pivot table data would be necessary to have information if data is valid (or when was valid).
Can you please help me? I have no idea which model is better or what is actually using in production. Also ... is there any good book about storing historical data to database?
Thanks!
Upvotes: 0
Views: 1131
Reputation: 18940
You present two problems to us. The first is organizing specialized data about subtypes (cars, buses, trucks, etc.). The second is dealing with temporal (historical) data.
Your idea #1 resembles a design pattern known as "Class Table Inheritance". If you will do a search on this phrase, you will find many articles outlining exactly how it works. These will pretty much confirm your initial reaction, but they will add lots more helpful detail. You will also find numerous references to previous Q&A entries in this site, and in the DBA site.
For an alternate design, look up "Single Table Inheritance". This stores everything in a single fat table, with NULLS in spaces that don't pertain to the case at hand.
I am not sure what you mean by storing something in a pivot table. I'm familiar with pivot tables in Excel, but I have always used them as results calculated from ordinary tables where the data is stored.
How to deal with historical data is a separate issue.
Upvotes: 2