Reputation: 430
I have some tables that have different structures, but contains the same data (in the example it is name and surname).
Edit: Demo structure fiddle - http://sqlfiddle.com/#!2/cce568
However now I need to create some collective table which will store this information + some more details (eg datetime, company name, position etc). This collective table can contain some entity more than once according to the context of the remaining fields.
Is there some pattern how to store this collective table, please? According to Bill Carwin's post (https://stackoverflow.com/a/562030/1092627) I can join all those tables into one, but what should I do if I need to add some information directly into this table?
Thanks in advance for your opinions.
Upvotes: 2
Views: 2788
Reputation: 29629
Craig Larman's book "Applying UML with Patterns" describes the 3 common solutions to this problem.
Your examples are not particularly helpful - there's no logical reason to have 3 different ways of managing a person's name in your database (though this does regularly happen because of data import/export weirdness).
However, it's very common for there to be a "person" entity who might be an employee (with employee_id), a contact (with a link to the prospects table), or a customer (with a customer_id and link to the orders table).
In Larman's book, he gives 3 solutions.
One table to rule them all Here, you create a single table with all known columns. This creates a messy table, and pushes the responsibility for knowing the rules of persisting each subclass to the application layer - the database won't enforce the need for customers to have a customer_id. However, it makes the joins much easier - any table that needs to link to a person can just, well, link to the person table.
Superclass table This cleans things up by extracting the common attributes into a single table - e.g. "person" - and pushes the subclass-specific fields to subclass tables. So, you might have "person" as the superclass table, and "contact", "employee" and "customer" tables with the specific subclass data. The subclass tables have a "person_id" column to link back to the superclass table. This is more complex - it typically requires an additional join when retrieving data - but also far less error prone - you can't accidentally corrupt the data model with a bug that writes invalid attributes for "employee".
Table per subclass - this is what you've described. It introduces a fair amount of duplication into the data model, and you often have conditional joins - "join on table x if person type = y", which can make the data access code tricky.
Upvotes: 7