Reputation: 523
I am in the middle of building the underlaying database structure of a project I am working on, and I wanted some general advice on this from DB savvy people.
I am going to store information on people i.e.:
id | name | position | ...
----+------+------------+----
1 | Me | Programmer | ...
----+------+------------+----
... | ... | ... | ...
My key to concern is how to store the information about the position and other similar fields, since I want to them to include an element of time making the matrix 3-dimensional. I want to be able to extract data that can tell when a person occupied a special position, i.e.:
Me | 2009-2011 Programmer | 2011-2013 Project Coordinator
To achieve this goal, what would be most appropriate - to have one field that holds all this data, or to have many fields that hold little data each?
1) To make use of xml as the content of the field which I can extract, manipulate and put back in, such as:
<person name="Me">
<position from="2009" to="2011">Programmer</position>
<position from="2011" to="2013">Project Coordinator</position>
</person>
2) To make another table similiar to this one:
id | person_id | position | date_from | date_to
----+-----------+------------+-----------+--------
1 | 1 | Programmer | 2009 | 2011
----+-----------+------------+-----------+--------
2 | 1 | Proj. Cord.| 2011 | 2013
EDIT:
id | person_id | type (/type_id) | content | date_from | date_to
----+-----------+---------------------+----------+-----------+--------
1 | 1 | Position (/1) | Program. | 2009 | 2011
----+-----------+---------------------+----------+-----------+--------
2 | 1 | Position (/1) | Proj.Cor.| 2011 | 2013
----+-----------+---------------------+----------+-----------+--------
3 | 1 | Organization (/2) | Webbureau| 2011 | 2013
EDIT END
And then just extract every field belonging to this person?
How is the trade-off between extracting lots of data from one field vs. little data from many fields (in theory though, 'cause the fields I will create will at maximum be around 100-150 lines of xml, I presuppose). Or is there even a better way to create a 3 dimensional database?
Thanks in advance
Upvotes: 0
Views: 94
Reputation: 11138
Your proposition n#1 will result in headaches querying the data. Imagine something like "who was working here, and at which position, on the 4th of July 1976"?
Your proposition n#2 is what I call a "Chronology table". It's very efficient, and you can use it for all kinds of "chronological" events, such as:
In all these cases (and others) , you can use the "chronology" strategy. It means also that, on the development side, you can easily define some "chronology" object, methods and properties to manipulate this kind of data.
Your edit is a proposal to do what we could call a generalization, when deciding to put similar entities in the same table. Why not? Just be careful not to generalize 'that much'. You could stick to, for example, a 'personActivity' chronological table, where you would records all professional activities linked to a person, from job position to trainings. I would then favor an additional Tbl_Activity table, listing all activities that can be linked to a person.
When it comes to performance, you can do a lot by correctly indexing your table columns. Indexes on your foreign keys (person, activity) will make things very fast when requesting on these fields. Your problem could potentially be linked to dates querying. You could think of adding some 'denormalized' boolean fields like 'activityIsCurrent': the 'current' notion is de facto linked to the inexistence of a value for the date_to field, so we have de facto some redundancy in the data. But having a boolean holding this info makes writing and executing queries a lot lot simpler...
We have here more than 100 000 chronological records on our staff data, and it works really ok following these advices.
Upvotes: 0
Reputation: 7284
Following the Normal Forms will be the answer:
what would be most appropriate - to have one field that holds all this data, or to have many fields that hold little data each?
A relation is in first normal form if the domain of each attribute contains only atomic values.
One of the sub rules of 1NF is ensuring that there are no repeating groups of data.
Current position of the people will be stored in People
table.
When the current position is changing, first get the old position (if there is any) form PositionHistory
and update the end-date to sysDate, then insert a new record to PositionHistory
with new position and start-date of sysDate and end date with value of null.
Upvotes: 1