Frederik Brinck Jensen
Frederik Brinck Jensen

Reputation: 523

Database Structure with a 3rd Dimension of Time

I am in the middle of building the underlaying database structure of a project I am working on, and I wanted some general advice on this from DB savvy people.

I am going to store information on people i.e.:

    id  | name | position   | ...
    ----+------+------------+----
    1   | Me   | Programmer | ...
    ----+------+------------+----
    ... | ...  | ...        | ...

My key to concern is how to store the information about the position and other similar fields, since I want to them to include an element of time making the matrix 3-dimensional. I want to be able to extract data that can tell when a person occupied a special position, i.e.:

    Me | 2009-2011 Programmer | 2011-2013 Project Coordinator

To achieve this goal, what would be most appropriate - to have one field that holds all this data, or to have many fields that hold little data each?

1) To make use of xml as the content of the field which I can extract, manipulate and put back in, such as:

    <person name="Me">
      <position from="2009" to="2011">Programmer</position>
      <position from="2011" to="2013">Project Coordinator</position>
    </person>

2) To make another table similiar to this one:

    id  | person_id | position   | date_from | date_to
    ----+-----------+------------+-----------+--------
    1   | 1         | Programmer | 2009      | 2011
    ----+-----------+------------+-----------+--------
    2   | 1         | Proj. Cord.| 2011      | 2013

EDIT:

    id  | person_id | type (/type_id)     | content  | date_from | date_to
    ----+-----------+---------------------+----------+-----------+--------
    1   | 1         | Position (/1)       | Program. | 2009      | 2011
    ----+-----------+---------------------+----------+-----------+--------
    2   | 1         | Position (/1)       | Proj.Cor.| 2011      | 2013
    ----+-----------+---------------------+----------+-----------+--------
    3   | 1         | Organization (/2)   | Webbureau| 2011      | 2013

EDIT END

And then just extract every field belonging to this person?

How is the trade-off between extracting lots of data from one field vs. little data from many fields (in theory though, 'cause the fields I will create will at maximum be around 100-150 lines of xml, I presuppose). Or is there even a better way to create a 3 dimensional database?

Thanks in advance

Upvotes: 0

Views: 94

Answers (2)

Philippe Grondier
Philippe Grondier

Reputation: 11138

Your proposition n#1 will result in headaches querying the data. Imagine something like "who was working here, and at which position, on the 4th of July 1976"?

Your proposition n#2 is what I call a "Chronology table". It's very efficient, and you can use it for all kinds of "chronological" events, such as:

  • procedure ISO whatever, version 1.034, was valid from 02.02.1998 to 01.01.2003.
  • This person's passport was issued on 1-Nov-2009 and is valid up to 30-Oct-2019.
  • This room is reserved for a meeting on the 15th of March, from 10 to 12
  • Etc

In all these cases (and others) , you can use the "chronology" strategy. It means also that, on the development side, you can easily define some "chronology" object, methods and properties to manipulate this kind of data.

Your edit is a proposal to do what we could call a generalization, when deciding to put similar entities in the same table. Why not? Just be careful not to generalize 'that much'. You could stick to, for example, a 'personActivity' chronological table, where you would records all professional activities linked to a person, from job position to trainings. I would then favor an additional Tbl_Activity table, listing all activities that can be linked to a person.

When it comes to performance, you can do a lot by correctly indexing your table columns. Indexes on your foreign keys (person, activity) will make things very fast when requesting on these fields. Your problem could potentially be linked to dates querying. You could think of adding some 'denormalized' boolean fields like 'activityIsCurrent': the 'current' notion is de facto linked to the inexistence of a value for the date_to field, so we have de facto some redundancy in the data. But having a boolean holding this info makes writing and executing queries a lot lot simpler...

We have here more than 100 000 chronological records on our staff data, and it works really ok following these advices.

Upvotes: 0

Mohsen Heydari
Mohsen Heydari

Reputation: 7284

Following the Normal Forms will be the answer:

what would be most appropriate - to have one field that holds all this data, or to have many fields that hold little data each?

A relation is in first normal form if the domain of each attribute contains only atomic values.
One of the sub rules of 1NF is ensuring that there are no repeating groups of data.

enter image description here
Current position of the people will be stored in People table.
When the current position is changing, first get the old position (if there is any) form PositionHistory and update the end-date to sysDate, then insert a new record to PositionHistory with new position and start-date of sysDate and end date with value of null.

Upvotes: 1

Related Questions