Reputation: 862

Table design for hierarchical data

i am trying to design a table which contains sections and each section contains tasks and each task contains sub tasks and so on. I would like to do it under one table. Please let me know the best single table approach which is scalable. I am pretty new to database design. Also please suggest if single table is not the best approach then what could be the best approach to do this. I am using db2.

Upvotes: -2

Answers (4)

Apokralipsa

Reputation: 2724

If I understand correctly the original poster does not know, how many levels of hierarchy will be needed (hence "and so on"). His problem is to create a design that can hold a structure of any depth.

Imho that is a complex issue that does not have a single answer. When implementing such a design you need to count such factors as:

Will the structure be fairly constant? (How many writes?) How often will this structure be read? What operations will need to be possible? (Get all children objects of a given object? Get the parent object? Get the direct children?)

If the structure will be constant You could use the nested set model (http://en.wikipedia.org/wiki/Nested_set_model)

In this way the table has a 'left' and 'right' column. The parent object has its left and right column encompasing the values of any of its children object.

In that way you can list all the children of an object using a query like this:

SELECT child.id
FROM table AS parent
JOIN table AS child
    ON child.left BETWEEN parent.left AND parent.right
    AND child.right BETWEEN parent.left AND parent.right
WHERE
    parent.id = @searchId

This design can be VERY fast to read, but is also EXTREMELY costly when the structure changes (for example when adding a child to any object You will have to update any object with a 'right' value that is higher than the inserted one).

If you need to be able to make changes to structure in real time you should probably use a design with two tables - one holding the objects, the second the structure (something like parentId, childId, differenceInHierarchyLevels).

Upvotes: 0

WarrenT

Reputation: 4532

Put quite simply, I would say use 1 table for tasks.

In addition to all its various other attributes, each task should have a primary identifier, and another column to optionally contain the identifier of its parent task.

If you are using DB2 for z/OS, then you will use a recursive query with a common table expression. Otherwise you you can use a hierarchical recursive query in DB2 for i, or possibly in DB2 for LUW (Linux, Unix, Windows).

Other designs requiring more tables, each specializing in a certain part of the task:subtask relationship, may needlessly introduce issues or limitations.

Upvotes: 1

Edper

Reputation: 9322

If you are going to put everything into one table although convenient will be inefficient in the long run. This would mean you will be storing unnecessary repeated groups of data in your database which would not be processor and memory friendly at all. It would in fact violate the Normalization rules and to be more specific the 1st Normal Form which says that there should be no repeating groups that could be found in your table. And it would actually also violate the 3rd Normal Form which means there will be no (transitional) dependency of a non-primary key to another non-primary key.

To give you an illustration, I will put your design into one table. Although I will be guessing on the possible fields but just bear with it because this is for the sake of discussion. Look at the graphics below:

enter image description here

If you look the graphics above (although this is rather small you could download the image and see it closer for yourself), the SectionName, Taskname, TaskInitiator, TaskStartDate and TaskEndDate are unnecessary repeated which as I mentioned earlier a violation of the 1st Normal Form.

Secondly, Taskname, TaskInitiator, TaskStartDate and TaskEndDate are functionally dependent on TaskID which is not a primary key instead of SectionID which in this case should be the primary key (if on a separate table). This is violation of 3rd Normal Form which says that there should be no Transitional Dependence or non-primary key should be dependent on another non-primary key.

Although there are instances that you have to de-normalized but I believe this one should be normalized. In my own estimation there should be three tables involved in your design, namely, Sections,Tasks and SubTasks that would like the one below.

enter image description here

Section is related to Tasks, that is, a section could have many Tasks. And Task is related to Sub-Tasks, that is, a Task could have many Sub-tasks.

Upvotes: 0

TGH

Reputation: 39248

There are a few ways to do this.

One idea is to use two tables: Sections and Tasks

There could be a one to many relationship between the two. The Task table could be designed as a tree with a TaskId and a ParentTaksId which means you can have Tasks that go n-levels deep (sub tasks of sub tasks og sub tasks etc). Every Task except for the root task will have a parent.

I guess you can also solve this by using a single table where you just add a section column to the Task table I described above.

Upvotes: 0

Table design for hierarchical data

Answers (4)

Related Questions