Price Jones
Price Jones

Reputation: 2008

Database Design: Should I create one table or two for this scenario?

The scenario is Time Cards. Employees clock in and clock out on a TimeCardHeader table, but enter Details in a TimeCardDetail table. However, they can enter at least two different kinds of details... and this is my question. Do I create two tables representing each kind, or one table with a Boolean flag that interprets the meaning of the table?

Here are the fields (this example is small, others have many fields):

    Id (PK)
    Version
    StartTime
    EndTime
    LaborDetailDescription
    LaborType: Can be direct or indirect.

If LaborType is Indirect the remaining fields are these:

    IndirectNumber (FK)

If LaborType is Direct the remaining fields are these:

    JobNumber (FK)
    JobType
    DirectType: Can be Production or Setup

If DirectType is Production the remaining fields are these:

    GoodQty
    ScrapQty

If DirectType is Setup the remaining fields are these:

    SetupPercent

So... Do I create one table with all of those fields, but when a type is set some fields are blank (which means code, reporting, queries, etc, will need to be interpreting the database), or do I create two tables DirectLaborDetail and IndirectLaborDetail and store the data neatly into the appropriate table? In this case, even DirectLabor is broken into DirectLaborSetup and DirectLaborProduction.

I am asking this question along a number of dimensions:

  1. Theoretical purity according to database design principles.
  2. Performance Issues.
  3. Difficulty in Query creation (this would also include coding against it).
  4. Any other consideration I may not have listed here.

EDIT: More detail added...

Option 1

/*I intentionally left out the type information*/
CREATE TABLE TimeCardDetail
(
    Id,
    Version,
    TimeCardHeaderId, /*Not depicted here, FK*/
    StartTime,
    EndTime,
    LaborDetailDescription,
    LaborType, /*FK*/
    IndirectId, /*FK*/
    JobId, /*FK*/
    DirectType, /*FK*/
    GoodQty,
    ScrapQty,
    SetupPercent
);

Option 2

CREATE TABLE TimeCardDetail
(
    Id,
    Version,
    StartTime,
    EndTime,
    LaborDetailDescription
);

CREATE TABLE DirectLaborDetail
(
    Id,
    Version,
    TimeCardHeaderId, /*Not depicted here, FK*/
    JobId, /*FK*/
    DirectType, /*FK*/
    GoodQty,
    ScrapQty,
    SetupPercent,
    TimeCardDetailId /*FK*/
);

CREATE TABLE IndirectLaborDetail
(
    Id,
    Version,
    TimeCardHeaderId, /*Not depicted here, FK*/
    IndirectId, /*FK*/
    TimeCardDetailId, /*FK*/
);

I prefer this as a human being because I can see clearly the business meaning of the data, and yet at the same time, everything is cleanly in its place, no interpretation required. Queries become a bit more interesting because if I want to see all the detail for a specific TimeCardHeader, I need to look at two tables. But is that really a problem with today's computing power?

Option 3

Like Option 2 except we reverse the relationship...

CREATE TABLE TimeCardDetail
(
    Id,
    Version,
    TimeCardHeaderId, /*Not depicted here, FK*/
    StartTime,
    EndTime,
    Description,
    LaborType, /*FK*/
    FKId, /*would link to the DirectLabordetail or IndirectLaborDetail depending on LaborType*/
);

I don't this option because FKId has meaning depending on LaborType.

Upvotes: 0

Views: 274

Answers (1)

mucio
mucio

Reputation: 7119

I would go for a single table with all the columns and then some of them will be loaded with values or left empty if not required. This solution will make your life easier.

Only if you think that you will query details with different LaborTypes always separately then the two tables solution will be a good choice, but even in that case you have to decide if the gain in performance (two smaller tables are easier to handle for the db) is worth the lost in terms of developments (insert in two tables, query for two tables, etc.)

About your point:

  1. Theoretical purity. Not sure if such thing exists, but both approaches are theoretically valid. The practice will tell you which is the best for your case.

  2. Performance. Two tables will be smaller, faster to query, but the you have to maintain more code. Until you don't have millions/billions of rows I won't worry too much about performance. A single table can give you performance issues, but indexes, partitions, caches will help you anyway.

  3. Difficulty of query creation. My suggestion is a table like this:

    Id (PK)
    Version
    StartTime
    EndTime
    LaborDetailDescription
    LaborType (FK) 
    IndirectNumber (FK)
    JobNumber (FK)
    JobType
    DirectType (FK) 
    GoodQty
    ScrapQty
    SetupPercent
    

    With FKs also for LaborType and DirectType to two small lookup table, so you can store only LaborType_id and DirectType_id in your table. Also for the missing foreign keys, because you don't have IndirectNumber for Indirect LaboryType, just create a dummy record to maintain referential integrity. I think that maintain a similar table should be pretty simple, you will just need a couple of joins for the FKs.

  4. Maybe, but I think for now it's enough to start

Upvotes: 1

Related Questions