user17753
user17753

Reputation: 3161

null values in optional columns

Suppose I have a Table1, with three Columns: ID (Primary Key, Identity), A, and B

Now, suppose I have 3 methods, assuming they share nothing in common column-wise:

Method 1: C, D, E

Method 2: F, G, H, I

Method 3: J

I could make one table: ID, A, B, C, D, E, F, G, H, I, J, M

Where M, is the name of the Method (or a Method ID).

However, if 90% of the time Method 3 is used, there would be many null values.
Is this a problem? If so, is there a better way to set this up?

If I make each method its own table entity, how do I ensure each ID has exactly one method matching for it?

If I keep it as one table, how do I ensure ONLY C, D, E are filled in and F thru J are NULL if M is 1?


OK, seems some people had difficultly thinking abstractly, so I'll create a random concrete example applying the above:

Suppose I had records of people performing exercises.
Each record would always have an ID (to uniquely identify the event), a TIME_STARTED, and TIME_ENDED.

However, depending on which exercise they did, there would be different attributes needed. Suppose there were only three exercises:

Elliptical: INCLINE, LEVEL, SPEED

Crunches: User_Weight, Reps, Delay, Extra_Weight

Dead Lift: Weight_Lifted

For each ID there could be only one "method". Applying this, see above questions.

Upvotes: 2

Views: 1037

Answers (2)

Chains
Chains

Reputation: 13157

A couple of ways...

First of all, Jason's answer hits one of them -- breaking your design out into multiple tables -- and that can be a pretty good approach for most things. (a.k.a. a more 'normalized' approach)

Alternatives...

1) You could have a string of name-value pairs in the last field. For example:

ID   Method   DateTimeStart   DateTimeEnd   ValueString
1    1        2012-01-01...   2012-01-01... C:value,D:value,E:value
2    2        2012-01-01...   2012-01-01... F:value,G:value,H:value,I:value

So this might be handy in cases where you can't plan for value types ahead of time. For example, maybe you need to be able to decide on the fly to start recording 'W' values for method 1, and it's not feasible to do a structural mod like you'd have to do in a more normalized design.

This is the common approach our E-sign Form developers use. You can imagine a web form, such as "department leave request" that someone would fill out and route electronically for approval. And then along comes a client who wants them to build a "purchase order" web form, that will have different fields & values that have to be recorded. Rather than create a brand-new table for this (or add columns to existing tables, etc.), they use the name-value-pair model to store all forms data, no matter which form it is, in the same table.

2) You can use table triggers to enforce integrity, if you don't trust the person/process running inserts on your single big table. For example, a before update trigger could check the method number, and then modify the data the person or process was attempting to insert by nullifying any inappropriate data values.

FWIW

Upvotes: 1

Jason
Jason

Reputation: 1325

It sounds like you have a supertype/subtype situation here. In this case, Table1 holds your supertype, and you would want to create a different table to hold each of your subtypes. The PK on these subtype tables would also be an FK to the supertype table. So you would have something like this:

Supertype_table
|    ID(PK)   |  A  |  B  |

Subtype1_table
|  ID(PK&FK)  |  C  |  D  |  E  |

Subtype2_table
|  ID(PK&FK)  |  F  |  G  |  H  |  I  |

Subtype3_table
|  ID(PK&FK)  |  J  |

The point of this schema is to make sure that you don't have a bunch of rows which are mostly nulls. For each method, you would have to write a separate query that would insert/update the appropriate table. With SQL Server, you can make a view which combines all these tables and abstracts away any joins:

CREATE VIEW MyView
SELECT Super.ID, Super.A, Super.B, 
Sub1.C, Sub1.D, Sub1.E, 
Sub2.F, Sub2.G, Sub2.H, Sub2.I, 
Sub3.J
FROM Supertype_table as Super
LEFT OUTER JOIN Subtype1_table as Sub1 on Super.ID = Sub1.ID
LEFT OUTER JOIN Subtype2_table as Sub2 on Super.ID = Sub2.ID
LEFT OUTER JOIN Subtype3_table as Sub3 on Super.ID = Sub3.ID

So then you could just write something like:

SELECT ID, A, B, J
FROM MyView
WHERE J is not null

EDIT : For OP's comment

In order to ensure that each ID is in one and only one table, you need some kind of identifier on the supertype table. So if you had a column called TypeID, you would create the function:

CREATE FUNCTION [dbo].[SubtypeofSuperID](@ID int)
RETURNS int
AS
BEGIN
  DECLARE @TypeID int;

    SELECT @TypeID = TypeID 
    FROM Supertype_table
    WHERE ID = @ID

  RETURN @TypeID 
END

Then, using that, you can create a check on each of the subtype tables:

ALTER TABLE Subtype1_table CONSTRAINT [CK_CorrectSubtype1]
CHECK ( [dbo].[SubtypeofSuperID]([ID]) = 1 )
ALTER TABLE Subtype2_table CONSTRAINT [CK_CorrectSubtype2]
CHECK ( [dbo].[SubtypeofSuperID]([ID]) = 2 )
ALTER TABLE Subtype3_table CONSTRAINT [CK_CorrectSubtype3]
CHECK ( [dbo].[SubtypeofSuperID]([ID]) = 3 )

Upvotes: 3

Related Questions