Reputation: 2037
Suppose we have a survey where some of the questions are asked across multiple entities.
For example:
Car Brands = [Brand 1, Brand 2, Brand 3, Brand 4...]
This questions will be asked for each one of the car brands (looped).
Question Q01 = (Scale 1-10) Do you think [Car Brand] cars are reliable?
Question Q02 = (Scale 1-10) Do you think [Car Brand] cars are a good value?
...
I'm designing a schema that will power some web based analytic tools, so query performance is important.
The schema will be 3 tables: Records, Questions, Answers
I have two approaches for the answers table:
A) Table: Answers
QuestionId | AnswerValue | BrandOption
Q01 | 7 | 1
Q01 | 5 | 2
Q01 | 4 | 3
Q01 | 8 | 4
B) Table: Answers
QuestionId | AnswerValue
Q01-1 | 7
Q01-2 | 5
Q01-3 | 4
Q01-4 | 8
The queries can be either for one brand at a time or for all the brands, with equal priority for both queries.
Option A seems to give me some advantages if I ever need to do something like a group by, however if most of the queries are for a specific brand, then Option B seems to be more efficient.
Thoughts?
Upvotes: 0
Views: 333
Reputation: 82474
Option A is better, even if you don't see it right now.
Storing multiple values in a single database "cell" is a mistake any way you look at it (though unfortunately, a very common mistake) - not to mention it's a violation of the first normal form - which specifically states that each column can only contain a single atomic value in each row (though the original rule is using a different terminology).
The disadvantages are numerous and some of them are critical, including (but not limited to):
The list goes on and on - but I think anyone should get the picture by now - a database column should be used to store a single value for each row - every time.
Upvotes: 2
Reputation: 1269593
The first version is preferable in my opinion. It makes it easier to look for answers to different questions for a single brand and to the same question across brands.
Munging the question id seems like a poor substitute. For one thing, it precludes simple foreign key relationships to a questions
table and to a brands
table. I'm a big fan of explicit foreign key relationships.
Of course, to make this work, you will need a method to store "no brand" or "brand no relevant". One method is to use NULL
for such answers.
Upvotes: 0