SamHuckaby
SamHuckaby

Reputation: 1162

The optimal way to store multiple-selection survey answers in a database

I'm currently working on a survey creation/administration web application with PHP/MySQL. I have gone through several revisions of the database tables, and I once again find that I may need to rethink the storage of a certain type of answer.

Right now, I have a table that looks like this:

survey_answers

id          PK
eid
sesid
intvalue    Nullable
charvalue   Nullable

id = unique value assigned to each row

eid = Survey question that this answer is in reply to

sesid = The survey 'session' (information about the time and date of a survey take) id

intvalue = The value of the answer if it is a numerical value

charvalue = the value of the answer if it is a textual representation

This allowed me to continue using MySQL's mathematical functions to speed up processing.

I have however found a new challenge: storing questions that have multiple responses. An example would be:

Which of the following do you enjoy eating? (choose all the apply)

Now, when I want to store the result, I'm not sure of the best way to handle it. Currently, I have a table just for multiple choice options that looks like this:

survey_element_options

id        PK
eid
value

id = unique value associated with each row

eid = question/element that this option is associated with

value = textual value of that option

With this setup, I then store my returned multiple selection answers in 'survey_answers' as strings of comma separated id's of the element_options rows that were selected in the survey. (ie something like "4,6,7,9") I'm wondering if that is indeed the best solution, or if it would be more practical to create a new table that would hold each answer chosen, and then reference back to a given answer row which in turn references back to the element and ultimately the survey.


EDIT

for anyone interested, here is the approach I ended up taking (In PhpMyAdmin Relations View):

Database Relationships and Table Structure

And a rudimentary query to gather the counts for a multiple select question would look like this:

SELECT e.question AS question, eo.value AS value, COUNT(eo.value) AS count
FROM survey_elements e, survey_element_options eo, survey_answer_options ao
WHERE e.id = 19
AND eo.eid = e.id
AND ao.oid = eo.id
GROUP BY eo.value

Upvotes: 10

Views: 18462

Answers (4)

Matthew
Matthew

Reputation: 9949

This really depends on a lot of things.

  1. Generally, storing lists of comma separated values in a database is bad, especially if you plan to do anything remotely intelligent with that data. Especially if you want to do any kind of advanced reporting on the answers.
  2. The best relational way to store this is to also define the answers in a second table and then link them to the users response to a question in a third table (with multiple entries per user-question, or possibly user-survey-question if the user could take multiple surveys with the same question on it.

This can get slightly complex as a a possible scenario as a simple example:

Example tables:

  • Users (Username, UserID)
  • Questions (qID, QuestionsText)
  • Answers (AnswerText [in this case example could be reusable, but this does cause an extra layer of complexity as well], aID)
  • Question_Answers ([Available answers for this question, multiple entries per question] qaID, qID, aID),
  • UserQuestionAnswers (qaID, uID)

Note: Meant as an example, not a recommendation

Upvotes: 8

M Lamb
M Lamb

Reputation: 199

It's a horses for courses thing really.

You can store as a comma separated string (But then what happens when you have a literal comma in one of your answers).

You can store as a one-to-many table, such as:

survey_element_answers

id                PK
survey_answers_id FK
intvalue          Nullable
charvalue         Nullable

And then loop over that table. If you picked one answer, it would create one row in this table. If you pick two answers, it will create two rows in this table, etc. Then you would remove the intvalue and charvalue from the survey_answers table.

Another choice, since you're already storing the element options in their own table, is to create a many-to-many table, such as:

survey_element_answers

id                         PK
survey_answers_id          FK
survey_element_options_id  FK

Again, one row per option selected.

Another option yet again is to store a bitmask value. This will remove the need for a many-to-many table.

survey_element_options

id              PK
eid             FK
value           Text
optionnumber    unique for each eid
optionbitmask   2 ^ optionnumber

optionnumber should be unique for each eid, and increment starting with one. There will impose a limit of 63 options if you are using bigint, or 31 options if you are using int.

And then in your survey_answers

id            PK
eid
sesid
answerbitmask bigint

Answerbitmask is calculated by adding all of the optionbitmask's together, for each option the user selected. For example, if 7 were stored in Answerbitmask, then that means that the user selected the first three options.

Joins can be done by:

WHERE survey_answers.answerbitmask & survey_element_options.optionbitmask > 0

So yeah, there's a few options to consider.

Upvotes: 1

Anri
Anri

Reputation: 6265

Convert primary key to not unique index and add answers for the same question under the same id.

For example.

id  | eid | sesid | intval | charval
 3     45      30        2  
 3     45      30        4

You can still add another column for regular unique PK if needed.

Keep things simple. No need for relation here.

Upvotes: 3

Vincent
Vincent

Reputation: 51

If you don't use the id as a foreign key in another query, or if you can query results using the sesid, try a many to one relationship. Otherwise I'd store multiple choice answers as a serialized array, such as JSON or through php's serialize() function.

Upvotes: -1

Related Questions