Reputation: 2777
DETAILS
I have a quiz (let’s call it quiz1). Quiz1 uses the same wordlist each time it is generated. If the user needs to, they can skip words to complete the quiz. I’d like to store those skipped words in mysql and then later perform statistics on them.
At first I was going to store the missed words in one column as a string. Each word would be separated by a comma.
|testid | missedwords | score | userid |
*************************************************************************
| quiz1 | wordlist,missed,skipped,words | 59 | 1 |
| quiz2 | different,quiz,list | 65 | 1 |
The problem with this approach is that I want to show statistics at the end of each quiz about which words were most frequently missed by users who took quiz1.
I’m assuming that storing missed words in one column as above is inefficient for this purpose as I'd need to extract the information and then tally it -(probably tally using php- unless I stored that tallied data in a separate table).
I then thought perhaps I need to create a separate table for the missed words The advantage of the below table is that it should be easy to tally the words from the table below.
|Instance| missed word |
*****************************
| 1 | wordlist |
| 1 | missed |
| 1 | skipped |
Another approach I could create a table with tallys and update it each time quiz1 was taken.
Testid | wordlist| missed| skipped| otherword|
**************************************************
Quiz1 | 1 | 1| 1| 0 |
The problem with this approach is that I would need a different table for each quiz, because each quiz will use different words. Also information is lost because only the tally is kept not the related data such which user missed which words.
Question
Which approach would you use? Why? Alternative approaches to this task are welcome. If you see any flaws in my logic please feel free to point them out.
EDIT Users will be able to retake the quiz as many times as they like. Their information will not be updated, instead a new instance would be created for each quiz they retook.
Upvotes: 7
Views: 411
Reputation: 40685
The best way to do this is to have the word collection completely normalized. This way, analyses will be easy and fast.
quiz_words with wordID, word
quiz_skipped_words with quizID, userID, wordID
To get all the skipped words of a user:
SELECT wordID, word
FROM quiz_words
JOIN quiz_skipped_words USING (wordID)
WHERE userID = ?;
You could add a group by clause to have group counts of the same word.
To get the count of a specific word:
SELECT COUNT(*)
FROM quiz_words
WHERE word LIKE '?';
Upvotes: 3
Reputation: 4736
Keeping as much information as possible (and being able to compile user-specific stats later as well as overall stats now) I would create a table structure similar to:
Stats
quizId | userId | type| wordId|
******************************************
1 | 1 | missed| 4|
1 | 1 | skipped| 7|
Where type
can either be an int
defining the different types of actions, or a string representation - depending on if you believe it can ever be more. ^^
Then:
Quizzes
quizId | quizName|
********************
1| Quiz 1|
With the word list made for each quiz like:
WordList (pk: wordId)
quizId | wordId| word|
***************************
1 | 1 | Cat|
1 | 2 | Dog|
You would have your user
table however you want, we are just linking the id
from it in to this system.
With this, all id
fields will be non-unique keys in the stats
table. When a user skips or misses a word, you would add the id
of that word to the stats
table along with relevant quizId
and type
. Getting stats this way would make it easy as a per-user
basis, a per-word
basis, or a per-type
basis - or a combination of the three. It will also make the word list for each quiz easily available as well for making the quizzes. ^^
Hope this helps!
Upvotes: 1
Reputation: 8090
The best solution (from my pov) for what are you trying to achieve is the normalized aproach:
test
table which has test_id
column and other columnsmissed_words
table which has id
(AI PK) and word
(UQ) , here you can also have a hits
column that should be incremented each time that a association to this word is made in test_missed_words
table this way you have the stats that you want already compiled and you don't need them to be calculated from a select querytest_missed_words
which is a link table that has test_id
and missed_word_id
(composite PK)This way you do not have redundant data (missed words) and you can extract easily that stats that you want
Upvotes: 1
Reputation: 182
why not have a quiz table and quiz_words table, the quiz_words table would store id,quizID,word as columns. Then for each quiz instance create records in the quiz_words table for each word the user did use.
You could then run mysql counts on the quiz_words table based on quizID and or quiz type
Upvotes: 1
Reputation: 12168
According to database normalization theory, second approach is better, because ideally one relational table cell should store only one value, which is atomic and unsplitable. Each word is an entity instance.
Also, I might suggest to not create Quiz-Word
tables, but reserve another column in Missed-Word
table for quiz, for which this word was specified, then use this column as a foreign key for Quiz
table. Then you probably may avoid real time table generation (which is a "bad practice" in database design).
Upvotes: 1