Vinayak Garg
Vinayak Garg

Reputation: 6616

Should derived attributes be duplicated as column or computed each time?

This is a problem of space vs time, I guess.

Consider a online quizzing application. In the browse-quizzes, with every quiz, I wish to show how many questions it has.

In the Quiz table, I don't have a column for storing the number of questions. To find that number of questions, I can do

SELECT COUNT(question_id) 
FROM Quiz 
NATURAL JOIN Quiz_data 
WHERE quiz_id = '$quiz_id'

Note that Quiz_data is a cross table of Quiz and Question_bank (i.e. it stores PKs of both the 2 table).

Note that if I store the *number_of_question* in Quiz, questions may be added/subtracted from Quiz. So apart from delete and insert, one needs to update as well.

Question: should I compute each time, or store the value in the Quiz table? The post title says the same question in general sense.

Upvotes: 2

Views: 945

Answers (2)

marc_s
marc_s

Reputation: 754578

You need to know

(1) how often your data changes, and
(2) how much effort it takes to determine that number.

As a general recommendation, I would look at:

  • If the data changes only very rarely -> you should calculate and then store that value.

  • If the cost of determining that number is very high -> try to calculate once, store and reuse.

  • On the other hand, if having an accurate count is crucial -> determine it as needed.

This is not a clear-cut yes-or-no decision - it really depends on your data, your requirements, and how costly it is to calculate such a number/count.

Upvotes: 2

Under normal circumstances, you should count the number of rows at run time. Counting at run time will always give you the right number of rows. But in some cases, counting at run time takes too long. (It doesn't look like it should ever take too long in your case, though.)

In cases where counting at run time takes too long, it makes sense to store the count somewhere. Personally, I don't even like to do that to improve performance, but if I can program the dbms to guarantee the count is always right, I don't mind it too much.

If the dbms can't be programmed to guarantee integrity, and the count is critical, I will usually run a cron job to either update all the counts or to look for counts that don't match the rows.

Upvotes: 2

Related Questions