Reputation: 24305
I'm wondering about the pros and cons of storing summed values in a field vs. performing a sum()
operation on a separate table. An example would be retrieving a list of items along with the number of purchases for each item. In this case, one could have an Items
table and Purchases
table whose purchases would need to be summed.
The pros of a summed value:
JOIN
s necessary, e.g., Items.item_id=Purchases.item_id
)The pros of sum()
:
UPDATE
and INSERT
statements to change them (e.g., an INSERT
of the purchase in the Purchases
table and an UPDATE
of the purchase_count
field of Items
. Upvotes: -1
Views: 1499
Reputation: 1269953
THere is no right answer to such a question. The answer depends entirely on how the data is being used.
Two extremes. If you are rarely querying the data and there are lots of changes to historical data, then your system could spend a lot of effort keeping the sums consistent, when they are not needed.
On the other hand, if the data is "insert-only", and every record is repeatedly retrieved with the sum, then having the sums saves effort.
In general, the tendency is to favor the more normalized approach -- that is, doing the sum()
at query time rather than maintaining it as a business rule. One clear advantage is that the values are consistent over time.
A second advantage is that the analytics can vary. Maybe today you need count of items. Maybe next week, you need the count of units. And the count of dollars. Or the net price difference. These change over time. Trying to capture them all up-front is challenging. The solution here is the break the problem into two pieces. Store the normalized data (with no magical sums) in one "database". Store summaries in "data marts" that can be modified to meet end-user goals. The data marts would be updated periodically -- once per day or once per week.
Normalized base data with period summaries is a strong architecture that meets many needs. But, there may be some cases where it is really a better idea to do the calculations on insert
/update
/delete
rather than on select
. Because such a solution often involves much more complexity (multiple triggers for instane), it has to justify itself strongly over a more normalized solution.
Wikipedia: Database Normalization
Upvotes: 2
Reputation: 2733
It depends on the application nature and usage.
You should make a calculation or review the usage(for an existing application) about which of the queries are more often. DML types of queries:
SELECT
queries for browsing items and displaying information regarding the items - for example regular e-shops.UPDATE/INSERT
queries - for example heavy store application that mainly consist of managing the store inventory.Upvotes: 1