Reputation: 7493
When I started designing my application database schema few months ago I have been told not to store the same data/calculated data in more than one place in the database(normalization). If I do, I will make a scope of bugs when I update the data in one place and left the other without updating. So I did an orders table and ordersDetails table. Something like this..
-- orders table
+-----+---------+----------+
| ID | clintID | date |
+-----+---------+----------+
| 1 | 1 |2018-02-22|
| 2 | 1 |2018-02-23|
| 3 | 2 |2018-02-24|
+-----+---------+----------+
-- orderDetail table
+-----+---------+------------+----------+----------+
| ID | orderID | itemNumber | quantity | unitPrice|
+-----+---------+------------+----------+----------+
| 1 | 1 | 12345 | 3 | 100.75 |
| 2 | 1 | 12346 | 3 | 100.75 |
| 3 | 2 | 12347 | 3 | 100.75 |
| 4 | 2 | 12345 | 3 | 100.75 |
| 5 | 3 | 12347 | 3 | 100.75 |
| 6 | 3 | 12345 | 3 | 100.75 |
+-----+---------+------------+----------+----------+
And to make the the queries easier for me I made a view "allOrdersSummary" like
-- allOrdersSummary
SELECT
orders.*, SUM(orderDetail.quantity * orderDetail.unitPrice) totalAmount
FROM orders INNER JOIN orderDetail ON orders.ID = orderDetail.orderID
GROUP BY orders.ID;
and I used this view later for my queries, but now I started to get the MAX_JOIN_SIZE error.
So I thought of saving the calculated total order amount along with the orders table ID, clintID, date, totalAmount
and whenever I change something in the orderDeatils
table I update the calculated totalAmount
column in the orders table, I don't know if this is good or bad!
This problem -I don't know if this is considered a problem or not- is encountered many times, for example to know the unread messages of the client making the request I have to do sum(messages) unread from messages where to = ? and isRead = 0
A) should I make another column for calculated totalAmount
in the orders table or it is a normal thing in databases to calculate the totalAmount
from the orderDetails
table every time I need it ?
B) If you recommend making another column in the orders table, what is the best way to update it every time a change happens in the orderDetails
table ? should I update it at the PHP layer whenever I update the orderDetails
table, or this is something that needs a stored procedure ?
Upvotes: 2
Views: 943
Reputation: 142306
You have an "inflate-deflate" problem.
JOIN
the two tables to make a much larger temporary table.GROUP BY
to shrink back to one row per row of the original (orders
) table.This avoids the problem:
SELECT *,
( SELECT SUM(quantity * unitPrice
FROM orderDetail WHERE orderID = orders.ID
) AS totalAmount
FROM orders;
Please let me know how your experience is with this one. It is one of the simplest examples of the inflate-deflate problem.
Upvotes: 1
Reputation: 16686
Yes, it is normal to store pre-calculated values, based on other data in the database, in a database. But not necessarily for the reason you mention. I never had a problem with MAX_JOIN_SIZE
.
The main, and probably only, reason for storing calculated values is speed. So you do it for values that don't change that often and that may be used in queries that use a lot of data and may therefore be too slow if you didn't use them.
For instance: If you want to know the average value of all the orders in your database the query would be a lot faster if you already have the order totals.
Why, and how, you update the values is completely up to you. However you have got to be consistent about it. If you use the MVC pattern it would make sense to integrate it in the controller. Or in simple terms: Whenever a form is submitted that could change one of the values, out of which the pre-calculated value is computed, you need to recompute it.
This is a clear demonstration where 'normalization' is not entirely maintained. It's not really pretty, but sometimes worth it. You could, of course, argue, that the calculated value represents 'new' information, and therefore does not offend against 'normalization'.
Upvotes: 1