Reputation: 884
I'm in the process of developing a large-scale application that will contain a few tables with large datasets (potentially 1M+ rows). The application will be a game in which multiple users complete tasks at the same time, so it will be very data-intensive.
In this application, data will be aggregated into per-user statistics. I have come up with two scenarios to achieve my desired effect of calculating all the statistics.
Scenario 1
Maintain a separate table for user statistics, meaning that as each move is processed, the relevant fields are incremented.
Table Statistics (Moves, Origins, Points)
$Moves++;
$Origins++;
$Points = $Points + $MovePoints; // $MovePoints: points earned by this move
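For illustration, here is a minimal sketch of how Scenario 1 might look when the detail row and the running totals are written in a single transaction. It uses Python's built-in `sqlite3` as a stand-in for MySQL/InnoDB, and the table names, column names, and the `record_move` helper are all hypothetical, not part of the actual application. It also assumes each move carries its own points and origins values.

```python
import sqlite3

# SQLite stands in for MySQL/InnoDB; every table and column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moves (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        points  INTEGER NOT NULL,
        origins INTEGER NOT NULL
    );
    CREATE TABLE statistics (
        user_id INTEGER PRIMARY KEY,
        moves   INTEGER NOT NULL DEFAULT 0,
        origins INTEGER NOT NULL DEFAULT 0,
        points  INTEGER NOT NULL DEFAULT 0
    );
""")


def record_move(conn, user_id, points, origins):
    """Insert the detail row and bump the running totals in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO moves (user_id, points, origins) VALUES (?, ?, ?)",
            (user_id, points, origins),
        )
        # Create the user's statistics row if it does not exist yet.
        conn.execute(
            "INSERT OR IGNORE INTO statistics (user_id) VALUES (?)",
            (user_id,),
        )
        # Increment in SQL rather than read-modify-write in application code.
        conn.execute(
            "UPDATE statistics "
            "SET moves = moves + 1, origins = origins + ?, points = points + ? "
            "WHERE user_id = ?",
            (origins, points, user_id),
        )


record_move(conn, 1, 10, 1)
record_move(conn, 1, 5, 2)
stats = conn.execute(
    "SELECT moves, origins, points FROM statistics WHERE user_id = 1"
).fetchone()
print(stats)  # (2, 3, 15)
```

In MySQL the "create or increment" step would more likely be a single `INSERT ... ON DUPLICATE KEY UPDATE` statement; the two-statement form above is only what SQLite makes convenient.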
Scenario 2
Count and sum the data fields as needed across all data.
Table Moves (Points, Origins)
SUM(Points)
SUM(Origins)
COUNT(*)
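A rough sketch of Scenario 2, again using Python's `sqlite3` in place of MySQL, with hypothetical table, column, and index names. The composite index on `(user_id, points, origins)` is an assumption: the idea is that a per-user aggregate could then be answered from the index alone, but whether that helps enough at 1M+ rows would need to be measured on the real data.

```python
import sqlite3

# SQLite stands in for MySQL/InnoDB; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moves (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        points  INTEGER NOT NULL,
        origins INTEGER NOT NULL
    );
    -- Assumed index so per-user aggregates avoid scanning the whole table.
    CREATE INDEX idx_moves_user ON moves (user_id, points, origins);
""")
conn.executemany(
    "INSERT INTO moves (user_id, points, origins) VALUES (?, ?, ?)",
    [(1, 10, 1), (1, 5, 2), (2, 7, 1)],
)


def user_totals(conn, user_id):
    """Compute (moves, origins, points) for one user on demand."""
    return conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(origins), 0), COALESCE(SUM(points), 0) "
        "FROM moves WHERE user_id = ?",
        (user_id,),
    ).fetchone()


totals = user_totals(conn, 1)
print(totals)  # (2, 3, 15)

# A user with no moves yields zeros, because SUM over no rows is NULL
# and COALESCE replaces that NULL with 0.
empty = user_totals(conn, 99)
print(empty)  # (0, 0, 0)
```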
My question is: which of these two scenarios would be more efficient for the database? I believe Scenario 2 could be more efficient because there will be far less data manipulation, but I'm unsure of the load that these queries may place on the DB.
I am using MySQL 5.5 InnoDB with a UTF8 Charset
Upvotes: 2
Views: 275
Reputation: 9548
The best route will depend on the frequency of reads vs. writes of points, origins and moves. Those frequencies, in turn, will depend on use cases, code style and the use (or lack) of caching.
It's difficult to provide a qualified opinion without more details, but consider that a dedicated table brings additional complications: extra writes for each operation, and the need to ensure that the tallies always match the underlying detail data. In light of that added complexity, storing each logical data element once rather than twice in a relational database is usually the best course of action.
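To make the "tallies must match the detail data" concern concrete, here is one way a periodic reconciliation check could be sketched. It uses Python's `sqlite3` rather than MySQL, and the schema and the `find_drift` helper are hypothetical. It only inspects users that already have a statistics row.

```python
import sqlite3

# SQLite stands in for MySQL/InnoDB; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moves (user_id INTEGER, points INTEGER, origins INTEGER);
    CREATE TABLE statistics (
        user_id INTEGER PRIMARY KEY,
        moves   INTEGER,
        origins INTEGER,
        points  INTEGER
    );
    INSERT INTO moves VALUES (1, 10, 1), (1, 5, 2), (2, 7, 1);
    -- User 1's tallies are correct; user 2's points update was "missed".
    INSERT INTO statistics VALUES (1, 2, 3, 15), (2, 1, 1, 0);
""")


def find_drift(conn):
    """Return user_ids whose stored tallies disagree with the detail rows."""
    rows = conn.execute("""
        SELECT s.user_id
        FROM statistics AS s
        LEFT JOIN (
            SELECT user_id,
                   COUNT(*)     AS moves,
                   SUM(origins) AS origins,
                   SUM(points)  AS points
            FROM moves
            GROUP BY user_id
        ) AS m ON m.user_id = s.user_id
        WHERE s.moves   <> COALESCE(m.moves, 0)
           OR s.origins <> COALESCE(m.origins, 0)
           OR s.points  <> COALESCE(m.points, 0)
        ORDER BY s.user_id
    """)
    return [row[0] for row in rows]


drifted = find_drift(conn)
print(drifted)  # [2]
```

Needing a check like this at all is part of the extra cost of the dedicated-table approach.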
If you're worried about performance and scalability, you might want to consider a non-relational approach using a database platform such as MongoDB or DynamoDB.
Upvotes: 1