Reputation: 668
I am using the following MySQL query in a PHP script on a database that contains over 370,000,000 (yes, three hundred and seventy million) rows. I know that it is extremely resource intensive and it takes ages to run this one query. Does anyone know how I can either optimise the query or get the information in another way that's quicker?
Table Info:
games | longint, unsigned, Primary Key
win   | bit(1)
loss  | bit(1)
Query:
SELECT MID(game, {$len}, 1) AS move,
       COUNT(*) AS games,
       SUM(win) AS wins,
       SUM(loss) AS losses
FROM games
WHERE game > {$something} AND game < {$something_else}
GROUP BY move
Thanks in advance for your help!
Upvotes: 2
Views: 353
Reputation: 5143
Straight off the bat, I would stop using the MID() call in both the SELECT expression and the GROUP BY. Depending on your query conditions, MySQL will not necessarily cache that as a single expression while parsing, so at least try this:
SELECT MID(game,{$len},1) AS move,
COUNT(*) AS games,
SUM(win) AS wins,
SUM(loss) AS losses
FROM games WHERE game LIKE '{$game}%' GROUP BY move;
Not the biggest change in the world, but it should make a small difference. Other than that though, I second that the only real way to optimise this short of changing the way you store the data is to precalculate these values and increment them when a game finishes.
Upvotes: 1
Reputation: 6878
The MID() function is what is killing this query. MySQL has to create a temporary table in memory to handle the MID() call and then filesort that table because of the GROUP BY.
I'm assuming that $game is the type of game (checkers, chess, tic-tac-toe).
I would hang another table off it for the type of game. This lets your GROUP BY make use of an index, which would be much faster.
I suggest something like:
[game]
game bigint unsigned
win bit
loss bit
game_type_id bigint unsigned
[game_type]
game_type_id bigint unsigned
game_type_desc varchar(13)
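As a rough sketch of how the query could then lean on that structure (the index name is an assumption, and this presumes you want the stats per game type rather than per move extracted from the game number):
ALTER TABLE game ADD INDEX idx_game_type (game_type_id);

SELECT gt.game_type_desc,
       COUNT(*)    AS games,
       SUM(g.win)  AS wins,
       SUM(g.loss) AS losses
FROM game g
JOIN game_type gt ON gt.game_type_id = g.game_type_id
GROUP BY g.game_type_id, gt.game_type_desc;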
Be careful with ALTER statements on a table this large. Always make a backup before you issue an ALTER.
Upvotes: 0
Reputation: 694
You can "buy speed" by sacrificing storage space or onserve storage space but got worse performance. As your problem is speed, you'll need some precalculations. And yes, some profiling of the query.
BTW, the "big ones" used to have different configs (different hardware and settings) for OLTP (serving the actual transactions in real time) and DW (analysing large amounts of data).
Upvotes: 0
Reputation: 946
Sounds to me like you could denormalize this and create a "moves" table, recording stats per "move", not only per "game".
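A minimal sketch of that idea (all table, column, and index names here are assumptions):
-- One row per (game, position), so the move is stored directly instead of
-- being extracted from the game number with MID() at query time.
CREATE TABLE moves (
    game     BIGINT UNSIGNED   NOT NULL,
    position SMALLINT UNSIGNED NOT NULL,
    move     CHAR(1)           NOT NULL,
    PRIMARY KEY (game, position),
    KEY idx_position_move (position, move)
);

SELECT m.move,
       COUNT(*)    AS games,
       SUM(g.win)  AS wins,
       SUM(g.loss) AS losses
FROM moves m
JOIN games g ON g.game = m.game
WHERE m.position = {$len}
  AND m.game > {$something} AND m.game < {$something_else}
GROUP BY m.move;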
Upvotes: 0
Reputation: 229344
If your workload is read heavy, consider keeping and maintaining an aggregated table over the data you commonly query.
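A minimal sketch of such a table (names and key are assumptions; the key should match whatever the hot query filters and groups on, e.g. the opening prefix of the game and the candidate next move):
CREATE TABLE move_stats (
    prefix VARCHAR(64)     NOT NULL,  -- moves played so far (assumption)
    move   CHAR(1)         NOT NULL,  -- candidate next move
    games  BIGINT UNSIGNED NOT NULL DEFAULT 0,
    wins   BIGINT UNSIGNED NOT NULL DEFAULT 0,
    losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (prefix, move)
);

-- Reads become an index lookup instead of a scan over 370M rows:
SELECT move, games, wins, losses FROM move_stats WHERE prefix = '{$game}';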
Upvotes: 0
Reputation: 1497
The only suggestion I can make is to use a table that precalculates all the counts and sums for each game, and to update it with a trigger whenever the games table changes.
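A rough sketch of such a trigger, assuming a hypothetical summary table move_stats(prefix, move, games, wins, losses) keyed on (prefix, move), and assuming one character per move as in the MID() call in the question:
DELIMITER //
CREATE TRIGGER games_after_insert
AFTER INSERT ON games
FOR EACH ROW
BEGIN
    DECLARE pos INT DEFAULT 1;
    -- Walk the digits of the game number, crediting every prefix/move pair.
    WHILE pos <= CHAR_LENGTH(NEW.game) DO
        INSERT INTO move_stats (prefix, move, games, wins, losses)
        VALUES (LEFT(NEW.game, pos - 1), MID(NEW.game, pos, 1), 1, NEW.win, NEW.loss)
        ON DUPLICATE KEY UPDATE
            games  = games  + 1,
            wins   = wins   + NEW.win,
            losses = losses + NEW.loss;
        SET pos = pos + 1;
    END WHILE;
END//
DELIMITER ;
With matching DELETE and UPDATE triggers (or periodic rebuilds) to keep the counts honest, the expensive reads against the big table go away entirely.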
Upvotes: 5