Reputation: 20590
So let's say we have a table that contains several columns, and we index on TWO COLUMNS: Player 1 and Player 2:
GAME PLAYER 1 PLAYER 2 ...
==== ======== ========
1 John Dan
2 Carl William
3 Carl John
...
An alternate table structure indexing only on a SINGLE COLUMN: Player 1 (and with double the rows):
GAME PLAYER 1 PLAYER 2 ...
==== ======== ========
1 John Dan
1 Dan John
2 Carl William
2 William Carl
3 Carl John
3 John Carl
...
I want to query for all the games John played in. Which table structure/approach is better?
Things to consider:
Upvotes: 0
Views: 59
Reputation: 2880
In this simple case, with an index on both columns you are going to be essentially doubling the table size anyways with the Index.
If there are more than 2 player columns though, the second approach will have a number of rows equal to the number of games to the power of the number of player columns, while the index approach will just have the number of games multiplied by the number of player columns for index space.
Either way, to improve performance, it would be best to replace the names with IDs that reference the players' names in another table.
Also, it seems to me there might be a simpler approach.
GAME PLAYER
==== ========
1 John
1 Dan
2 Carl
2 William
3 Carl
3 John
No need to duplicate the data. Here you can still retrieve all the players that were in game 1, and now you can more easily retrieve the games that each player is in. Here, you only need one index. This could however possibly complicate some other queries.
Upvotes: 2
Reputation: 2761
It's better to normalize the table. Have a table for players with id and put the foreign key namely player_id on the game table. It scales much better specially if you set index on the player_ids . It will also make future changes to users much easier.
Upvotes: 2