Reputation: 2588
I have the following table of matches of a 4 player game called games
.
+---------+---------+---------+---------+---------+
| game_id | player1 | player2 | player3 | player4 |
+---------+---------+---------+---------+---------+
| 1001 | john | dave | NULL | NULL |
| 1002 | dave | john | mike | tim |
| 1003 | mike | john | dave | NULL |
| 1004 | tim | dave | NULL | NULL |
+---------+---------+---------+---------+---------+
There are two questions I want to be able to answer:
For #1 I tried to adapt the answer I found here: mySQL query to find the most repeated value but it only seems to be able to answer the question for a single column. Meaning I could learn who was player1
the most, but not who played in the most games as any player:
SELECT player1 p1, COUNT(*) p1 FROM games
GROUP BY p1
ORDER BY p1 DESC;
Is there a way to join these columns together or would I have to handle this in application code?
Not sure where to start for #2. I'm wondering if my table structure should instead consolidate players to a single column:
+----+---------+--------+
| id | game_id | player |
+----+---------+--------+
| 1 | 1001 | john |
| 2 | 1001 | dave |
| 3 | 1002 | john |
| 4 | 1002 | dave |
| 5 | 1002 | mike |
| 6 | 1002 | tim |
+----+---------+--------+
Upvotes: 4
Views: 556
Reputation: 48711
Your best bet is normalizing database. This is a many-to-many relationship and needs a linked table to connect a game to its corresponding players. Then computations would be much more easier. Nevertheless, you could use a derived table for question one that unites all columns into one:
SELECT `player`,
COUNT(*) as `count`
FROM
(
SELECT `player1` `player`
FROM `games`
UNION ALL
SELECT `player2` `player`
FROM `games`
UNION ALL
SELECT `player3` `player`
FROM `games`
UNION ALL
SELECT `player4` `player`
FROM `games`
) p
GROUP BY `player` HAVING `player` IS NOT NULL
ORDER BY `count` DESC
See live demo here
For the second question you have to have an inner join on derived table:
SELECT `p`.`player`,
`p2`.`player`,
count(*) AS count
FROM
(
SELECT `game_id`, `player1` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player2` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player3` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player4` `player`
FROM `games`
) p
INNER JOIN
(
SELECT `game_id`, `player1` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player2` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player3` `player`
FROM `games`
UNION ALL
SELECT `game_id`, `player4` `player`
FROM `games`
) p2
ON `p`.`game_id` = `p2`.`game_id` AND `p`.`player` < `p2`.`player`
WHERE `p`.`player` IS NOT NULL AND `p2`.`player` IS NOT NULL
GROUP BY `p`.`player`, `p2`.`player`
ORDER BY `count` DESC
See live demo here
Upvotes: 4
Reputation: 64466
I would start with restructuring your design and introduce 3 tables
1) Player which will have player data and their unique ids
CREATE TABLE players
(`id` int, `name` varchar(255))
;
INSERT INTO players
(`id`, `name`)
VALUES
(1, 'john'),
(2, 'dave'),
(3, 'mike'),
(4, 'tim');
2) Games which will have game data and their unique ids
CREATE TABLE games
(`id` int, `name` varchar(25))
;
INSERT INTO games
(`id`, `name`)
VALUES
(1001, 'G1'),
(1002, 'G2'),
(1003, 'G3'),
(1004, 'G4');
3) player_games to relate these 2 entities as many to many relationship via junction table which will hold game id and player id like as per your sample data
CREATE TABLE player_games
(`game_id` int, `player_id` int(11))
;
INSERT INTO player_games
(`game_id`, `player_id`)
VALUES
(1001, 1),
(1001, 2),
(1002, 1),
(1002, 2),
(1002, 3),
(1002, 4),
(1003, 3),
(1003, 1),
(1003, 2),
(1004, 4),
(1004, 2)
;
For Who played in the most games? Its dave not john as per your sample data set who played 4 games
select t.games_played,group_concat(t.name) players
from (
select p.name,
count(distinct pg.game_id) games_played
from player_games pg
join players p on p.id = pg.player_id
group by p.name
) t
group by games_played
order by games_played desc
limit 1
For above query there can be a chance that morethan one players have played most games like dave played 4 games and tim also played 4 games so both should be included
For What pair of players played the most games together? (John & Dave)
select t.games_played,group_concat(t.player_name) players
from (
select group_concat(distinct pg.game_id),
concat(least(p.name, p1.name), ' ', greatest(p.name, p1.name)) player_name,
count(distinct pg.game_id) games_played
from player_games pg
join player_games pg1 on pg.game_id = pg1.game_id
and pg.player_id <> pg1.player_id
join players p on p.id = pg.player_id
join players p1 on p1.id = pg1.player_id
group by player_name
) t
group by games_played
order by games_played desc
limit 1;
In above query i have self joined player_games table to get the combination of players against each game and then grouped data for each unique pair , Again followed same logic to handel that there can be a chance that morethan one pair of players have played most games
Upvotes: 2