diplosaurus
diplosaurus

Reputation: 2588

MySQL COUNT occurrences from several columns

I have the following table of matches of a 4 player game called games.

+---------+---------+---------+---------+---------+
| game_id | player1 | player2 | player3 | player4 |
+---------+---------+---------+---------+---------+
|    1001 | john    | dave    | NULL    | NULL    |
|    1002 | dave    | john    | mike    | tim     |
|    1003 | mike    | john    | dave    | NULL    |
|    1004 | tim     | dave    | NULL    | NULL    |
+---------+---------+---------+---------+---------+

There are two questions I want to be able to answer:

  1. Who played in the most games? (Dave)
  2. What pair of players played the most games together? (John & Dave)

For #1 I tried to adapt the answer I found here: mySQL query to find the most repeated value but it only seems to be able to answer the question for a single column. Meaning I could learn who was player1 the most, but not who played in the most games as any player:

SELECT player1 p1, COUNT(*) p1 FROM games
GROUP BY p1    
ORDER BY p1 DESC;

Is there a way to join these columns together or would I have to handle this in application code?

Not sure where to start for #2. I'm wondering if my table structure should instead consolidate players to a single column:

+----+---------+--------+
| id | game_id | player |
+----+---------+--------+
|  1 |    1001 | john   |
|  2 |    1001 | dave   |
|  3 |    1002 | john   |
|  4 |    1002 | dave   |
|  5 |    1002 | mike   |
|  6 |    1002 | tim    |
+----+---------+--------+

Upvotes: 4

Views: 556

Answers (2)

revo
revo

Reputation: 48711

Your best bet is normalizing database. This is a many-to-many relationship and needs a linked table to connect a game to its corresponding players. Then computations would be much more easier. Nevertheless, you could use a derived table for question one that unites all columns into one:

SELECT `player`,
       COUNT(*) as `count`
FROM
    (
        SELECT `player1` `player`
        FROM `games`
        UNION ALL
        SELECT `player2` `player`
        FROM `games`
        UNION ALL
        SELECT `player3` `player`
        FROM `games`
        UNION ALL
        SELECT `player4` `player`
        FROM `games`
    ) p
GROUP BY `player` HAVING `player` IS NOT NULL
ORDER BY `count` DESC

See live demo here

For the second question you have to have an inner join on derived table:

SELECT `p`.`player`,
       `p2`.`player`,
       count(*) AS count
FROM
    (
        SELECT `game_id`, `player1` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player2` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player3` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player4` `player`
        FROM `games`
    ) p
INNER JOIN
    (
        SELECT `game_id`, `player1` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player2` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player3` `player`
        FROM `games`
        UNION ALL
        SELECT `game_id`, `player4` `player`
        FROM `games`
    ) p2
ON `p`.`game_id` = `p2`.`game_id` AND `p`.`player` < `p2`.`player`
WHERE `p`.`player` IS NOT NULL AND `p2`.`player` IS NOT NULL
GROUP BY `p`.`player`, `p2`.`player`
ORDER BY `count` DESC

See live demo here

Upvotes: 4

M Khalid Junaid
M Khalid Junaid

Reputation: 64466

I would start with restructuring your design and introduce 3 tables

1) Player which will have player data and their unique ids

CREATE TABLE players
    (`id` int, `name` varchar(255))
;
INSERT INTO players
    (`id`,  `name`)
VALUES
    (1, 'john'),
    (2, 'dave'),
    (3, 'mike'),
    (4, 'tim');

2) Games which will have game data and their unique ids

CREATE TABLE games
    (`id` int, `name` varchar(25))
;

INSERT INTO games
    (`id`, `name`)
VALUES
    (1001, 'G1'),
    (1002, 'G2'),
    (1003, 'G3'),
    (1004, 'G4');

3) player_games to relate these 2 entities as many to many relationship via junction table which will hold game id and player id like as per your sample data

CREATE TABLE player_games
    (`game_id` int, `player_id` int(11))
;

INSERT INTO player_games
    (`game_id`, `player_id`)
VALUES
    (1001, 1),
    (1001, 2),
    (1002, 1),
    (1002, 2),
    (1002, 3),
    (1002, 4),
    (1003, 3),
    (1003, 1),
    (1003, 2),
    (1004, 4),
    (1004, 2)
;

For Who played in the most games? Its dave not john as per your sample data set who played 4 games

select t.games_played,group_concat(t.name) players
from (
  select p.name,
  count(distinct pg.game_id) games_played
  from player_games pg
  join players p on p.id = pg.player_id
  group by p.name 
) t
group by games_played
order by games_played desc
limit 1

For above query there can be a chance that morethan one players have played most games like dave played 4 games and tim also played 4 games so both should be included

Demo

For What pair of players played the most games together? (John & Dave)

select t.games_played,group_concat(t.player_name) players
from (
  select group_concat(distinct pg.game_id),
  concat(least(p.name, p1.name), ' ', greatest(p.name, p1.name)) player_name,
  count(distinct pg.game_id) games_played
  from player_games pg
  join player_games pg1 on pg.game_id = pg1.game_id
                       and pg.player_id <> pg1.player_id
  join players p on p.id = pg.player_id
  join players p1 on p1.id = pg1.player_id
  group by player_name
) t
group by games_played
order by games_played desc
limit 1;

In above query i have self joined player_games table to get the combination of players against each game and then grouped data for each unique pair , Again followed same logic to handel that there can be a chance that morethan one pair of players have played most games

Demo

Upvotes: 2

Related Questions