John Doe
John Doe

Reputation: 23

Select top 2 scorers from each combination of 3 columns in MySQL

I have following tables and data:

player_scores
+----+-----------+---------------------+-------+
| id | player_id |     created_at      | score |
+----+-----------+---------------------+-------+
|  1 |         1 | 2020-01-01 01:00:00 |    20 |
|  2 |         1 | 2020-01-02 01:00:00 |    30 |
|  3 |         2 | 2020-01-01 01:00:00 |    20 |
|  4 |         3 | 2020-01-01 01:00:00 |    20 |
|  5 |         4 | 2020-05-01 01:00:00 |    40 |
|  6 |         5 | 2020-01-02 01:00:00 |    20 |
|  7 |         6 | 2020-01-01 01:00:00 |    20 |
|  8 |         7 | 2020-01-03 01:00:00 |    20 |
|  9 |         1 | 2020-03-01 01:00:00 |    20 |
+----+-----------+---------------------+-------+

players
+----+---------+-------------+----------+---------------------+---------+
| id | city_id | category_id | group_id |     created_at      |  name   |
+----+---------+-------------+----------+---------------------+---------+
|  1 |       1 |           1 |        1 | 2020-01-01 01:00:00 | Player1 |
|  2 |       1 |           2 |        1 | 2020-01-02 01:00:00 | Player2 |
|  3 |       2 |           2 |        1 | 2020-01-01 01:00:00 | Player3 |
|  4 |       2 |           1 |        1 | 2020-05-01 01:00:00 | Player4 |
|  5 |       3 |           1 |        1 | 2020-01-02 01:00:00 | Player5 |
|  6 |       4 |           2 |        1 | 2020-01-01 01:00:00 | Player6 |
|  7 |       3 |           1 |        1 | 2020-01-01 01:00:00 | Player7 |
|  8 |       4 |           2 |        1 | 2020-01-01 01:00:00 | Player8 |
+----+---------+-------------+----------+---------------------+---------+

cities
+----+------------+------------+
| id | country_id |    name    |
+----+------------+------------+
|  1 |          1 | London     |
|  2 |          2 | Sydney     |
|  3 |          2 | Melbourne  |
|  4 |          3 | Toronto    |
+----+------------+------------+

countries
+----+-----------+
| id |   name    |
+----+-----------+
|  1 | England   |
|  2 | Australia |
|  3 | Canada    |
+----+-----------+

categories
+----+------------+
| id |    name    |
+----+------------+
|  1 | Category 1 |
|  2 | Category 2 |
+----+------------+

groups
+----+---------+
| id |  name   |
+----+---------+
|  1 | Group 1 |
|  2 | Group 2 |
+----+---------+

SQL code to create tables and data:

CREATE TABLE players
(
    id          INT UNSIGNED auto_increment PRIMARY KEY,
    city_id     INT UNSIGNED NOT NULL,
    category_id INT UNSIGNED NOT NULL,
    group_id    INT UNSIGNED NOT NULL,
    created_at  DATETIME NOT NULL,
    name        VARCHAR(255) NOT NULL
);

CREATE TABLE player_scores
(
    id         INT UNSIGNED auto_increment PRIMARY KEY,
    player_id  INT UNSIGNED NOT NULL,
    created_at DATETIME NOT NULL,
    score      INT(10) NOT NULL
);

CREATE TABLE cities
(
    id         INT UNSIGNED auto_increment PRIMARY KEY,
    country_id INT UNSIGNED NOT NULL,
    name       VARCHAR(255) NOT NULL
);

CREATE TABLE countries
(
    id   INT UNSIGNED auto_increment PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE categories
(
    id   INT UNSIGNED auto_increment PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE `groups`
(
    id   INT UNSIGNED auto_increment PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

INSERT INTO players (id, city_id, category_id, group_id, created_at, name) VALUES (1, 1, 1, 1, '2020-01-01 01:00:00', 'Player1'),(2, 1, 2, 1, '2020-01-02 01:00:00', 'Player2'),(3, 2, 2, 1, '2020-01-01 01:00:00', 'Player3'),(4, 2, 1, 1, '2020-05-01 01:00:00', 'Player4'),(5, 3, 1, 1, '2020-01-02 01:00:00', 'Player5'),(6, 4, 2, 1, '2020-01-01 01:00:00', 'Player6'),(7, 3, 1, 1, '2020-01-01 01:00:00', 'Player7'),(8, 4, 2, 1, '2020-01-01 01:00:00', 'Player8');
INSERT INTO player_scores (id, player_id, created_at, score) VALUES (1, 1, '2020-01-01 01:00:00', 20), (2, 1, '2020-01-02 01:00:00', 30),(3, 2, '2020-01-01 01:00:00', 20),(4, 3, '2020-01-01 01:00:00', 20),(5, 4, '2020-05-01 01:00:00', 40),(6, 5, '2020-01-02 01:00:00', 20),(7, 6, '2020-01-01 01:00:00', 20),(8, 7, '2020-01-03 01:00:00', 20),(9, 1, '2020-03-01 01:00:00', 20);
INSERT INTO cities (id, country_id, name) VALUES (1,1,'London'), (2,2,'Sydney'), (3,2,'Melbourne'), (4,3,'Toronto');
INSERT INTO countries (id, name) VALUES (1,'England'),(2,'Australia'),(3,'Canada');
INSERT INTO categories (id, name) VALUES (1,'Category 1'),(2,'Category 2');
INSERT INTO `groups` (id, name) VALUES (1,'Group 1'),(2,'Group 2');

Relationship between 'players' and 'player_scores' is one-to-many. Also, for some players there might be no scores at all.

I have to return a one list of top 2 scorers from each combination of country, category and group. If there are no scores at all for a combination then no scorers are selected for that combination. If there is only one scorer within a combination then only one scorer is selected. If a player does not have any scores yet then it will not be selected.

If 2 or more players have the same scores within the combination, the earliest created player (created_at field within 'players' table) should be selected.

I use MySQL 5.7, therefore I cannot use window functions !

So, the result from the testing data above should be:

+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+
| player.id | country.name | category.name | group.name |  player.created_at  | player_scores.score | player_scores.created_at |
+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+
|         1 | England      | Category 1    | Group 1    | 2020-01-01 01:00:00 |                  20 | 2020-03-01 01:00:00      |

|         2 | England      | Category 2    | Group 1    | 2020-01-02 01:00:00 |                  20 | 2020-01-01 01:00:00      |

|         3 | Australia    | Category 2    | Group 1    | 2020-01-01 01:00:00 |                  20 | 2020-01-01 01:00:00      |

|         4 | Australia    | Category 1    | Group 1    | 2020-05-01 01:00:00 |                  40 | 2020-05-01 01:00:00      |
|         7 | Australia    | Category 1    | Group 1    | 2020-01-01 01:00:00 |                  20 | 2020-01-03 01:00:00      |

|         6 | Canada       | Category 2    | Group 1    | 2020-01-01 01:00:00 |                  20 | 2020-01-01 01:00:00      |
+-----------+--------------+---------------+------------+---------------------+---------------------+--------------------------+

So far, I have this query, but obviously it is far away from solution. I tried and searched for some hints, but could not find any so far:

SELECT players.*, player_scores.*, cities.*, countries.*, categories.*, groups.*
FROM players
LEFT JOIN cities
    ON players.city_id = cities.id
LEFT JOIN countries
    ON cities.country_id = country.id
LEFT JOIN categories
    ON players.category_id = categories.id
LEFT JOIN groups
    ON players.group_id = groups.id
LEFT JOIN player_scores
    ON player_scores.player_id = players.id
    AND player_scores.id IN (
        SELECT MAX(ps.id)
        FROM player_scores AS ps
        JOIN players AS p
            ON p.id = ps.player_id
        GROUP BY p.id
    )
INNER JOIN (
    SELECT DISTINCT countries.id, groups.id, categories.id
    FROM players
    LEFT JOIN cities
        ON players.city_id = cities.id
    LEFT JOIN countries
        ON cities.country_id = country.id
    LEFT JOIN groups
        ON players.group_id = groups.id
    LEFT JOIN categories
        ON players.category_id = categories.id
    INNER JOIN player_scores
        ON player_scores.player_id = players.id
    WHERE player_scores.id IN (
        SELECT MAX(ps.id)
        FROM player_scores AS ps
        JOIN players AS p
            ON p.id = ps.player_id
        GROUP BY p.id
    )
    GROUP BY countries.id, categories.id, groups.id
    HAVING MAX(player_scores.score) > 0
) players2 
    ON countries.id = players2.country_id 
    AND categories.id = players2.category_id 
    AND groups.id = players2.group_id;

Any help will be highly appreciated.

UPDATE: Provided testing data and result table.

Upvotes: 2

Views: 170

Answers (1)

Strawberry
Strawberry

Reputation: 33935

To recap, am I right in thinking that this is the intermediate result, from which we have to select a subset of results based upon the stated criteria?

SELECT p.name 
     , s.score
     , c.name city
     , x.name country
     , y.name category
     , g.name player_group
     , p.created_at
  FROM players p
  JOIN player_scores s
    ON s.player_id = p.id
  JOIN cities c
    ON c.id = p.city_id
  JOIN countries x
    ON x.id = c.country_id
  JOIN categories y
    ON y.id = p.category_id
  JOIN groups g
    ON g.id = p.group_id;
+---------+-------+-----------+-----------+------------+--------------+---------------------+
| name    | score | city      | country   | category   | player_group | created_at          |
+---------+-------+-----------+-----------+------------+--------------+---------------------+
| Player1 |    20 | London    | England   | Category 1 | Group 1      | 2020-01-01 01:00:00 |
| Player1 |    30 | London    | England   | Category 1 | Group 1      | 2020-01-01 01:00:00 |
| Player4 |    40 | Sydney    | Australia | Category 1 | Group 1      | 2020-05-01 01:00:00 |
| Player5 |    20 | Melbourne | Australia | Category 1 | Group 1      | 2020-01-02 01:00:00 |
| Player7 |    20 | Melbourne | Australia | Category 1 | Group 1      | 2020-01-01 01:00:00 |
| Player1 |    20 | London    | England   | Category 1 | Group 1      | 2020-01-01 01:00:00 |
| Player2 |    20 | London    | England   | Category 2 | Group 1      | 2020-01-02 01:00:00 |
| Player3 |    20 | Sydney    | Australia | Category 2 | Group 1      | 2020-01-01 01:00:00 |
| Player6 |    20 | Toronto   | Canada    | Category 2 | Group 1      | 2020-01-01 01:00:00 |
+---------+-------+-----------+-----------+------------+--------------+---------------------+

Upvotes: 1

Related Questions