Michal Stružský
Michal Stružský

Reputation: 61

Get top item for each year

I have a datatable with some records. Using mysql I am able to get a result grouped by a specific period (year) and users and ordered (in descending order) by number of species.

SELECT YEAR(entry_date) AS period, uid AS user, COUNT(DISTINCT pid) AS species
FROM records
WHERE YEAR(entry_date)<YEAR(CURDATE())
GROUP BY period, uid
ORDER by period, species DESC

Please see attached picture of the result. But what if I only want the get the TOP USER (and number of species) for EACH year (the red marked rows)? How can I achieve that?

I am able to handle this later in my php code but it would be nice to have this sortered out already in mysql query.

Thanks for your help!

Upvotes: 1

Views: 300

Answers (1)

GMB
GMB

Reputation: 222512

If you are running MySQL 8.0, you can use RANK() to rank records in years partitions by their count of species, and then filter on the top record per group:

SELECT *
FROM (
    SELECT 
        YEAR(entry_date) AS period, 
        uid AS user, 
        COUNT(DISTINCT pid) AS species,
        RANK() OVER(PARTITION BY YEAR(entry_date) ORDER BY COUNT(DISTINCT pid) DESC) rn
    FROM records
    WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
    GROUP BY period, uid
) t
WHERE rn = 1
ORDER by period

This preserves top ties, if any. Note that uses an index-friendly filter on the dates in the WHERE clause.

In earlier versions, an equivalent option is to filter with a HAVING clause and a correlated subquery:

SELECT 
    YEAR(entry_date) AS period, 
    uid AS user, 
    COUNT(DISTINCT pid) AS species
FROM records r
WHERE entry_date < DATE_FORMAT(CURRENT_DATE, '%Y-01-01')
GROUP BY period, uid
HAVING COUNT(DISTINCT pid) = (
    SELECT COUNT(DISTINCT r1.pid) species1
    FROM records r1
    WHERE YEAR(r1.entry_date) = period
    GROUP BY r1.uid
    ORDER BY species1 DESC
    LIMIT 1
)
ORDER by period

Upvotes: 1

Related Questions