Helena
Helena

Reputation: 633

Query to find the FIRST AND SECOND largest value from a group

i have a query like this:

SELECT
 DATEPART(year,some_date),
 DATEPART(month,some_date),
 MAX(some_value) max_value
FROM
 some_table
GROUP BY
    DATEPART(year,some_date),
    DATEPART(month,some_date)

This returns a table with: year, month, the largest value for the month.

I would like to modify the query so that i could obtain: year, month, the largest value for the month, the second largest value for the month in each row.

It seems to me that the well-known solutions like "TOP 2", "NOT IN TOP 1" or a subselect won't work here.

(To be really specific - i am using SQL Server 2008.)

Upvotes: 8

Views: 15174

Answers (6)

skajfes
skajfes

Reputation: 8255

It seems to me that the question calls for a query that would return best, and second best in the same row for each month and year, like so:

month, year, best, second best
...
...

and not two rows for the same month and year containing best and second best value.

This is the solution that I came up with, so if anyone has a simpler way of achieving this, I would like to know.

with ranks as (
    select 
        year(entrydate) as [year], 
        month(entrydate) as [month], 
        views, 
        rank() over (partition by year(entrydate), month(entrydate) order by views desc) as [rank]
    from product
)
select 
    t1.year, 
    t1.month, 
    t1.views as [best], 
    t2.views as [second best]
from ranks t1
    inner join ranks t2
        on t1.year = t2.year
        and t1.month = t2.month
        and t1.rank = 1
        and t2.rank = 2

EDIT: Just out of curiosity I did a bit more testing and ended up with a simpler variation on the Stephanie Page's answer that doesn't use an aditional subquery. And I changed the rank() function to row_number() as it doesn't work when two max values are the same.

with ranks as (
    select 
        year(entrydate) as [year], 
        month(entrydate) as [month], 
        views, 
        row_number() over (partition by year(entrydate), month(entrydate) order by views desc) as [rank]
    from product
)
select 
    t1.year, 
    t1.month, 
    max(case when t1.rank = 1 then t1.views else 0 end) as [best], 
    max(case when t1.rank = 2 then t1.views else 0 end) as [second best]
from 
    ranks t1
where
    t1.rank in (1,2)
group by
    t1.year, t1.month

Upvotes: 6

Stephanie Page
Stephanie Page

Reputation: 3893

to do this without joins ( I'll show the Oracle... you'll just use CASE instead of DECODES)

with ranks as (
        select 
            year(entrydate) as [year], 
            month(entrydate) as [month], 
            views, 
            rank() over (partition by year(entrydate), month(entrydate) order by views desc) as [rank]
        from product
    )
SELECT [year], [month], Max([best]), Max([second best])
 FROM
    ( select 
        t1.year, 
        t1.month, 
        Decode([rank],1,t1.views,0) as [best], 
        Decode([rank],2,t1.views,0)  as [second best]
    from ranks t1
    where t1.rank <= 2 ) x
GROUP BY [year], [month]

Upvotes: 2

marc_s
marc_s

Reputation: 754240

You can use a CTE with the ranking functions in SQL Server 2005 and up:

;WITH TopValues AS
(
   SELECT
     YEAR(some_date) AS 'Year',
     MONTH(some_date) AS 'Month',
     Some_Value,
     ROW_NUMBER() OVER(PARTITION BY YEAR(some_date),MONTH(some_date) 
                       ORDER BY Some_Value DESC) AS 'RowNumber'
  FROM
     dbo.some_table
)
SELECT
    Year, Month, Some_Value
FROM
    TopValues
WHERE
    RowNumber <= 2

This will "partition" (i.e. group) your data by month/year, order inside each group by Some_Value descending (largest first), and then you can select the first two of each group from that CTE.

RANK() works as well (I most often use ROW_NUMBER) - it produces slightly different results, though - really depends on what your needs are.

Upvotes: 1

Paul Sasik
Paul Sasik

Reputation: 81429

This is a bit old-school but TOP and a subquery will work if you use ORDER BY. Try this:

SELECT TOP 2
 DATEPART(year,some_date),
 DATEPART(month,some_date),
 (SELECT MAX(st1.some_value) FROM some_table AS st1 
  WHERE DATEPART(month,some_date) = DATEPART(month,st1.some_date)) AS max_value
FROM
 some_table
GROUP BY
    DATEPART(year,some_date),
    DATEPART(month,some_date)
ORDER BY DATEPART(month,some_date) DESC

That will give you the two rows with the "highest" month values and the added subselect should give you the max from each grouping.

Upvotes: 1

btx
btx

Reputation: 266

Hmmm it's kind of a rig, but you can do this with subqueries... instead of using that max I'd select the some_values which have the matching year & month, row_number()=1 / row_number() = 2 respectively and order by some_value DESC.

The inability to use OFFSET / LIMIT like you can in SQLite is one of my dislikes about SQL Server.

Upvotes: 0

Yves M.
Yves M.

Reputation: 3318

RANK() is maybe the thing you are looking for...

http://msdn.microsoft.com/en-us/library/ms176102.aspx

Upvotes: 3

Related Questions