blogsdon
blogsdon

Reputation: 125

SQL query to return top N rows per ID across a range of IDs

Lets say I have a table containing several hundred million rows that looks something like this:

memID | foo  | bar  | foobar
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah
.
.
.
10001 | blah | blah | blah
10001 | blah | blah | blah

I need a query that will return the top N rows for each memID in a range of member IDs. For example, if N = 3 and the range is 0-2 it should return

memID | foo  | bar  | foobar
1     | blah | blah | blah
1     | blah | blah | blah
1     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah
2     | blah | blah | blah  

I've considered a couple approaches, first creating a massive

SELECT TOP (3) *
FROM table
WHERE memID = 0
UNION ALL
SELECT TOP (3) *
FROM table
WHERE memID = 1
.
.
.

query in code. This isn't really a realistic option for probably obvious reasons.

The second approach is to create a temporary table and loop over the range of memIDs inserting the TOP 3 for each memID into that temporary table.

WHILE @MemID < 10000 AND @MemID > 0
  BEGIN
    INSERT INTO tmp_Table
    SELECT TOP (3) *
     FROM table
     WHERE memID = @MemID

    SET @MemID = @MemID + 1
    END

This works, but I'm wondering if there is a more elegant single query solution that I'm missing.

Cadaeic gave me an answer that works without tinkering, but thank you to everyone that suggested analytics, it looks like I have some serious reading to do.

Upvotes: 2

Views: 6674

Answers (5)

Nicholas Mancuso
Nicholas Mancuso

Reputation: 11877

SELECT * FROM Member m
Join ( Select TOP(3) * From Table Order By Table.Id) as t 
    On t.MemberId = m.MemberId
Where m BETWEEN 0 and 10000

should do the trick

Upvotes: 0

Taylor Gerring
Taylor Gerring

Reputation: 1835

declare @startID int, @endID int, @rowsEach int
select @startID = 0, @endID = 2, @rowsEach = 3


select *
from
(
    select memID, foo, bar, foobar, row_number() over (partition by dense_rank order by dense_rank) [rank_row]
    from
    (
        select memID, foo, bar, foobar, dense_rank() over (order by memID) [dense_rank]
        from #test
        where memID between @startID and @endID
    ) a
) b
where rank_row <= @rowsEach

The result:

memID       foo  bar  foobar rank_row
----------- ---- ---- ------ --------------------
1           blah blah blah   1
1           blah blah blah   2
1           blah blah blah   3
2           blah blah blah   1
2           blah blah blah   2
2           blah blah blah   3

And here's the set-up code if you'd like to test locally:

create table #test
(
      memID     int not null
    , foo       char(4) not null
    , bar       char(4) not null
    , foobar    char(4) not null
)

insert into #test (memID, foo, bar, foobar)
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 1, 'blah', 'blah', 'blah'
union all
select 2, 'blah', 'blah', 'blah'
union all
select 2, 'blah', 'blah', 'blah'
union all
select 2, 'blah', 'blah', 'blah'
union all
select 2, 'blah', 'blah', 'blah'
union all
select 10001, 'blah', 'blah', 'blah'
union all
select 10001, 'blah', 'blah', 'blah'
union all
select 10001, 'blah', 'blah', 'blah'

Upvotes: 6

Brian
Brian

Reputation: 13571

SQL> select ename,sal,
  2   row_number()
  3     over (order by sal desc)rn,
  4   rank()
  5     over (order by sal desc)rnk,
  6   dense_rank()
  7     over (order by sal desc)drnk
  8   from emp
  9  order by sal desc
 10  /

ENAME    SAL   RN   RNK   DRNK
-----   ----   --   ---   ----
 KING   5000    1     1      1
 FORD   3000    2     2      2
SCOTT   3000    3     2      2
JONES   2975    4     4      3
BLAKE   2850    5     5      4
CLARK   2450    6     6      5

Upvotes: 2

Barry
Barry

Reputation: 492

Use analytics. I didn't test this but should be close:

SELECT memID, foo, bar, foobar 
FROM  (
       SELECT memID, foo, bar, foobar, 
              RANK() OVER (PARTITION BY memID ORDER BY memID) AS 'nRank'
       FROM   table
       WHERE  memID BETWEEN 0 AND 2)
WHERE  nRank <= 3

Upvotes: 1

cmsjr
cmsjr

Reputation: 59175

If you are using SQL Server 2005 or 2008 you might want to investigate the Ranking Functions

Upvotes: 1

Related Questions