Reputation: 5010
I struggling with a problem I have in TSQL, I need to get the top 10 results for each user from a table that might contain more than 10 results.
My natural (and procedurally minded) approach is "for each user in table T select the top 10 results ordered by date".
Each time I try to formulate the question in my mind in a set based approach, I keep running into the term "foreach".
Is it possible to do something like this:
SELECT *
FROM table AS t1
INNER JOIN (
SELECT TOP 10 *
FROM table AS t2
WHERE t2.id = t1.id
ORDER BY date DESC
)
Or even
SELECT ( SELECT TOP 10 *
FROM table AS t2
WHERE t2.id = t1.id
ORDER BY date )
FROM table AS t1
Or is there another solution to this using temp tables that I should think about?
EDIT:
Just to be perfectly clear - I need to the top 10 results for each user in the table, e.g. 10 * N where N = number of users.
EDIT:
In response to a suggestion made by RBarryYoung, I'm having an issue, which is best demonstrated with code:
CREATE TABLE #temp (id INT, date DATETIME)
INSERT INTO #temp (id, date) VALUES (1, GETDATE())
INSERT INTO #temp (id, date) VALUES (1, GETDATE())
SELECT *
FROM #temp AS t1
CROSS APPLY (
SELECT TOP 1 *
FROM #temp AS t2
WHERE t2.id = t1.id
ORDER BY t2.date DESC
) AS t2
DROP TABLE #temp
Running this, you can see that this doesn't limit the results to the TOP 1... Am I doing something wrong here?
EDIT:
It seems my last example provided a bit of confusion. Here is an example showing what I want to do:
CREATE TABLE #temp (id INT, date DATETIME)
INSERT INTO #temp (id, date) VALUES (1, GETDATE())
INSERT INTO #temp (id, date) VALUES (1, GETDATE())
INSERT INTO #temp (id, date) VALUES (1, GETDATE())
INSERT INTO #temp (id, date) VALUES (2, GETDATE())
SELECT *
FROM #temp AS t1
CROSS APPLY
(
SELECT TOP 2 *
FROM #temp AS t2
WHERE t2.id = t1.id
ORDER BY t2.date DESC
) AS t2
DROP TABLE #temp
This outputs:
1 2009-08-26 09:05:56.570 1 2009-08-26 09:05:56.583
1 2009-08-26 09:05:56.570 1 2009-08-26 09:05:56.583
1 2009-08-26 09:05:56.583 1 2009-08-26 09:05:56.583
1 2009-08-26 09:05:56.583 1 2009-08-26 09:05:56.583
1 2009-08-26 09:05:56.583 1 2009-08-26 09:05:56.583
1 2009-08-26 09:05:56.583 1 2009-08-26 09:05:56.583
2 2009-08-26 09:05:56.583 2 2009-08-26 09:05:56.583
If I use distinct:
SELECT DISTINCT t1.id
FROM #temp AS t1
CROSS APPLY
(
SELECT TOP 2 *
FROM #temp AS t2
WHERE t2.id = t1.id
ORDER BY t2.date DESC
) AS t2
I get
1
2
I need
1
1
2
Does anyone know if this is possible?
EDIT:
The following code will do this
WITH RowTable AS
(
SELECT
id, date, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS RowNum
FROM #temp
)
SELECT *
FROM RowTable
WHERE RowNum <= 2;
I posted in the comments, but there is no code formatting, so it doesn't look very nice.
Upvotes: 1
Views: 1752
Reputation: 56755
Yes, there are several differet good ways to do this in 2005 and 2008. The one most similar to what you are already trying is with CROSS APPLY:
SELECT T2.*
FROM (
SELECT DISTINCT ID FROM table
) AS t1
CROSS APPLY (
SELECT TOP 10 *
FROM table AS t2
WHERE t2.id = t1.id
ORDER BY date DESC
) AS t2
ORDER BY T2.id, date DESC
This then returns the ten most recent entries in [table] (or as many as exist, up to 10), for each distinct [id]. Asumming that [id] corresponds to a user, then this should be exactly what you are asking for.
(edit: slight changes because I did not take into account that T1 and T2 were the same tables and thus there will be multiple duplicate t1.IDs matching multiple duplicate T2.ids.)
Upvotes: 4
Reputation:
select userid, foo, row_number() over (partition by userid order by foo) as rownum from table where rownum <= 10
Upvotes: 2
Reputation: 562861
Here's a trick I use to do this "top-N-per-group" type of query:
SELECT t1.id
FROM table t1 LEFT OUTER JOIN table t2
ON (t1.user_id = t2.user_id AND (t1.date > t2.date
OR t1.date = t2.date AND t1.id > t2.id))
GROUP BY t1.id
HAVING COUNT(*) < 10
ORDER BY t1.user_id, COALESCE(COUNT(*), 0);
Upvotes: 0
Reputation: 3294
I believe this SO question will answer your question. It's not answering exactly the same question, but I think the solution will work for you too.
Upvotes: 0
Reputation: 17739
It is possible, however using nested queries will be slower.
The following will also find the results you are looking for:
SELECT TOP 10 *
FROM table as t1
INNER JOIN table as t2
ON t1.id = t2.id
ORDER BY date DESC
Upvotes: 0