Reputation: 19515
I want to return top 10 records from each section in one query. Can anyone help with how to do it? Section is one of the columns in the table.
Database is SQL Server 2005. I want to return the top 10 by date entered. Sections are business, local, and feature. For one particular date I want only the top (10) business rows (most recent entry), the top (10) local rows, and the top (10) features.
Upvotes: 295
Views: 698089
Reputation: 3262
This is a slight refinement (syntax only, swapping out a CASE statement for a GREATEST function) to Vadim Loboda's answer above.
DECLARE @N INT = 3; --the top "N" items to retrieve
WITH items(dt, category) AS
(
SELECT '2000-01-01', 'group1'
UNION SELECT '2000-01-02', 'group1'
UNION SELECT '2000-01-03', 'group1'
UNION SELECT '2000-01-04', 'group1'
UNION SELECT '2000-01-05', 'group1'
UNION SELECT '1999-01-01', 'group2'
UNION SELECT '1999-01-02', 'group2'
UNION SELECT '1999-01-03', 'group2'
UNION SELECT '1999-01-04', 'group2'
UNION SELECT '2000-01-01', 'group3'
)
SELECT TOP 1 WITH TIES
category,
dt
FROM items
ORDER BY GREATEST(0, ROW_NUMBER() OVER (PARTITION BY category ORDER BY dt ASC) - @N)
ROW_NUMBER is applied to each (partitioned by whatever we are grouping by), then subtract N from it so that it acquires a negative value for the first items, then the GREATEST function will instead return zero. All the rows with zero will then "tie" and TOP WITH TIES will give you just those rows.
Upvotes: 0
Reputation: 9986
Note: I know the OP has only 3 groups but this is a known general problem for many developers and there's no really good solution in SQL. So let me show you another way.
You can write it as one query. That is, formally one query, but it contans either subqueries or a self-JOIN which makes it in fact multiple operations under the hood. So you might as well just select each group individually.
If you want a performant solution, you need to work a bit more. Let's say you have 100 employees, you have 26 buildings from A to Z, people move around them (enter/leave), and you need the last 5 events for every building.
EVENT_ID EVENT_TIME EMPOYEE_ID EVENT_CODE BUILDING
-------------------------------------------------------------------------
883691 2023-03-29 11:00:00 92 enter A
883690 2023-03-29 11:00:21 78 enter C
883689 2023-03-29 11:00:25 58 enter A
883688 2023-03-29 11:02:10 22 leave H
883687 2023-03-29 11:31:42 73 leave P
...
...
You want to avoid 26 queries.
Here's what you can do:
For theorists, this is an anti-pattern. But the first query will have almost the same performance as one of the single-building ones, and probably brings you most of what you need; a few buildings will be missing, depending on how the employees normally move. Then, you may need 5 more queries for those buildings, and some merging on the application level.
In short: get a result that's almost complete, then make it complete.
So if you need performance, this is one way. If you need clarity of the business logic, well, choose any other answer. This one is scary. But when you go for speed, you often need scary techniques.
Upvotes: 1
Reputation: 3896
SELECT r.*
FROM
(
SELECT
r.*,
ROW_NUMBER() OVER(PARTITION BY r.[SectionID]
ORDER BY r.[DateEntered] DESC) rn
FROM [Records] r
) r
WHERE r.rn <= 10
ORDER BY r.[DateEntered] DESC
Upvotes: 48
Reputation: 3111
If we use SQL Server >= 2005, then we can solve the task with one select only:
declare @t table (
Id int ,
Section int,
Moment date
);
insert into @t values
( 1 , 1 , '2014-01-01'),
( 2 , 1 , '2014-01-02'),
( 3 , 1 , '2014-01-03'),
( 4 , 1 , '2014-01-04'),
( 5 , 1 , '2014-01-05'),
( 6 , 2 , '2014-02-06'),
( 7 , 2 , '2014-02-07'),
( 8 , 2 , '2014-02-08'),
( 9 , 2 , '2014-02-09'),
( 10 , 2 , '2014-02-10'),
( 11 , 3 , '2014-03-11'),
( 12 , 3 , '2014-03-12'),
( 13 , 3 , '2014-03-13'),
( 14 , 3 , '2014-03-14'),
( 15 , 3 , '2014-03-15');
-- TWO earliest records in each Section
select top 1 with ties
Id, Section, Moment
from
@t
order by
case
when row_number() over(partition by Section order by Moment) <= 2
then 0
else 1
end;
-- THREE earliest records in each Section
select top 1 with ties
Id, Section, Moment
from
@t
order by
case
when row_number() over(partition by Section order by Moment) <= 3
then 0
else 1
end;
-- three LATEST records in each Section
select top 1 with ties
Id, Section, Moment
from
@t
order by
case
when row_number() over(partition by Section order by Moment desc) <= 3
then 0
else 1
end;
Upvotes: 19
Reputation: 221370
While the question was about SQL Server 2005, most people have moved on and if they do find this question, what could be the preferred answer in other situations is one using CROSS APPLY
as illustrated in this blog post.
SELECT *
FROM t
CROSS APPLY (
SELECT TOP 10 u.*
FROM u
WHERE u.t_id = t.t_id
ORDER BY u.something DESC
) u
This query involves 2 tables. The OP's query only involves 1 table, in case of which a window function based solution might be more efficient.
Upvotes: 5
Reputation: 127
Tried the following and it worked with ties too.
SELECT rs.Field1,rs.Field2
FROM (
SELECT Field1,Field2, ROW_NUMBER()
OVER (Partition BY Section
ORDER BY RankCriteria DESC ) AS Rank
FROM table
) rs WHERE Rank <= 10
Upvotes: 9
Reputation: 343
You can try this approach. This query returns 10 most populated cities for each country.
SELECT city, country, population
FROM
(SELECT city, country, population,
@country_rank := IF(@current_country = country, @country_rank + 1, 1) AS country_rank,
@current_country := country
FROM cities
ORDER BY country, population DESC
) ranked
WHERE country_rank <= 10;
Upvotes: 1
Reputation: 142252
If you are using SQL 2005 you can do something like this...
SELECT rs.Field1,rs.Field2
FROM (
SELECT Field1,Field2, Rank()
over (Partition BY Section
ORDER BY RankCriteria DESC ) AS Rank
FROM table
) rs WHERE Rank <= 10
If your RankCriteria has ties then you may return more than 10 rows and Matt's solution may be better for you.
Upvotes: 278
Reputation: 61
Q) Finding TOP X records from each group(Oracle)
SQL> select * from emp e
2 where e.empno in (select d.empno from emp d
3 where d.deptno=e.deptno and rownum<3)
4 order by deptno
5 ;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7839 KING PRESIDENT 17-NOV-81 5000 10
7369 SMITH CLERK 7902 17-DEC-80 800 20
7566 JONES MANAGER 7839 02-APR-81 2975 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
6 rows selected.
Upvotes: 6
Reputation: 10517
If you want to produce output grouped by section, displaying only the top n records from each section something like this:
SECTION SUBSECTION
deer American Elk/Wapiti
deer Chinese Water Deer
dog Cocker Spaniel
dog German Shephard
horse Appaloosa
horse Morgan
...then the following should work pretty generically with all SQL databases. If you want the top 10, just change the 2 to a 10 toward the end of the query.
select
x1.section
, x1.subsection
from example x1
where
(
select count(*)
from example x2
where x2.section = x1.section
and x2.subsection <= x1.subsection
) <= 2
order by section, subsection;
To set up:
create table example ( id int, section varchar(25), subsection varchar(25) );
insert into example select 0, 'dog', 'Labrador Retriever';
insert into example select 1, 'deer', 'Whitetail';
insert into example select 2, 'horse', 'Morgan';
insert into example select 3, 'horse', 'Tarpan';
insert into example select 4, 'deer', 'Row';
insert into example select 5, 'horse', 'Appaloosa';
insert into example select 6, 'dog', 'German Shephard';
insert into example select 7, 'horse', 'Thoroughbred';
insert into example select 8, 'dog', 'Mutt';
insert into example select 9, 'horse', 'Welara Pony';
insert into example select 10, 'dog', 'Cocker Spaniel';
insert into example select 11, 'deer', 'American Elk/Wapiti';
insert into example select 12, 'horse', 'Shetland Pony';
insert into example select 13, 'deer', 'Chinese Water Deer';
insert into example select 14, 'deer', 'Fallow';
Upvotes: 6
Reputation: 1291
In T-SQL, I would do:
WITH TOPTEN AS (
SELECT *, ROW_NUMBER()
over (
PARTITION BY [group_by_field]
order by [prioritise_field]
) AS RowNo
FROM [table_name]
)
SELECT * FROM TOPTEN WHERE RowNo <= 10
Upvotes: 129
Reputation: 12174
I know this thread is a little bit old but I've just bumped into a similar problem (select the newest article from each category) and this is the solution I came up with :
WITH [TopCategoryArticles] AS (
SELECT
[ArticleID],
ROW_NUMBER() OVER (
PARTITION BY [ArticleCategoryID]
ORDER BY [ArticleDate] DESC
) AS [Order]
FROM [dbo].[Articles]
)
SELECT [Articles].*
FROM
[TopCategoryArticles] LEFT JOIN
[dbo].[Articles] ON
[TopCategoryArticles].[ArticleID] = [Articles].[ArticleID]
WHERE [TopCategoryArticles].[Order] = 1
This is very similar to Darrel's solution but overcomes the RANK problem that might return more rows than intended.
Upvotes: 9
Reputation: 562931
I do it this way:
SELECT a.* FROM articles AS a
LEFT JOIN articles AS a2
ON a.section = a2.section AND a.article_date <= a2.article_date
GROUP BY a.article_id
HAVING COUNT(*) <= 10;
update: This example of GROUP BY works in MySQL and SQLite only, because those databases are more permissive than standard SQL regarding GROUP BY. Most SQL implementations require that all columns in the select-list that aren't part of an aggregate expression are also in the GROUP BY.
Upvotes: 19
Reputation: 103565
If you know what the sections are, you can do:
select top 10 * from table where section=1
union
select top 10 * from table where section=2
union
select top 10 * from table where section=3
Upvotes: 15
Reputation: 204269
This works on SQL Server 2005 (edited to reflect your clarification):
select *
from Things t
where t.ThingID in (
select top 10 ThingID
from Things tt
where tt.Section = t.Section and tt.ThingDate = @Date
order by tt.DateEntered desc
)
and t.ThingDate = @Date
order by Section, DateEntered desc
Upvotes: 36