Reputation: 23
given a table foo of the following structure (Oracle 11g):
ID | GROUP_ID
1 | 100
2 | 100
3 | 100
4 | 200
5 | 300
6 | 300
7 | 400
I want to select the first n rows (ordered by ID) or more, such that I always get a complete group.
Example:
n = 2: I want to get at least the first two rows, but since ID 3 also belongs to group 100, I want to get that as well.
n = 4: Give me the first four rows and I am happy ;-)
n = 5: Rows 1-6 are requested.
Your help is highly appreciated!
Upvotes: 2
Views: 2902
Reputation: 18410
Solution using rank()
:
select id, group_id
from (select t.*, rank() over (order by group_id) as rnk
from t)
where rnk <= :n;
Building test data:
SQL> create table t (id number not null primary key
2 , group_id number not null);
Table created.
SQL> insert into t values (1, 100);
1 row created.
SQL> insert into t values (2, 100);
1 row created.
SQL> insert into t values (3, 100);
1 row created.
SQL> insert into t values (4, 200);
1 row created.
SQL> insert into t values (5, 300);
1 row created.
SQL> insert into t values (6, 300);
1 row created.
SQL> insert into t values (7, 400);
1 row created.
SQL> commit;
Commit complete.
SQL>
Running...
SQL> var n number
SQL> exec :n := 2;
PL/SQL procedure successfully completed.
SQL> select id, group_id
2 from (select t.*, rank() over (order by group_id) as rnk
3 from t)
4 where rnk <= :n;
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
SQL> exec :n := 4;
PL/SQL procedure successfully completed.
SQL> select id, group_id
2 from (select t.*, rank() over (order by group_id) as rnk
3 from t)
4 where rnk <= :n;
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
4 200
SQL> exec :n := 5;
PL/SQL procedure successfully completed.
SQL> select id, group_id
2 from (select t.*, rank() over (order by group_id) as rnk
3 from t)
4 where rnk <= :n;
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
4 200
5 300
6 300
6 rows selected.
EDIT Here is version that includes the for update
clause (:n = 2):
SQL> select id, group_id
2 from T
3 where rowid in (select RID
4 from (select t.rowid as RID, t.*, rank() over (order by group_id) as rnk
5 from t)
6 where rnk <= :n)
7 for update;
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
Upvotes: 7
Reputation: 43533
If we assume that the group_id's are contiguous and ascending, then @Shannon's answer works perfectly. If we do not make that assumption, and we have data that looks like this, for example:
SQL> select * from foo order by id;
ID GROUP_ID
-- --------
1 100
2 100
3 100
4 200
6 100
7 400
9 500
10 500
11 500
12 600
Then it's a stickier problem. For example, if N = 3, 4, or 5, then we need to get the rows through ID = 6. For N = 6, we need up to ID = 7. For N = 7, we need through ID = 11.
I believe this query works regardless of the order of group_id:
For N = 7:
WITH q AS (SELECT ID, group_id
, row_number() OVER (ORDER BY ID) rn
, MAX(id) OVER (PARTITION BY group_id) rn2
FROM foo)
SELECT ID, group_id FROM q
WHERE ID <= (SELECT max(rn2) FROM q WHERE rn <= :N)
ORDER BY ID;
ID GROUP_ID
-- --------
1 100
2 100
3 100
4 200
6 100
7 400
9 500
10 500
11 500
9 rows selected
For N = 6:
ID GROUP_ID
-- --------
1 100
2 100
3 100
4 200
6 100
7 400
For N = 1:
ID GROUP_ID
-- --------
1 100
2 100
3 100
4 200
6 100
Upvotes: 0
Reputation: 86745
If your ID
s are always sequential (without gaps) from 1. And if your Group_ID
s never occur as a second group elsewhere. And if your Group_ID
s are always ascending in value...
SELECT
*
FROM
foo
WHERE
Group_ID <= (SELECT Group_ID FROM foo WHERE ID = n)
ORDER BY
ID
You'll benefit here from having separate indexes on ID
and Group_ID
Upvotes: 0
Reputation: 146239
If it is always true that GROUP_ID
is contiguous and ascending, then this is easily solved with SQL using an analytical ROW_NUMBER()
function:
SQL> select id
2 , group_id
3 from foo
4 where group_id <= ( select group_id
5 from (
6 select f.group_id
7 , row_number() over (order by f.id asc) rn
8 from foo f
9 )
10 where rn = &n )
11 order by id
12 /
Enter value for n: 2
old 10: where rn = &n )
new 10: where rn = 2 )
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
SQL> r
1 select id
2 , group_id
3 from foo
4 where group_id <= ( select group_id
5 from (
6 select f.group_id
7 , row_number() over (order by f.id asc) rn
8 from foo f
9 )
10 where rn = &n )
11* order by id
Enter value for n: 4
old 10: where rn = &n )
new 10: where rn = 4 )
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
4 200
SQL> r
1 select id
2 , group_id
3 from foo
4 where group_id <= ( select group_id
5 from (
6 select f.group_id
7 , row_number() over (order by f.id asc) rn
8 from foo f
9 )
10 where rn = &n )
11* order by id
Enter value for n: 5
old 10: where rn = &n )
new 10: where rn = 5 )
ID GROUP_ID
---------- ----------
1 100
2 100
3 100
4 200
5 300
6 300
6 rows selected.
SQL>
Upvotes: 0