Reputation: 346
My title is probably not very clear; hopefully I can make more sense here.
I'm working with an Oracle DB.
I have a table called EMPLOYEES
There's a cloumn in the table called DIVISION
Every employee belongs to a given division.
I want to select the first 5 employees per division.
For example if my table looks like:
EMPLOYEE_NO | DIVISION
----------------------
Mike | 1
John | 1
Peter | 3
Paul | 2
Mary | 1
Joanne | 2
Kristine | 3
Adam | 1
Brian | 3
Joel | 3
Amy | 2
Ben | 2
Ryan | 1
I want to be able to query the table and get 3 employees from each division. So my result will be something like:
EMPLOYEE_NO | DIVISION
----------------------
Mike | 1
John | 1
Mary | 1
Joanne | 2
Amy | 2
Ben | 2
Kristine | 3
Brian | 3
Joel | 3
I don't care if it's the first 3 members, or if it's the last 3, or if it's a random selection of 3. Essentially I want a sampling of employees from each division.
I found a similar question already: How to select the first N rows of each group?
But my concern is if this solution wouldn't be a good fit for a DB with millions of records. I'm working with about 3 million reocrds.
Upvotes: 0
Views: 827
Reputation: 191255
You could use an analytic function in a subquery:
select employee_no, division
from (
select employee_no, division,
row_number() over (partition by division) as rn
)
where rn <= 3
order by division, employee_no;
You could also use rank
or dense_rank
. Normally you'd use an order by
clause in the window. Without one the rows you get back are non-deterministic, which you want (though you'd probably get the same result from running the query several times), but if you do want a more random sample then you could order by a random value:
row_number() over (partition by division order by dbms_random.value) as rn
Upvotes: 1