LuFaMa
LuFaMa

Reputation: 346

SQL - How to select first n rows based on row values

My title is probably not very clear; hopefully I can make more sense here.

I'm working with an Oracle DB.

I have a table called EMPLOYEES

There's a cloumn in the table called DIVISION

Every employee belongs to a given division.

I want to select the first 5 employees per division.

For example if my table looks like:

EMPLOYEE_NO |   DIVISION
----------------------
Mike        |   1
John        |   1
Peter       |   3
Paul        |   2
Mary        |   1
Joanne      |   2
Kristine    |   3
Adam        |   1
Brian       |   3
Joel        |   3
Amy         |   2
Ben         |   2
Ryan        |   1

I want to be able to query the table and get 3 employees from each division. So my result will be something like:

EMPLOYEE_NO |   DIVISION
----------------------
Mike        |   1
John        |   1
Mary        |   1
Joanne      |   2
Amy         |   2
Ben         |   2
Kristine    |   3
Brian       |   3
Joel        |   3

I don't care if it's the first 3 members, or if it's the last 3, or if it's a random selection of 3. Essentially I want a sampling of employees from each division.

I found a similar question already: How to select the first N rows of each group?

But my concern is if this solution wouldn't be a good fit for a DB with millions of records. I'm working with about 3 million reocrds.

Upvotes: 0

Views: 827

Answers (1)

Alex Poole
Alex Poole

Reputation: 191255

You could use an analytic function in a subquery:

select employee_no, division
from (
  select employee_no, division,
    row_number() over (partition by division) as rn
)
where rn <= 3
order by division, employee_no;

You could also use rank or dense_rank. Normally you'd use an order by clause in the window. Without one the rows you get back are non-deterministic, which you want (though you'd probably get the same result from running the query several times), but if you do want a more random sample then you could order by a random value:

  row_number() over (partition by division order by dbms_random.value) as rn

Upvotes: 1

Related Questions