Reputation: 153
I have thousands of rows of data. Two of these columns are called Area and Sub Area. Each Area has about 4, 5 or more Sub Areas.
I want to select a random sample of records like a Top 10 for each Area and Sub Area but can't think how to do this efficiently or quickly.
I have considered doing a UNION
selecting each Area and Sub Area each time but i would have to do 120 UNION
statements this way as there are 120 rows of distinct Areas and Sub Areas.
The same issue prevails when I considered doing Case
statements in the SELECT
list.
I thought perhaps I could use a CTE
or WHILE LOOP
in some way but not having much luck. Could making use of Window functions be something to use here?
Any ideas?
Upvotes: 1
Views: 55
Reputation: 17943
You can try like following to pick random 10 records for each subarea.
select *
from (select *,
row_number()
over(
partition by subarea
order by newid()) RN
from yourtable) T
where rn <= 10
In above query order by newid()
will give you random records and partition BY subarea
will generate different rownumber for each subarea.
Upvotes: 5