SSingh
SSingh

Reputation: 153

How to Obtain Random Samples of Particular Groupings of Data

I have thousands of rows of data. Two of these columns are called Area and Sub Area. Each Area has about 4, 5 or more Sub Areas.

I want to select a random sample of records like a Top 10 for each Area and Sub Area but can't think how to do this efficiently or quickly.

I have considered doing a UNION selecting each Area and Sub Area each time but i would have to do 120 UNION statements this way as there are 120 rows of distinct Areas and Sub Areas.

The same issue prevails when I considered doing Case statements in the SELECT list.

I thought perhaps I could use a CTE or WHILE LOOP in some way but not having much luck. Could making use of Window functions be something to use here?

Any ideas?

Upvotes: 1

Views: 55

Answers (1)

PSK
PSK

Reputation: 17943

You can try like following to pick random 10 records for each subarea.

select * 
from   (select *, 
               row_number() 
                 over( 
                   partition by subarea 
                   order by newid()) RN 
        from   yourtable) T 
where  rn <= 10 

In above query order by newid() will give you random records and partition BY subarea will generate different rownumber for each subarea.

Upvotes: 5

Related Questions