Will
Will

Reputation: 11

TSQL random sample

I need to select a random sample using TSQL from a table based on ratios of 2 different variables in the table.

The random sample required is approximately 8000 records from a table with about 381,000 records. The random sample must have approximate ratios of 2 variables:

4:1 (Male/Female) - 2 category variable 4:3:2:1 (Heavy/Moderate/Light/Very Light) - 4 category variable

Upvotes: 0

Views: 79

Answers (1)

paparazzo
paparazzo

Reputation: 45106

Break it down to how many of each

select top (640) * 
from table 
where sex = 'f' 
  and cat = 'heavy' 
order by NewID() 
union all
select top (480) * 
from table 
where sex = 'f' 
  and cat = 'medium' 
order by NewID()  
...

4 + 1 = 5
4 + 3 + 2 + 1 = 10
640 = 8000 / 5 * 4 / 10

Upvotes: 3

Related Questions