Reputation: 11
I need to select a random sample using TSQL from a table based on ratios of 2 different variables in the table.
The random sample required is approximately 8000 records from a table with about 381,000 records. The random sample must have approximate ratios of 2 variables:
4:1 (Male/Female) - 2 category variable 4:3:2:1 (Heavy/Moderate/Light/Very Light) - 4 category variable
Upvotes: 0
Views: 79
Reputation: 45106
Break it down to how many of each
select top (640) *
from table
where sex = 'f'
and cat = 'heavy'
order by NewID()
union all
select top (480) *
from table
where sex = 'f'
and cat = 'medium'
order by NewID()
...
4 + 1 = 5
4 + 3 + 2 + 1 = 10
640 = 8000 / 5 * 4 / 10
Upvotes: 3