Corovei Andrei
Corovei Andrei

Reputation: 1668

SQL server query with top and group by

Given a table of the following structure:

subscriber_id,band

11,1

12,1

13,1

...

21,2

22,2

23,2

24,2

...

n1,n

n2,n

n3,n

...

nm,n

I want to get a subgroup of n% size of the subscribers from each group. For 10%, I should get 10% of group 1, 10% of group 2... 10% of group n.

Upvotes: 1

Views: 57

Answers (2)

Azar
Azar

Reputation: 1867

Try This

Select * from 
(
Select *, NTILE(n%) over(partition by id order by id) 'R' from t)t
where t.R<=(n%)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269703

It sounds like you want a stratified sample. You can get this by first enumerating within each group and then choosing the "n" records you want. Here is an example of how to do this in SQL Server:

select t.id, t.band
from (select t.*,
             row_number() over (order by band_seqnum) as seqnum
      from (select t.*,
                   row_number() over (partition by band order by rand(checksum()) as band_seqnum,
                   count(*) over () as cnt
            from t
           ) t
     ) t
where band_seqnum <= 0.10 * cnt;

Upvotes: 1

Related Questions