siwei
siwei

Reputation: 83

SQL how to select n row from each interval of one column

For example, the table looks like

a b c
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 1 1
7 1 1
8 1 1
9 1 1
10 1 1
11 1 1

I want to randomly pick 2 rows from every interval based on column a, where a ~ [0, 2], a ~ [4, 6], a ~ [9-20].

Another more complicated case would be select n rows from every interval based on multiple columns, for example in this case the interval will be a ~ [0, 2], a ~ [4, 6], b ~ [7, 9], ...

Is there a way to do so with just SQL?

Upvotes: 2

Views: 253

Answers (2)

Felipe Hoffa
Felipe Hoffa

Reputation: 59165

Find out to which interval each row belongs, order by random partitioned by an interval id, get the top n rows for each interval:

create transient table mytable as
select seq8() id, random() data
from table(generator(rowcount => 100)) v;

create transient table intervals as
select 0 i_start, 6 i_end, 2 random_rows
union all select 7, 20, 1
union all select 21, 30, 3
union all select 31, 50, 1;

select *
from (
    select *
      , row_number() over(partition by i_start order by random()) rn
    from mytable a
    join intervals b
    on a.id between b.i_start and b.i_end
)
where rn<=random_rows

enter image description here

Edit: Shorter and cleaner.

select a.*
from mytable a
join intervals b
on a.id between b.i_start and b.i_end
qualify row_number() over(partition by i_start order by random()) <= random_rows

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269443

To get two rows per group, you want to use row_number(). To define the groups, you can use a lateral join to define the groupings:

select t.*
from (select t.*,
             row_number() over (partition by v.grp order by random()) as seqnum
      from t cross join lateral
           (values (case when a between 0 and 2 then 1
                         when a between 4 and 6 then 2
                         when a between 7 and 9 then d
                    end)
           ) v(grp)
      where grp is not null
     ) t
where seqnum <= 2;

You can adjust the case expression to define whatever groups you like.

Upvotes: 0

Related Questions