Sergejs Tretjakovs
Sergejs Tretjakovs

Reputation: 15

SQL get rank of ordered data

I have a data set that looks like this (ordered by date):

date value first_id second_id
2020-01-01 10 1 1
2020-01-02 15 1 1
2020-01-03 5 1 2
2020-01-04 75 2 2
2020-01-05 101 2 2
2020-01-06 12 1 1
2020-01-07 5 1 1
2020-01-08 14 1 2

I need to get an aggregation when values are the same for the same first_id and second_id in a sequence, lets say max(value), so I can get:

max_value first_id second_id
15 1 1
5 1 2
101 2 2
12 1 1
14 1 2

If you do max(value) and group by, same first_id and second_id combinations will give just one row (regardless of date ordering). I was thinking to add RANK when one of ids changes, e.g:

date value first_id second_id rank
2020-01-01 10 1 1 1
2020-01-02 15 1 1 1
2020-01-03 5 1 2 2
2020-01-04 75 2 2 3
2020-01-05 101 2 2 3
2020-01-06 12 1 1 4
2020-01-07 5 1 1 4
2020-01-08 14 1 2 5

But I don't know how to get that rank as well since same id combinations are considered together.

Upvotes: 1

Views: 61

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

You can use lag() and a cumulative sum to define the groups and then aggregate. You can see the groups if you run this query:

select t.*,
       sum(case when prev_date = prev_date2 then 0 else 1 end) over (order by date) as grp
from (select t.*,
             lag(date) over (order by date) as prev_date,
             lag(date) over (partition by first_id, second_id order by date) as prev_date2
      from t
     ) t;

The logic is saying that a new group starts when the previous date does not have the same values of the two id columns.

Then the aggregation is:

with grps as (
      select t.*,
             sum(case when prev_date = prev_date2 then 0 else 1 end) over (order by date) as grp
      from (select t.*,
                   lag(date) over (order by date) as prev_date,
                   lag(date) over (partition by first_id, second_id order by date) as prev_date2
            from t
           ) t
      )
select first_id, second_id, max(value), min(date), max(date)
from grps
group by grp

Upvotes: 1

Related Questions