krash
krash

Reputation: 25

Get the no of consecutive days a Field value is Stale

I wanted to get to the consecutive number of records a certain field value is stale based on the rates table. From the below data records 3,4,5 have the same rate as 0.770827 ,so the days the rate is stale is 3 and the prior rate before stale is 0.770886.I would like to get help in writing a query to the no of records that have stale rate and also get to the prior rate of the same as well.In the below sample i am only showing CAD to USD ,but we need the same across different currencies . Any assistance would be highly helpful.

enter image description here

Expected Output

enter image description here

Upvotes: 1

Views: 162

Answers (3)

GMB
GMB

Reputation: 222672

This is a gaps and islands problem.

You can use lag() to retrieve the previous rate for the same currencies tuple, and then do a window sum to define groups of consecutive records with the same rate. Then, you can aggregate the groups, and recover the previous rate using lag() again. The last step is to filter on groups that have at least 3 records.

select *
from (
    select
        from_cur,
        to_cur,
        rate,
        max(date) max_date,
        lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
        count(*) cnt
    from (
        select
            t.*,
            sum(case when rate = lag_rate then 0 else 1 end) over(partition by from_date, to_date order by date) grp 
        from (
            select 
                t.*,
                lag(rate) over(partition by from_cur, to_cur order by date) lag_rate
            from mytable t
        ) t
    ) t
    group by from_cur, to_cur, rate, grp
) t
where cnt >= 3
order by from_cur, to_cur, max_date

Actually, using the difference between row numbers can save one level of nesting:

select *
from (
    select
        from_cur,
        to_cur,
        rate,
        max(date) max_date,
        lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
        count(*) cnt
    from (
        select
            t.*,
            row_number() over(partition by from_cur, to_cur order by date) rn1,
            row_number() over(partition by from_cur, to_cur, rate order by date) rn2         
        from mytable t
    ) t
    group by from_cur, to_cur, rate, rn1 - rn2
) t
where cnt >= 3
order by from_cur, to_cur, max_date

If you want only the earliest record per currency tuple, then you can use row_number():

select *
from (
    select
        from_cur,
        to_cur,
        rate,
        max(date) max_date,
        lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
        count(*) cnt,
        row_number() over(partition by from_cur, to_cur, case when count(*) >= 3 then 0 else 1 end order by max(date)) rn
    from (
        select
            t.*,
            row_number() over(partition by from_cur, to_cur order by date) rn1,
            row_number() over(partition by from_cur, to_cur, rate order by date) rn2         
        from mytable t
    ) t
    group by from_cur, to_cur, rate, rn1 - rn2
) t
where cnt >= 3 and rn = 1
order by from_cur, to_cur

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271003

This is a gap-and-islands problem, but I would solve it just by subtracting a sequence from the date. And then aggregating:

select to_cur, from_cur, rate, min(date), max(date),
       count(*) as days_stale
from (select r.*,
             row_number() over (partition by to_cur, from_cur, rate order by date) as seqnum
      from rates r
     ) r
group by (date - seqnum * interval '1' day)

Upvotes: 0

Ponder Stibbons
Ponder Stibbons

Reputation: 14858

When value changes mark row with 1, otherwise 0. Then sum this column (flg), you have now consecutive groups (grp). Use grp to aggregate, count, show min and max dates:

dbfiddle demo

select to_cur, from_cur, min(dt) dt_from, max(dt) dt_to, rate, count(1) cnt
  from (
    select dt, to_cur, from_cur, rate, 
           sum(flg) over (partition by to_cur, from_cur order by dt) grp
      from (
        select dt, to_cur, from_cur, rate, 
               case lag(rate) over (partition by to_cur, from_cur order by dt) 
                    when rate then 0 else 1 end flg
          from t))
  group by grp, to_cur, from_cur, rate
  order by from_cur, to_cur, min(dt)

If you want any specific group after group by add:

  having count(1) >= 3

Upvotes: 2

Related Questions