Reputation: 25
I wanted to get to the consecutive number of records a certain field value is stale based on the rates table. From the below data records 3,4,5 have the same rate as 0.770827 ,so the days the rate is stale is 3 and the prior rate before stale is 0.770886.I would like to get help in writing a query to the no of records that have stale rate and also get to the prior rate of the same as well.In the below sample i am only showing CAD to USD ,but we need the same across different currencies . Any assistance would be highly helpful.
Expected Output
Upvotes: 1
Views: 162
Reputation: 222672
This is a gaps and islands problem.
You can use lag()
to retrieve the previous rate
for the same currencies tuple, and then do a window sum to define groups of consecutive records with the same rate. Then, you can aggregate the groups, and recover the previous rate using lag()
again. The last step is to filter on groups that have at least 3 records.
select *
from (
select
from_cur,
to_cur,
rate,
max(date) max_date,
lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
count(*) cnt
from (
select
t.*,
sum(case when rate = lag_rate then 0 else 1 end) over(partition by from_date, to_date order by date) grp
from (
select
t.*,
lag(rate) over(partition by from_cur, to_cur order by date) lag_rate
from mytable t
) t
) t
group by from_cur, to_cur, rate, grp
) t
where cnt >= 3
order by from_cur, to_cur, max_date
Actually, using the difference between row numbers can save one level of nesting:
select *
from (
select
from_cur,
to_cur,
rate,
max(date) max_date,
lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
count(*) cnt
from (
select
t.*,
row_number() over(partition by from_cur, to_cur order by date) rn1,
row_number() over(partition by from_cur, to_cur, rate order by date) rn2
from mytable t
) t
group by from_cur, to_cur, rate, rn1 - rn2
) t
where cnt >= 3
order by from_cur, to_cur, max_date
If you want only the earliest record per currency tuple, then you can use row_number()
:
select *
from (
select
from_cur,
to_cur,
rate,
max(date) max_date,
lag(rate) over(partition by from_cur, to_cur order by max(date)) lag_rate_grp,
count(*) cnt,
row_number() over(partition by from_cur, to_cur, case when count(*) >= 3 then 0 else 1 end order by max(date)) rn
from (
select
t.*,
row_number() over(partition by from_cur, to_cur order by date) rn1,
row_number() over(partition by from_cur, to_cur, rate order by date) rn2
from mytable t
) t
group by from_cur, to_cur, rate, rn1 - rn2
) t
where cnt >= 3 and rn = 1
order by from_cur, to_cur
Upvotes: 2
Reputation: 1271003
This is a gap-and-islands problem, but I would solve it just by subtracting a sequence from the date. And then aggregating:
select to_cur, from_cur, rate, min(date), max(date),
count(*) as days_stale
from (select r.*,
row_number() over (partition by to_cur, from_cur, rate order by date) as seqnum
from rates r
) r
group by (date - seqnum * interval '1' day)
Upvotes: 0
Reputation: 14858
When value changes mark row with 1, otherwise 0. Then sum this column (flg
), you have now consecutive groups (grp
). Use grp
to aggregate, count, show min and max dates:
select to_cur, from_cur, min(dt) dt_from, max(dt) dt_to, rate, count(1) cnt
from (
select dt, to_cur, from_cur, rate,
sum(flg) over (partition by to_cur, from_cur order by dt) grp
from (
select dt, to_cur, from_cur, rate,
case lag(rate) over (partition by to_cur, from_cur order by dt)
when rate then 0 else 1 end flg
from t))
group by grp, to_cur, from_cur, rate
order by from_cur, to_cur, min(dt)
If you want any specific group after group by add:
having count(1) >= 3
Upvotes: 2