user131983
user131983

Reputation: 3937

Getting the most recent data for each value in another column

I am trying to get the data associated with the most recent curve_date corresponding to each tenor_years value and am using the query below to do this. However, I am not getting the data as I would like it to be.

SELECT
  tenor_years,
  yield_pct,
  MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years, yield_pct
order by tenor_years

SELECT * FROM yc_node_hist where fk_yc_update = 12 gives the data below:

id      fk_yc_update curve_date  tenor_years    yield_pct     
353443  12           2013-07-26       1           0.1436
353444  12           2013-07-29       1           0.1389
353445  12           2013-07-30       1           0.133

The data comes out as follows:

tenor_years   yield_pct   curve_date
1              0.0828     2014-05-14
1              0.0832     2014-05-19

I want to get something like:

tenor_years   yield_pct   curve_date
1              0.0828     2014-05-14
2              0.3232     2015-06-17
..
30

Thank You

Upvotes: 1

Views: 43

Answers (4)

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726579

SQL Server offers PARTITION/OVER functionality for situations like that.

SELECT tenor_years,
        yield_pct,
        MostRecentDate
 FROM (
    SELECT
        tenor_years,
        yield_pct,
        curve_date AS "MostRecentDate",
        RANK() OVER (PARTITION BY tenor_years ORDER BY curve_date DESC) N
    FROM yc_node_hist
    where fk_yc_update = 12
)M
WHERE N = 1
ORDER BY tenor_years

This produces a fast query with a projection, avoiding the need to join back to the original.

Demo.

Upvotes: 1

ggornic
ggornic

Reputation: 109

SELECT tenor_years,
       yield_pct,
       curve_date
FROM (
        SELECT tenor_years,
               yield_pct,
               curve_date,
               MAX(curve_date) OVER (PARTITION BY tenor_years) max_date
        FROM yc_node_hist
        WHERE fk_yc_update = 12
      ) [anon]
WHERE curve_date = max_date
ORDER BY tenor_years;

Upvotes: 0

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

with x as
(SELECT
tenor_years,
--yield_pct,
MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years --, yield_pct
 )
select x.tenor_years, y.yield_pct, y.mostrecentdate
from x join yc_node_hist y on x.mostrecentdate = y.curvedate

CTE is one way to achieve it and the grouping on yield_pct has to be removed.

Upvotes: 0

Norbert
Norbert

Reputation: 6084

You have to remove the yield_pct from the group by:

SELECT
  tenor_years,
  MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years;

And then join back on itself:

SELECT a.tenor_years, a.curve_date,a.yield_pct
FROM yc_node_hist a
INNER JOIN (
  SELECT
    tenor_years,
    MAX(curve_date) AS "MostRecentDate"
  FROM yc_node_hist
  where fk_yc_update = 12) b 
ON a.tenor_years=b.tenor_years AND a.curve_date=b.MostRecentDate
ORDER BY tenor_years ASC;
GROUP BY tenor_years;

Upvotes: 1

Related Questions