Reputation: 53
I am trying to query a table by taking the maximum values from two different date columns, and output's all the records that have maximum of both the dates
The table has 6 columns which include st_id(string)(there are multiple entries of the same id), as_of_dt(int) and ld_dt_ts(timestamp). From this table, I am trying to get the max value of as_of_dt and ld_dt_ts and group by st_id and display all the records.
SELECT A.st_id, A.fl_vw, A.tr_record FROM db.tablename A
INNER JOIN (
SELECT st_id, max(as_of_dt) AS as_of_dt, max(ld_dt_ts) AS ld_dt_ts
From db.tablename
group by st_id
) B on A.st_id = B.st_id and A.as_of_dt = B.as_of_dt and A.ld_dt_ts= B.ld_dt_ts
--
The expected result should return the st_id that has the maximum of both as_of_dt and ld_dt_ts i.e., which will be the latest record for each st_id.
Upvotes: 2
Views: 6280
Reputation: 38290
Use analytic rank()
function. rank()
will assign 1
to all records with max date in the st_id
partition:
SELECT s.st_id, s.fl_vw, s.tr_record
from
(
SELECT A.st_id, A.fl_vw, A.tr_record,
rank() over(partition by st_id order by as_of_dt desc) rnk_as_of_dt,
rank() over(partition by st_id order by ld_dt_ts desc) rnk_ld_dt_tsrnk
FROM db.tablename A
)s
WHERE rnk_as_of_dt=1 ANDrnk=1 rnk_ld_dt_ts=1 --get records with max dates in both columns
Two ranks can be possibly combined like this:
SELECT s.st_id, s.fl_vw, s.tr_record
from
(
SELECT A.st_id, A.fl_vw, A.tr_record,
rank() over(partition by st_id order by as_of_dt desc, ld_dt_ts desc) rnk
FROM db.tablename A
)s
WHERE rnk=1 --get records with max dates combination
but this is not exactly the same as your original query. For example if you have this dataset:
st_id, as_of_dt, ld_dt_ts
1 1 2
1 2 1
Then this query
SELECT st_id, max(as_of_dt) AS as_of_dt, max(ld_dt_ts) AS ld_dt_ts
From db.tablename
group by st_id
will return:
st_id, as_of_dt, ld_dt_ts
1 2 2
And final join will not return any rows, because no any row with such combination exist, while query with two ranks combined will return:
st_id, as_of_dt, ld_dt_ts
1 2 1
If such dataset can not exist in your data (say, ld_dt_ts is always>=as_of_dt), then you possibly can combine ranks into single one or even use only one date in the order by.
Upvotes: 2
Reputation: 53
--I am using row_number() function,--
row_number() over(PARTITION BY st_id ORDER BY as_of_dt DESC, ld_dt_ts DESC) RN from db.tablename s)tmp where RN=1
Upvotes: 1