Reputation: 1599
I have table A
, containing employee IDs
and sequential date steps since their employment start date
(in this case 1x
employee who started on 1/1/1900
).
employee_id date
1xxxxxxx 1/1/1900
1xxxxxxx 4/1/1900
1xxxxxxx 7/1/1900
1xxxxxxx 10/1/1900
1xxxxxxx 1/1/1901
1xxxxxxx 4/1/1901
1xxxxxxx 7/1/1901
1xxxxxxx 10/1/1901
I also have table B, containing all ratings an employee ever received:
employee_id rating date
1xxxxxxx 2 1/1/1900
1xxxxxxx 3 6/1/1900
1xxxxxxx 4 12/31/1900
1xxxxxxx 4 8/2/1901
1xxxxxxx 5 10/1/1901
How would I join B
to A
to obtain the most current rating for the employee as of date A.date
? This is the desired output:
employee_id date rating
1xxxxxxx 1/1/1900 2
1xxxxxxx 4/1/1900 2
1xxxxxxx 7/1/1900 3
1xxxxxxx 10/1/1900 3
1xxxxxxx 1/1/1901 4
1xxxxxxx 4/1/1901 4
1xxxxxxx 7/1/1901 4
1xxxxxxx 10/1/1901 5
Upvotes: 1
Views: 2411
Reputation: 20730
(This is supplemental to @GordonLinoff answer)
Subquery decorrelation in Presto currently is able to to decorrelate a predicate when it's "near the top". It is not currently capable of decorrelating ORDER BY
or TopN
(ORDER BY .. LIMIT
) (issue). This can be worked around by using max_by
instead. This turns a subquery into aggregation query which Presto can decorrelate.
WITH a AS ( SELECT 1 employee_id, 2 date ),
b AS ( SELECT * FROM (VALUES (1, 2, 2), (1, 1, 1), (1, 3, 3)) t(employee_id, date, rating) )
select a.*,
(select min_by(b.rating, b.date)
from b
where b.employee_id = a.employee_id and
b.date <= a.date
) as rating
from a;
employee_id | date | rating
-------------+------+--------
1 | 2 | 1
(1 row)
Upvotes: 3
Reputation: 1269883
A simple way to express this uses a correlated subquery:
select a.*,
(select b.rating
from b
where b.employee_id = a.employee_id and
b.date <= a.date
order by b.date desc
limit 1
) as rating
from a;
If you need more than one column, then a more complicated query is needed. One approach is:
select ab.*
from (select a.*, b.*, -- list the columns you want
row_number() over (partition by a.employee_id, a.date order by b.date desc) as seqnum
from a left join
b
on b.employee_id = a.employee_id and
b.date <= a.date
) ab
where seqnum = 1;
Upvotes: 1