Reputation: 1105
So I have a df like this:
ID fruit
001 grapes
002 apples
002 mangos
003 bananas
004 oranges
004 grapes
And I want to join the following onto it:
ID store_time
001 2021-04-02 03:02:00.321
002 2021-04-02 02:02:00.319
002 2021-04-03 12:02:00.319
002 2021-04-04 13:02:00.312
003 2021-04-02 19:02:00.313
004 2021-04-02 15:02:00.122
004 2021-04-01 11:02:00.121
So all I want to do is join based on just the most recent timestamp. So leave the others behind and have only the number of rows as there are in the fruit df.
Final output:
ID fruit timestamp
001 grapes 2021-04-02 03:02:00.321
002 apples 2021-04-04 13:02:00.312
002 mangos 2021-04-04 13:02:00.312
003 bananas 2021-04-02 19:02:00.313
004 oranges 2021-04-02 15:02:00.122
004 grapes 2021-04-02 15:02:00.122
Upvotes: 1
Views: 669
Reputation: 133370
you need a subquery for max tme stamp
select a.id, a.fruit, b.max_time
from my_table_fruit a
inner join (
select id, max(store_time) max_time
from my_table_time
) b on b.id = a.id
Upvotes: 0
Reputation: 164099
Aggregate in the 2nd table to get the most recent store_time
for each ID
and then join to the 1st table:
SELECT t1.ID, t1.fruit, t2.timestamp
FROM table1 t1
LEFT JOIN (
SELECT ID, MAX(store_time) timestamp
FROM table2
GROUP BY ID
) t2 ON t2.ID = t1.ID
I used a LEFT
join just in case table2
does not contain all the ID
s of table1
.
If this is not the case then you can change it to an INNER
join.
Upvotes: 1