Andy Mitchell
Andy Mitchell

Reputation: 77

Obtaining the value the last value in a series to match certain criteria in pandas

I have a dataframe with raw data, including pertinent data shown below, which is ordered by timestamp.

df1

   Line   Timestamp  Depot  Product
0  Line1    08:00       1     P1
1  Line1    08:10       1     P1
2  Line2    08:15       1     P4
3  Line1    08:20       1     P1
4  Line3    08:23       2     P13
5  Line1    08:30       2     P1

I have a summary tables for each production line derived from this, with pertinent data as shown below:

df2

   Line   Product  Depot  Time_Complete
0  Line 1   P1       1      NaN
1  Line 1   P1       2      NaN
2  Line 1   P2       1      NaN
3  Line 1   P2       1      NaN

What I'm trying to do is take the last timestamp from the first dataframe matching particular criteria. In this case if Line 1, P1, Depot 1 has the last time stamp would be 08:20. I'm not 100% sure where to start with this type of wrangling

Upvotes: 1

Views: 40

Answers (1)

jezrael
jezrael

Reputation: 862681

I think need drop_duplicates by specified columns and then merge with left join:

df11 = df1.drop_duplicates(['Line','Depot','Product'], keep='last')
df = df2.merge(df11, on=['Line','Depot','Product'], how='left')
print (df)
    Line Product  Depot  Time_Complete Datestamp
0  Line1      P1      1            NaN     08:20
1  Line1      P1      2            NaN     08:30
2  Line1      P2      1            NaN       NaN
3  Line1      P2      1            NaN       NaN

Upvotes: 1

Related Questions