Pandas - identify last entry of an item

Question

I have a dataframe that looks like:

df = pd.DataFrame({'ID': [4]*2 +[5]*4 , 'Timestamp' : pd.date_range('2000-01-01', periods=6)}) 

 ID  Timestamp 
0   4 2000-01-01 
1   4 2000-01-02  
2   5 2000-01-03  
3   5 2000-01-04  
4   5 2000-01-05  
5   5 2000-01-06

I want the last entry in the ID column to be identified (marked as 0). So the desired output should be something like:

 ID  Timestamp   last
0   4 2000-01-01  -
1   4 2000-01-02  0
2   5 2000-01-03  -
3   5 2000-01-04  -
4   5 2000-01-05  -
5   5 2000-01-06  0

How can I do this? I tried with the various options of df.duplicated('ID'), but with that I can drop the last entry, but not identify/mark the last entry

jezrael · Accepted Answer

Add parameter keep='last' for return mask with True for not last values and False for last values:

df['last'] = np.where(df.duplicated('ID', keep='last'), '-', '0') 
print (df)
   ID  Timestamp last
0   4 2000-01-01    -
1   4 2000-01-02    0
2   5 2000-01-03    -
3   5 2000-01-04    -
4   5 2000-01-05    -
5   5 2000-01-06    0

Detail:

print (df.duplicated('ID', keep='last'))
0     True
1    False
2     True
3     True
4     True
5    False
dtype: bool

Pandas - identify last entry of an item

Answers (2)

Related Questions