CandleWax
CandleWax

Reputation: 2219

How to compare items in a group with pandas return boolean?

The first row of every group 'ID' has an end_date, and I am trying to identify any rows within the group, that have a begin_date equal to the end_date of the first row. I need to return the type for the row that matches the dates. If there are multiple matches, the first is sufficient. If there are no matches then return 'non-existent'.

df

ID    color  begin_date    end_date     type
1     red    2017-01-01    2017-01-07   Professional
1     green  2017-01-05    2017-01-07   Aquatic
1     blue   2017-01-07    2017-01-15   Superhero
1     red    2017-01-11    2017-01-22   Chocolate
2     red    2017-02-22    2017-02-26   Professional
2     green  2017-02-26    2017-02-28   Aquatic
2     blue   2017-02-26    2017-02-28   Superhero
2     red    2017-02-27    2017-02-28   Chocolate
3     red    2017-03-11    2017-03-22   Chocolate


if df.groupby('ID')['begin_date'].first() == df.groupby('ID')['end_date'].any():
    return df.groupby('ID')['end_date'].any().to_dict()
else: 
    return 'non-existent'

final df

ID    type     
1     Superhero
2     Aquatic
3     non-existant

Upvotes: 1

Views: 84

Answers (2)

sacuL
sacuL

Reputation: 51335

Here is another way, using groupby(), nth() and reindex():

df.groupby('ID').apply(lambda x: x.loc[x.begin_date.eq(x.end_date.iloc[0]), 'type']).groupby('ID').nth(0).reindex(df['ID'].unique(),fill_value='non existant')

ID
1       Superhero
2         Aquatic
3    non existant

Upvotes: 1

BENY
BENY

Reputation: 323226

IIUC

df.groupby('ID').apply(lambda x :  df.loc[x['begin_date'].isin(x['end_date'].iloc[[0]]).idxmax(),'type'] if x['begin_date'].isin(x['end_date'].iloc[[0]]).any() else 'non-existent')
Out[23]: 
ID
1       Superhero
2         Aquatic
3    non-existent
dtype: object

Upvotes: 2

Related Questions