Anurag Rawat
Anurag Rawat

Reputation: 485

pandas create a new column with condition (max date and == condition on existing columns)

I have a dataframe with following data:

id|invoice_no|invoice_date|change_previous_month|change_status

984974|110|2016-12-31|0|A

984974|8202|2017-01-30|-64864|D

115677|5505|2016-12-31|0|A

115677|5635|2017-01-30|58730|U

i want to create a new column 'event_date' from 'invoice_data' based on conditions on existing columns. the conditions are:

1) select maximum 'invoice_date' that is defined as the latest date and

2) select 'change_status' where 'change_status' == 'A' or 'change_status' == 'U'

The resulted dataframe should look like this:

id|invoice_no|invoice_date|change_previous_month|change_status|event_date

984974|110|2016-12-31|0|A|2016-12-31

984974|8202|2017-01-30|-64864|D|2016-12-31

115677|5505|2016-11-31|0|A|2017-01-30

115677|5635|2017-01-30|58730|U|2017-01-30

the event_date should be created from the invoice_date with satisfying the above two conditions ,thanks in advance for your help.

Upvotes: 1

Views: 542

Answers (1)

jezrael
jezrael

Reputation: 862801

I think you need:


s = (df[df['change_status'].isin(['A','U'])]
                            .sort_values('invoice_date')
                            .drop_duplicates('id', keep='last')
                            .set_index('id')['invoice_date'])

df['event_date'] = df['id'].map(s)
print (df)
       id  invoice_no invoice_date  change_previous_month change_status  \
0  984974         110   2016-12-31                      0             A   
1  984974        8202   2017-01-30                 -64864             D   
2  115677        5505   2016-12-31                      0             A   
3  115677        5635   2017-01-30                  58730             U   

  event_date  
0 2016-12-31  
1 2016-12-31  
2 2017-01-30  
3 2017-01-30  

Upvotes: 2

Related Questions