Reputation: 5097
I have a data-frame (currentQuarter
) which looks like
CODE SIP ... STATUS STATUS_CHANGE_DATE
0 AA YES ... CURRENT 2019-01-01
1 AA YES ... LEAVER 2019-02-05
2 AA YES ... LEAVER 2019-09-03
3 AA YES ... JOINER 2019-01-05
4 AA YES ... CURRENT 2019-03-08
I am trying to change the STATUS
of any record to CURRENT
where the STATUS
is not CURRENT
and the STATUS_CHANGE_DATE
is in the future. So in this example the referencePeriodStartDate
is 2019-07-25. Which would mean the third records STATUS
field would change to CURRENT
:
CODE SIP ... STATUS STATUS_CHANGE_DATE
0 AA YES ... CURRENT 2019-01-01
1 AA YES ... LEAVER 2019-02-05
2 AA YES ... CURRENT 2019-09-03
3 AA YES ... JOINER 2019-01-05
4 AA YES ... CURRENT 2019-03-08
I can filter the df
to create the query to find the records however I can't get the where
statement to change the affected records:
referencePeriodStartDate = datetime.datetime.strptime(referenceDate , '%d/%m/%Y')
not_current = currentQuarter['STATUS'].isin(leaverList)
currentQuarter['STATUS_CHANGE_DATE'] = pd.to_datetime(currentQuarter['STATUS_CHANGE_DATE'], errors='coerce')
currentQuarter['STATUS_CHANGE_DATE'].dt.strftime('%d/%m/%Y')
referenceError = currentQuarter['STATUS_CHANGE_DATE'] > referencePeriodStartDate
currentQuarter['STATUS'] = np.where(not_current & referenceError, "CURRENT", currentQuarter['STATUS'])
Upvotes: 0
Views: 38
Reputation: 88226
You could check which dates in STATUS_CHANGE_DATE
are greater than today's, and use the result to index the dataframe and set the STATUS
to CURRRENT
accordingly:
df['STATUS_CHANGE_DATE'] = pd.to_datetime(df.STATUS_CHANGE_DATE)
df.loc[df.STATUS_CHANGE_DATE.gt(referencePeriodStartDate), 'STATUS'] = 'CURRENT'
CODE SIP STATUS STATUS_CHANGE_DATE
0 AA YES CURRENT 2019-01-01
1 AA YES LEAVER 2019-02-05
2 AA YES CURRENT 2019-09-03
3 AA YES JOINER 2019-01-05
4 AA YES CURRENT 2019-03-08
Upvotes: 2