Reputation:
I'm trying to update date values based on multiple conditions and my current code doesn't change the date values at all. My data looks like this:
ID StartDate EndDate Type Prev_EndDate Prev_ID
0 12781 2017-09-25 2017-11-09 A NaN NaN
1 12781 2017-11-10 2017-12-08 A 2017-11-09 12781
2 12781 2017-11-10 2018-07-06 B 2017-12-08 12781
3 12781 2018-07-07 2018-09-14 B 2018-07-06 12781
If you look at row 1 and row 2, there is an overlap in the timeline and I want to update the StartDate value to get rid of the overlap. But I want to update the start date only when the Type is "B" and the time range is overlapping for the same ID. That's why I put extra conditions such as df['ID'] == df['Prev_ID']
My current code is here:
if ((df['ID'] == df['Prev_ID']).all().all() and (df['StartDate'] <= df['prev_EndDate']).all().all() and (df['Position Type'] == 'B').all().all()):
df['StartDate'] = pd.DatetimeIndex(df['prev_EndDate']) + pd.DateOffset(1)
The data type of Start Date, End Date, and Prev_EndDate is datetime but my code is not making any changes. I'm not sure about my boolean options (e.g. all().all()) and wonder if that's why my code is not working. How do I make this work?
Upvotes: 0
Views: 48
Reputation: 115
I think this can be resolved using .loc
df.loc[ condition, 'StartDate']=pd.DatetimeIndex(df['prev_EndDate']) + pd.DateOffset(1)
Upvotes: 1