Reputation: 7693
I have a dataframe like as shown below
df1 = pd.DataFrame({'person_id': [11, 21, 31, 41, 51],
'date_1': ['12/30/1961', '05/29/1967', '02/03/1957', '7/27/1959', '01/13/1971'],
'date_2': ['07/23/2017','05/29/2017','02/03/2015',np.nan,np.nan]})
df1 = df1.melt('person_id', value_name='dates')
df1['dates'] = pd.to_datetime(df1['dates'])
df1 = df1.assign(pd= (df1['dates'].dt.dayofyear - 1),
nd=((df1['dates'] + pd.offsets.YearEnd(1)) - df1['dates']).dt.days)
With the help of this post, I am able to achieve part of what I wanted to.
Now what I would like to do is
a) Shift the dates backward (subtract) based on minimum of pd value
for each subject
b) Shift the dates forward (add) based on minimum of nd value
for each subject
c) Check whether the year
component remains the same between 3 columns dates
, shift_forward
and shift_backward
So, I got the minimum of pd and nd values
using the code below
min_pd = df1.groupby(['person_id'])['pd'].min()
min_nd = df1.groupby(['person_id'])['nd'].min()
year_change = df1.dates.dt.year.eq(df1.shift_backward.dt.year(df1.shift_forward.dt.year))
but not sure how can I use this min_pd
and min_nd
as date offset for each subject.
I expect my output to be like as shown below
Upvotes: 1
Views: 54
Reputation: 75080
you can first do a transform
min with groupby and then your calculation would be easy with pd.to_timedelta
:
#get groupby min transformed to the length of the dataframe
min_pd_nd = df1.groupby('person_id')['pd','nd'].transform('min')
df1['Shift_backward'] = df1['dates'].sub(pd.to_timedelta(min_pd_nd['pd'],unit='d'))
df1['Shift_Forward'] = df1['dates'].add(pd.to_timedelta(min_pd_nd['nd'],unit='d'))
#check if in a given row all year are unique
c = (df1[['dates','Shift_backward','Shift_Forward']].stack(dropna=False).dt.year
.groupby(level=0).nunique())
df1['year_change'] = np.where(c.gt(1),'Yes','No')
print(df1[['person_id','dates','Shift_backward','Shift_Forward','year_change']])
person_id dates Shift_backward Shift_Forward year_change
0 11 1961-12-30 1961-06-10 1961-12-31 No
1 21 1967-05-29 1967-01-01 1967-12-31 No
2 31 1957-02-03 1957-01-01 1957-12-31 No
3 41 1959-07-27 1959-01-01 1959-12-31 No
4 51 1971-01-13 1971-01-01 1971-12-31 No
5 11 2017-07-23 2017-01-01 2017-07-24 No
6 21 2017-05-29 2017-01-01 2017-12-31 No
7 31 2015-02-03 2015-01-01 2015-12-31 No
8 41 NaT NaT NaT No
9 51 NaT NaT NaT No
Upvotes: 1