Reputation: 7458
I have the following df:
code date1 date2
2000 2018-03-21 2018-04-04
2000 2018-03-22 2018-04-05
2000 2018-03-23 2018-04-06
When I tried
df_code_grp_by = df.groupby(['code'])
df_code_grp_by.apply(lambda x: x.date2 - x.date1).dt.days.sum(level=0).reset_index(name='date_diff_sum')
I got
AttributeError: 'DataFrame' object has no attribute 'dt'
date1 and date2 are both dtype('<M8[ns]'). I am wondering how to fix it.
I am using Pandas 0.22.0, Python 3.5.2 and Numpy 1.15.4.
Upvotes: 4
Views: 7278
Reputation: 863791
A better approach here is to set the code column as the index and subtract the two Series:
df = df.set_index('code')
df = (df.date2 - df.date1).dt.days.sum(level=0).reset_index(name='date_diff_sum')
print (df)
code date_diff_sum
0 2000 42
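A note for newer pandas: as far as I know, the level argument of Series.sum was deprecated in pandas 1.3 and removed in 2.0, so the line above may not run there. Here is a minimal sketch of the same idea using groupby(level=0), with the sample frame rebuilt from the question (the names indexed and result are only illustrative):

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "code": [2000, 2000, 2000],
    "date1": pd.to_datetime(["2018-03-21", "2018-03-22", "2018-03-23"]),
    "date2": pd.to_datetime(["2018-04-04", "2018-04-05", "2018-04-06"]),
})

# Move 'code' into the index so the subtraction result stays labelled by code.
indexed = df.set_index("code")

# Subtract the two datetime Series, convert to whole days,
# then sum per index value (the 'code' level).
result = (
    (indexed["date2"] - indexed["date1"])
    .dt.days
    .groupby(level=0)
    .sum()
    .reset_index(name="date_diff_sum")
)
print(result)
```

Each row differs by 14 days, so the single group for code 2000 sums to 42, matching the output shown above.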
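The problem with your code is that apply returns a DataFrame with one row per group instead of a Series (possibly a bug), and a DataFrame has no .dt accessor: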
df_code_grp_by = df.groupby(['code'])
df = df_code_grp_by.apply(lambda x: x.date2 - x.date1)
print (df)
0 1 2
code
2000 1209600000000000 1209600000000000 1209600000000000
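To see why .dt fails, it can help to check the type that apply returns. This is a small sketch using the sample frame from the question (the names wide and is_frame are only illustrative). My understanding is that when every group returns a Series with the same index, which is always the case with a single group, pandas lays the results out as one row per group. The exact values displayed may differ between pandas versions.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "code": [2000, 2000, 2000],
    "date1": pd.to_datetime(["2018-03-21", "2018-03-22", "2018-03-23"]),
    "date2": pd.to_datetime(["2018-04-04", "2018-04-05", "2018-04-06"]),
})

# The lambda returns a Series per group; with one group the results
# are assembled into a wide frame rather than a long Series.
wide = df.groupby("code").apply(lambda x: x.date2 - x.date1)

is_frame = isinstance(wide, pd.DataFrame)
print(type(wide).__name__, wide.shape)

# The .dt accessor exists on Series only, which explains the AttributeError.
print(hasattr(wide, "dt"))
```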
A possible solution is to use np.sum, so that each group is reduced to a single timedelta:
df = (df_code_grp_by.apply(lambda x: np.sum(x.date2 - x.date1))
.dt.days
.reset_index(name='date_diff_sum'))
print (df)
code date_diff_sum
0 2000 42
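Another option, offered here as an alternative sketch rather than as part of the approach above, is to skip apply altogether: compute the day difference for the whole frame first, then group that Series by the code column. The names day_diff and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "code": [2000, 2000, 2000],
    "date1": pd.to_datetime(["2018-03-21", "2018-03-22", "2018-03-23"]),
    "date2": pd.to_datetime(["2018-04-04", "2018-04-05", "2018-04-06"]),
})

# Vectorised subtraction gives a timedelta Series; .dt.days makes it integers.
day_diff = (df["date2"] - df["date1"]).dt.days

# Group the integer Series by the 'code' column and sum within each group.
result = day_diff.groupby(df["code"]).sum().reset_index(name="date_diff_sum")
print(result)
```

Because the subtraction happens once on whole columns, this avoids calling a Python lambda per group and sidesteps the shape that apply returns.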
Upvotes: 2