Reputation: 1297
Suppose I have a DataFrame:
import pandas as pd

df = pd.DataFrame({'DATE_1': ['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07'],
                   'DATE_2': ['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']})
df['DATE_1'] = pd.to_datetime(df['DATE_1'])
df['DATE_2'] = pd.to_datetime(df['DATE_2'])
So it looks like:
      DATE_1     DATE_2
0 2010-11-06 2010-12-07
1 2010-10-07 2010-11-06
2 2010-09-07 2010-10-07
3 2010-05-07 2010-08-06
I want to create another column, DIFF, which is the difference between DATE_2 and DATE_1 in days, months, or years.
I want an interface like the one below, because I'll have to create a lot of columns similar to DIFF from a lot of DATE_X columns:
def date_diffrence(x, y, parameter):
    if not np.isnan(x):
        return (x-y)

df['DIFF'] = df.apply(date_diffrence(df['DATE_2'], df['DATE_1']))
According to this post: Difference between map, applymap and apply methods in Pandas, it seems to me that I'm not able to create such a universal interface. Am I right?
Upvotes: 2
Views: 376
Reputation: 863176
It seems you need a function without apply, one that takes Series (columns of df) as arguments and uses dt.days:
def date_diffrence_days(x, y):
    return (x-y).dt.days

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'])
print(df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91
Which is the same as:
df['DIFF'] = (df['DATE_2'] - df['DATE_1']).dt.days
print(df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91
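The reason apply is not needed: subtracting one datetime column from another is already element-wise and yields a timedelta Series, and the .dt accessor then extracts the unit. A minimal sketch on the first two rows of the data above (the variable name delta is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'DATE_1': pd.to_datetime(['2010-11-06', '2010-10-07']),
    'DATE_2': pd.to_datetime(['2010-12-07', '2010-11-06']),
})

# Column-wise subtraction is vectorized; the result is a timedelta Series.
delta = df['DATE_2'] - df['DATE_1']
print(pd.api.types.is_timedelta64_dtype(delta))  # True

# .dt.days pulls the whole-day component out of each timedelta.
print(delta.dt.days.tolist())  # [31, 30]
```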
EDIT:
def date_diffrence_days(x, y, parameter):
    if parameter == 'd':
        # whole days
        return (x-y).dt.days
    elif parameter == 's':
        # total seconds as float
        return (x-y).dt.total_seconds()

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'], 's')
print(df)
      DATE_1     DATE_2       DIFF
0 2010-11-06 2010-12-07  2678400.0
1 2010-10-07 2010-11-06  2592000.0
2 2010-09-07 2010-10-07  2592000.0
3 2010-05-07 2010-08-06  7862400.0
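The question also asks for months and years, which a timedelta cannot express directly because month lengths vary. Below is a rough sketch of one way to extend the same idea, assuming it is acceptable to count calendar boundaries crossed (this ignores the day of month, so 2010-01-31 to 2010-02-01 would count as one month). The function name date_difference, the unit codes, and the pairs mapping are hypothetical and not from the original posts:

```python
import pandas as pd

df = pd.DataFrame({
    'DATE_1': pd.to_datetime(['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07']),
    'DATE_2': pd.to_datetime(['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']),
})

def date_difference(later, earlier, unit='d'):
    """Difference between two datetime Series in the given unit.

    'd' = whole days, 's' = total seconds,
    'm' / 'y' = calendar months / years crossed (day of month ignored).
    """
    if unit == 'd':
        return (later - earlier).dt.days
    if unit == 's':
        return (later - earlier).dt.total_seconds()
    years = later.dt.year - earlier.dt.year
    if unit == 'y':
        return years
    if unit == 'm':
        return years * 12 + (later.dt.month - earlier.dt.month)
    raise ValueError(f"unknown unit: {unit!r}")

# Hypothetical mapping: new column prefix -> (later column, earlier column).
pairs = {'DIFF': ('DATE_2', 'DATE_1')}
for new_col, (later, earlier) in pairs.items():
    df[new_col + '_DAYS'] = date_difference(df[later], df[earlier], 'd')
    df[new_col + '_MONTHS'] = date_difference(df[later], df[earlier], 'm')

print(df)
```

With many DATE_X columns, adding entries to pairs is enough to produce the extra DIFF-style columns, still without apply.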
Upvotes: 1