Reputation: 3782
I have a pandas DataFrame with the following content:
df =
start end
01/April 02/May
12/April 12/April
I need to add a column with the difference (in days) between the end and start values (end - start).
How can I do it?
I tried the following:
import pandas as pd
df.startdate = pd.datetime(df.start, format='%B/%d')
df.enddate = pd.datetime(df.end, format='%B/%d')
But I'm not sure if this is the right direction.
Upvotes: 0
Views: 28
Reputation: 82805
import pandas as pd
df = pd.DataFrame({"start":["01/April", "12/April"], "end": ["02/May", "12/April"]})
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])
df["diff"] = (df["end"] - df["start"])
Output:
end start diff
0 2018-05-02 2018-04-01 31 days
1 2018-04-12 2018-04-12 0 days
Upvotes: 1
Reputation: 164843
One way is to append a year to each string and pass an explicit format to pd.to_datetime:
df['start'] = pd.to_datetime(df['start']+'/2018', format='%d/%B/%Y')
df['end'] = pd.to_datetime(df['end']+'/2018', format='%d/%B/%Y')
df['diff'] = df['end'] - df['start']
# start end diff
# 0 2018-04-01 2018-05-02 31 days
# 1 2018-04-12 2018-04-12 0 days
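The subtraction above yields a timedelta column ("31 days"). If plain integer day counts are wanted, as the question's "in days" suggests, a minimal sketch follows. It assumes every date falls in 2018 (the year used above), that month names are in English, and it uses a hypothetical column name `diff_days`:

```python
import pandas as pd

df = pd.DataFrame({"start": ["01/April", "12/April"],
                   "end": ["02/May", "12/April"]})

# Assumption: all dates belong to 2018, so the year is appended before parsing.
# %d = day of month, %B = full month name (locale-dependent), %Y = 4-digit year.
start = pd.to_datetime(df["start"] + "/2018", format="%d/%B/%Y")
end = pd.to_datetime(df["end"] + "/2018", format="%d/%B/%Y")

# Subtracting two datetime columns gives a timedelta column;
# the .dt.days accessor extracts the whole-day part as integers.
df["diff_days"] = (end - start).dt.days  # "diff_days" is an illustrative name

print(df)
```

Keeping the parsed dates in separate variables leaves the original string columns untouched; assign them back to `df["start"]` and `df["end"]` if the parsed values should replace the strings.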
Upvotes: 1