Reputation: 564
How to calculate using pandas weeks between two dates such 2019-12-15 and 2019-12-28
Data:
cw = pd.DataFrame({ "lead_date" : ["2019-12-28" , "2019-12-23"] ,
"Received_date" : ["2019-12-15" , "2019-12-21" ] })
So I could do something like
cw["weeks_between"]= ( cw["lead_date"] - cw["Received_date"]) / 7
The problem is..
For row 1:
it will return 1.85, but is wrong value because one day starts in on beginning of week Vs End of week
For row 2: It will return 0.28, but also wrong because one day starts end of week Vs beginning of week.
-
Upvotes: 1
Views: 823
Reputation: 471
You need to use convert to datetime using pandas
import pandas as pd
import numpy as np
df = pd.DataFrame({ "lead_date" : ["2019-12-28" , "2019-12-23"] ,
"Received_date" : ["2019-12-15" , "2019-12-21" ] })
df['lead_date']=pd.to_datetime(df['lead_date'])
df['Received_date']=pd.to_datetime(df['Received_date'])
Here is the difference in days between "lead_date" and "Received_date"
df['time_between'] =df['lead_date']-df['Received_date']
print(df.head())
lead_date Received_date time_between
0 2019-12-28 2019-12-15 13 days
1 2019-12-23 2019-12-21 2 days
Update: edits below to get number of weeks. Also added import pandas and numpy.
To get 'time_between' column in weeks:
df['time_between']= df['time_between']/np.timedelta64(1,'W')
will yield
lead_date Received_date time_between
0 2019-12-28 2019-12-15 1.857143
1 2019-12-23 2019-12-21 0.285714
Update 2: If you want week number subtractions and not days between then use:
df['lead_date']=pd.to_datetime(df['lead_date']).dt.week
df['Received_date']=pd.to_datetime(df['Received_date']).dt.week
df['time_between'] =df['lead_date']-df['Received_date']
yields,
lead_date Received_date time_between
0 52 50 2
1 52 51 1
.dt.week returns week number in the year.
Upvotes: 2
Reputation: 42886
list comprehension
, dt.period
& getattr
provided by Jon Clements
in comments
This method will work when years change between the compared dates:
cw['weeks_diff'] = (
[getattr(el, 'n', 0)
for el in cw['lead_date'].dt.to_period('W') - cw['Received_date'].dt.to_period('W')]
)
dt.strftime('%W')
We can use pd.to_datetime
to convert your dates to datetime. Then we use the dt.strftime
accessor to get the weeknumbers with %W
.
Finally we substract both weeknumbers:
weeks = (cw[['lead_date', 'Received_date']]
.apply(lambda x: pd.to_datetime(x).dt.strftime('%W'))
.replace('NaT', 0)
.astype(int)
)
cw['weeks_diff'] = weeks['lead_date'] - weeks['Received_date']
lead_date Received_date weeks_diff
0 2019-12-28 2019-12-15 2
1 2019-12-23 2019-12-21 1
Upvotes: 3