Reputation: 25
I am trying to do a calendar lag to a time series data which i am trying to analyse. • For x month lags, return the value x number of months ago. (3 month lag from 17th May would be data on the 17th Feb).
Is there any solution available directly in pandas.
df_list = pd.date_range(start=min_date, end=max_date,freq='D').to_frame(index=False)
df_list.columns = ['name']
df_list.set_index('name',inplace = True)
df = df_list.reindex(df_list.index - pd.to_timedelta(30, unit='d'),
method='nearest')
This code is not able to return correctly for 01-march with 1 month lag which should ideally be 1-Feb.
And Mar 31, 30, 29, 28 should be Feb 28 (all using a 1-month lag).
Upvotes: 2
Views: 845
Reputation: 59549
This is pd.offsets.DateOffset(months=1)
. It even gets leap years correct. It makes use of dateutil.relavitedelta
logic, which does the date subtraction as you need.
import pandas as pd
dates = pd.to_datetime(['2018-03-01', '2018-03-31', '2018-03-30',
'2018-03-29', '2018-03-28', '2016-03-31'])
df = pd.DataFrame({'dates': dates})
# dates
#0 2018-03-01
#1 2018-03-31
#2 2018-03-30
#3 2018-03-29
#4 2018-03-28
#5 2016-03-31
df.dates - pd.offsets.DateOffset(months=1)
#0 2018-02-01
#1 2018-02-28
#2 2018-02-28
#3 2018-02-28
#4 2018-02-28
#5 2016-02-29
#Name: dates, dtype: datetime64[ns]
Upvotes: 1