jenryb

Reputation: 2117

How to select a date range for current and last year in Python

I am trying to select data from a series between two dates. It works using .loc and a boolean mask, but I want it to pick up the current year's and last year's data automatically, without entering the dates manually.

f is the input

f.head(3)
Out[37]: 
0   2011-08-02
1   2011-08-12
2   2011-08-15
Name: receiveddate, dtype: datetime64[ns]

Then my code is

start_date2014 = datetime.datetime(2014, 4, 1)
end_date2014 = datetime.datetime(2015, 3, 31)
mask2014 = (f >= start_date2014) & (f <= end_date2014)
DisputesFY2014 = f.loc[mask2014]
DisputesFY2014 = DisputesFY2014.value_counts()

I was thinking of using pandas and yearend and yearbegin, but I am getting errors in timestamp syntax. I tried:

start_date2015 = pd.tseries.offsets.YearBegin(1)#datetime.datetime(2015, 4, 1)
start_date2015 = start_date2015.to_timestamp

and got AttributeError: 'YearBegin' object has no attribute 'to_timestamp'. Before I added to_timestamp, the error was ValueError: Cannot convert Period to Timestamp unambiguously. Use to_timestamp. I'm guessing there is an easy way to do this that I am completely missing.

Upvotes: 0

Views: 1480

Answers (1)

unutbu

Reputation: 879321

To select all rows from f of the current year and last year:

year = pd.Timestamp.now().year  # pd.datetime was removed in later pandas versions
mask = f.dt.year.isin([year-1, year])
f.loc[mask]

Alternatively, you can obtain the current year using:

In [119]: pd.to_datetime('now').year
Out[119]: 2015
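
To try the year-based mask on a small sample, something like the sketch below should work. The sample dates and the helper name select_years are made up for illustration, and the year is fixed at 2015 only so the result is reproducible; in practice it would come from pd.Timestamp.now().year.

```python
import pandas as pd

# Hypothetical sample data standing in for the real `f` series.
f = pd.Series(
    pd.to_datetime(["2011-08-02", "2014-05-10", "2015-01-15", "2015-08-20"]),
    name="receiveddate",
)

def select_years(dates, years):
    """Keep only the entries whose calendar year is in `years` (illustrative helper)."""
    return dates.loc[dates.dt.year.isin(years)]

year = 2015  # fixed for reproducibility; use pd.Timestamp.now().year in practice
selected = select_years(f, [year - 1, year])

# Number of entries per calendar year, ordered by year.
counts_per_year = selected.dt.year.value_counts().sort_index()
print(counts_per_year)
```

Note that this selects by calendar year, so it does not by itself handle a fiscal year that starts in April.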

Offsets, such as pd.tseries.offsets.YearBegin(1), are used to add or subtract amounts of time from Timestamps:

In [122]: pd.to_datetime('now')
Out[122]: Timestamp('2015-08-20 17:40:59')

In [123]: pd.to_datetime('now') + pd.tseries.offsets.YearBegin(1)
Out[123]: Timestamp('2016-01-01 17:41:04')

The offsets are not themselves dates.
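
Since the ranges in the question start on 1 April, a fiscal-year version might look like the following sketch. It assumes the fiscal year runs from 1 April to 31 March; fiscal_year_start is a hypothetical helper, the sample dates are invented, and "today" is pinned to 2015-08-20 so the result is reproducible. pd.DateOffset(years=1) is used here to shift a Timestamp by one year.

```python
import pandas as pd

def fiscal_year_start(today, start_month=4):
    """First day of the fiscal year containing `today` (hypothetical helper).

    Assumes the fiscal year begins on day 1 of `start_month`.
    """
    year = today.year if today.month >= start_month else today.year - 1
    return pd.Timestamp(year=year, month=start_month, day=1)

# Hypothetical sample data standing in for the real `f` series.
f = pd.Series(
    pd.to_datetime(
        ["2014-03-31", "2014-04-01", "2015-03-31", "2015-04-01", "2016-04-01"]
    ),
    name="receiveddate",
)

today = pd.Timestamp("2015-08-20")  # fixed here; pd.Timestamp.now() in practice
current_start = fiscal_year_start(today)
previous_start = current_start - pd.DateOffset(years=1)
next_start = current_start + pd.DateOffset(years=1)

# Half-open ranges [start, next start) avoid hard-coding 31 March
# and still include timestamps that fall late on the last day.
previous_fy = f.loc[(f >= previous_start) & (f < current_start)]
current_fy = f.loc[(f >= current_start) & (f < next_start)]
```

Calling .value_counts() on previous_fy or current_fy then gives the same kind of counts as DisputesFY2014 in the question.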

Upvotes: 1
