Reputation: 2117
I am trying to select data from a Series between two dates. It works using .loc and a boolean mask, but I want it to pick up the current year's and last year's data automatically, without entering the dates manually.
f is the input:
f.head(3)
Out[37]:
0 2011-08-02
1 2011-08-12
2 2011-08-15
Name: receiveddate, dtype: datetime64[ns]
Then my code is
import datetime
start_date2014 = datetime.datetime(2014, 4, 1)
end_date2014 = datetime.datetime(2015, 3, 31)  # must fall after start_date2014, otherwise the mask selects nothing
mask2014 = (f >= start_date2014) & (f <= end_date2014)
DisputesFY2014 = f.loc[mask2014]
DisputesFY2014 = DisputesFY2014.value_counts()
I was thinking of using the pandas YearEnd and YearBegin offsets, but I am getting errors when I try to turn them into timestamps. I tried:
start_date2015 = pd.tseries.offsets.YearBegin(1)#datetime.datetime(2015, 4, 1)
start_date2015 = start_date2015.to_timestamp
and got AttributeError: 'YearBegin' object has no attribute 'to_timestamp'
Before I added to_timestamp, the error was ValueError: Cannot convert Period to Timestamp unambiguously. Use to_timestamp
I'm guessing that there is an easy way to do this that I am completely missing.
Upvotes: 0
Views: 1480
Reputation: 879321
To select all rows of f from the current year and last year:
year = pd.Timestamp.now().year  # pd.datetime was removed in pandas 2.0
mask = f.dt.year.isin([year-1, year])
f.loc[mask]
Alternatively, you can obtain the current year using:
In [119]: pd.to_datetime('now').year
Out[119]: 2015
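The mask above works on calendar years. If you need fiscal years starting 1 April, as the dates in the question suggest, one possible sketch is to label each date with its fiscal year and filter on that label. The helper names fiscal_year and current_and_last_fy, the sample dates, and the convention that a fiscal year is labelled by the calendar year in which it starts are all assumptions made for illustration, not something stated in the question.

```python
import pandas as pd

# Hypothetical sample standing in for the question's series `f`.
f = pd.Series(
    pd.to_datetime(
        ["2013-03-31", "2014-04-01", "2015-03-31", "2015-04-01", "2015-08-12"]
    ),
    name="receiveddate",
)


def fiscal_year(dates, start_month=4):
    """Label each date with the calendar year in which its fiscal year starts."""
    # Dates before the start month belong to the fiscal year that began
    # in the previous calendar year.
    return dates.dt.year - (dates.dt.month < start_month).astype(int)


def current_and_last_fy(dates, today=None, start_month=4):
    """Keep only dates in the current or the previous fiscal year."""
    today = pd.Timestamp.now() if today is None else pd.Timestamp(today)
    this_fy = today.year - int(today.month < start_month)
    return dates[fiscal_year(dates, start_month).isin([this_fy - 1, this_fy])]


# `today` is pinned here so the example is reproducible; omit it to use the clock.
selected = current_and_last_fy(f, today="2015-08-20")
counts = selected.value_counts()
```

Passing today explicitly is only there to make the result repeatable; in real use you would leave it out so the current date is picked up automatically.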
Offsets, such as pd.tseries.offsets.YearBegin(1), are used to add or subtract amounts of time from Timestamps:
In [122]: pd.to_datetime('now')
Out[122]: Timestamp('2015-08-20 17:40:59')
In [123]: pd.to_datetime('now') + pd.tseries.offsets.YearBegin(1)
Out[123]: Timestamp('2016-01-01 17:41:04')
The offsets are not themselves dates.
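To get actual dates out of an offset, you combine it with a Timestamp. Here is a minimal sketch that builds calendar-year boundaries this way and uses them in a mask like the one in the question. The sample series and the pinned value of today are made up for illustration.

```python
import pandas as pd
from pandas.tseries.offsets import YearBegin

# Pinned so the example is reproducible; in practice something like
# pd.Timestamp.now().normalize() would take its place.
today = pd.Timestamp("2015-08-20")

# rollback() snaps a Timestamp to the most recent 1 January
# (and leaves it unchanged if it already is 1 January).
start_this_year = YearBegin().rollback(today)
start_last_year = start_this_year - YearBegin(1)
start_next_year = start_this_year + YearBegin(1)

# Hypothetical sample standing in for the question's series `f`.
f = pd.Series(
    pd.to_datetime(["2013-12-31", "2014-01-01", "2015-08-12", "2016-01-01"])
)

# Half-open interval: from 1 January last year up to, but excluding, 1 January next year.
mask = (f >= start_last_year) & (f < start_next_year)
selected = f.loc[mask]
```

Using a half-open interval avoids having to pick an end-of-day time for 31 December. As far as I know YearBegin also accepts a month argument (for example YearBegin(month=4)), which should anchor the boundaries on 1 April if fiscal years are needed, but check that against your pandas version.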
Upvotes: 1