Reputation: 6524
I have a Series, called 'scores', with a datetime index.
I wish to subset it by quarter
and year
pseudocode: series.loc['q2 of 2013']
Attempts so far:
s.dt.quarter
AttributeError: Can only use .dt accessor with datetimelike values
s.index.dt.quarter
AttributeError: 'DatetimeIndex' object has no attribute 'dt'
This works (inspired by this answer), but I can't believe it is the right way to do this in Pandas:
d = pd.DataFrame(s)
d['date'] = pd.to_datetime(d.index)
d.loc[(d['date'].dt.quarter == 2) & (d['date'].dt.year == 2013)]['scores']
I expect there is a way to do this without transforming into a dataset, forcing the index into datetime, and then getting a Series from it.
What am I missing, and what is the elegant way to do this on a Pandas series?
Upvotes: 5
Views: 7837
Reputation: 10335
For future-comers, just skip the .dt
with a DateTimeIndex
and use s.quarter
instead of s.dt.quarter
. The other answers are way too long for this.
Upvotes: 0
Reputation: 13913
import numpy as np
import pandas as pd
index = pd.date_range('2013-01-01', freq='M', periods=12)
s = pd.Series(np.random.rand(12), index=index)
print(s)
# 2013-01-31 0.820672
# 2013-02-28 0.994890
# 2013-03-31 0.928376
# 2013-04-30 0.848532
# 2013-05-31 0.122263
# 2013-06-30 0.305741
# 2013-07-31 0.088432
# 2013-08-31 0.647288
# 2013-09-30 0.640308
# 2013-10-31 0.737139
# 2013-11-30 0.233656
# 2013-12-31 0.245214
# Freq: M, dtype: float64
d = pd.Series(s.index, index=s.index)
quarter = d.dt.quarter.astype(str) + 'Q' + d.dt.year.astype(str)
print(quarter)
# 2013-01-31 1Q2013
# 2013-02-28 1Q2013
# 2013-03-31 1Q2013
# 2013-04-30 2Q2013
# 2013-05-31 2Q2013
# 2013-06-30 2Q2013
# 2013-07-31 3Q2013
# 2013-08-31 3Q2013
# 2013-09-30 3Q2013
# 2013-10-31 4Q2013
# 2013-11-30 4Q2013
# 2013-12-31 4Q2013
# Freq: M, dtype: object
print(s[quarter == '1Q2013'])
# 2013-01-31 0.124398
# 2013-02-28 0.052828
# 2013-03-31 0.126374
# Freq: M, dtype: float64
If you don't want to create a new Series that holds a label for each quarter (e.g., if you are subsetting just once), you could even do
print(s[(s.index.quarter == 1) & (s.index.year == 2013)])
# 2013-01-31 0.124398
# 2013-02-28 0.052828
# 2013-03-31 0.126374
# Freq: M, dtype: float64
Upvotes: 3
Reputation: 294218
If you know the the year and quarter, say Q2 2013, then you can do this:
s['2013-04':'2013-06']
Wrap it up into a function:
qmap = pd.DataFrame([
('01', '03'), ('04', '06'), ('07', '09'), ('10', '12')
], list('1234'), list('se')).T
def get_quarter(df, year, quarter):
s, e = qmap[str(quarter)]
y = str(year)
s = y + '-' + s
e = y + '-' + e
return df[s:e]
and call it:
get_quarter(s, 2013, 2)
suppose s
is:
s = pd.Series(range(32), pd.date_range('2011-01-01', periods=32, freq='Q'))
Then I get:
2013-03-31 8
Freq: Q-DEC, dtype: int64
Upvotes: 0
Reputation: 7828
Suppose you have a dataframe like this:
sa
Out[28]:
0
1970-01-31 1
1970-02-28 2
1970-03-31 3
1970-04-30 4
1970-05-31 5
1970-06-30 6
1970-07-31 7
1970-08-31 8
1970-09-30 9
1970-10-31 10
1970-11-30 11
1970-12-31 12
If the index is datetime then you can get the quarter as sa.index.quarter
:
sa.index.quarter
Out[30]: array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
Upvotes: 3