Reputation: 237
I need to group by week; however, a week like this one (the first week of the year) spans two years, 2018 and 2019.
Typically I would do the following:
df.groupby([df.DATE.dt.year, df.DATE.dt.week]).sum()
which splits that single week into two separate groups in the output. I am sure I could brute-force it with IF statements, but I am wondering whether there is a cleaner way to group by week across these year transitions.
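For reference, here is a minimal sketch of the split, using made-up data (the VALUE column and the date range are assumptions for illustration; on older pandas, .dt.isocalendar().week can be written as .dt.week):

import pandas as pd

# Days from Monday 2018-12-31 through Sunday 2019-01-06 all belong to ISO week 1.
df = pd.DataFrame({'DATE': pd.date_range('2018-12-31', '2019-01-06', freq='D'),
                   'VALUE': 1})

# Grouping on calendar year + week puts 2018-12-31 into group (2018, 1) and the
# remaining six days into (2019, 1), i.e. one week becomes two groups.
print(df.groupby([df.DATE.dt.year, df.DATE.dt.isocalendar().week]).VALUE.sum())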
Upvotes: 4
Views: 1066
Reputation: 323226
Or just use resample:
df.set_index('Date').resample('W-SUN').Data.mean()
Date
2018-12-30 1.000000
2019-01-06 1.833333
Freq: W-SUN, Name: Data, dtype: float64
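A self-contained sketch of this approach, assuming a small Date/Data frame like the one built in the answer below (the column names and values are taken from that example):

import pandas as pd

# Eight days of 1s (2018-12-24 .. 2018-12-31) followed by five days of 2s.
df = pd.DataFrame({'Date': pd.date_range('2018-12-24', '2019-01-05', freq='D'),
                   'Data': [1] * 8 + [2] * 5})

# 'W-SUN' bins by weeks ending on Sunday; each label is that week's Sunday.
weekly = df.set_index('Date').resample('W-SUN').Data.mean()
print(weekly)
# 2018-12-30    1.000000
# 2019-01-06    1.833333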
Upvotes: 0
Reputation: 109546
You can convert the dates to pandas Period objects, and then group on them.
df = pd.DataFrame(
{'Date': pd.date_range(start='2018-12-24', end='2019-01-05', freq='D'),
'Data': [1] * 8 + [2] * 5})
>>> df
Date Data
0 2018-12-24 1
1 2018-12-25 1
2 2018-12-26 1
3 2018-12-27 1
4 2018-12-28 1
5 2018-12-29 1
6 2018-12-30 1
7 2018-12-31 1
8 2019-01-01 2
9 2019-01-02 2
10 2019-01-03 2
11 2019-01-04 2
12 2019-01-05 2
>>> (df
.assign(period=pd.PeriodIndex(df['Date'], freq='W-SUN'))  # Weekly periods ending Sundays.
.groupby('period')['Data'].mean())
period
2018-12-24/2018-12-30 1.000000
2018-12-31/2019-01-06 1.833333 # (1 * 1 + 2 * 5) / 6 = 1.833
Freq: W-SUN, Name: Data, dtype: float64
Note that there are only six days in the final period in the example above.
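If plain timestamps are preferred over Period labels on the result, the PeriodIndex can be converted back; a short sketch continuing from the df above (the weekly name is just for illustration):

weekly = (df
          .assign(period=pd.PeriodIndex(df['Date'], freq='W-SUN'))
          .groupby('period')['Data'].mean())

# The default how='start' labels each weekly period by its starting Monday.
weekly.index = weekly.index.to_timestamp()
print(weekly)
# 2018-12-24    1.000000
# 2018-12-31    1.833333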
Upvotes: 4