Reputation: 237
I need to group by week; however, a week like this one (the first week of the year) spans two years, 2018 and 2019.
Typically I would do the following:
df.groupby([df.DATE.dt.year, df.DATE.dt.week]).sum()
which splits that single week into two separate groups in the output. I am sure I could brute-force it with IF statements, but I am wondering whether there is a cleaner way to group by week across these year transitions.
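For reference, here is a minimal sketch of the split, using made-up data (the VALUE column and the date range are assumptions for illustration; on older pandas, .dt.isocalendar().week can be written as .dt.week):

import pandas as pd

# Days from Monday 2018-12-31 through Sunday 2019-01-06 all belong to ISO week 1.
df = pd.DataFrame({'DATE': pd.date_range('2018-12-31', '2019-01-06', freq='D'),
                   'VALUE': 1})

# Grouping on calendar year + week puts 2018-12-31 into group (2018, 1) and the
# remaining six days into (2019, 1), i.e. one week becomes two groups.
print(df.groupby([df.DATE.dt.year, df.DATE.dt.isocalendar().week]).VALUE.sum())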
Upvotes: 4
Views: 1066
Reputation: 323226
Or just use resample:
df.set_index('Date').resample('W-SUN').Data.mean()
Date
2018-12-30 1.000000
2019-01-06 1.833333
Freq: W-SUN, Name: Data, dtype: float64
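A self-contained sketch of this approach, assuming a small Date/Data frame like the one built in the answer below (the column names and values are taken from that example):

import pandas as pd

# Eight days of 1s (2018-12-24 .. 2018-12-31) followed by five days of 2s.
df = pd.DataFrame({'Date': pd.date_range('2018-12-24', '2019-01-05', freq='D'),
                   'Data': [1] * 8 + [2] * 5})

# 'W-SUN' bins by weeks ending on Sunday; each label is that week's Sunday.
weekly = df.set_index('Date').resample('W-SUN').Data.mean()
print(weekly)
# 2018-12-30    1.000000
# 2019-01-06    1.833333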
Upvotes: 0
Reputation: 109546
You can convert the dates to pandas Period objects, and then group on them.
df = pd.DataFrame(
{'Date': pd.date_range(start='2018-12-24', end='2019-01-05', freq='D'),
'Data': [1] * 8 + [2] * 5})
>>> df
Date Data
0 2018-12-24 1
1 2018-12-25 1
2 2018-12-26 1
3 2018-12-27 1
4 2018-12-28 1
5 2018-12-29 1
6 2018-12-30 1
7 2018-12-31 1
8 2019-01-01 2
9 2019-01-02 2
10 2019-01-03 2
11 2019-01-04 2
12 2019-01-05 2
>>> (df
.assign(period=pd.PeriodIndex(df['Date'], freq='W-SUN'))  # Weekly periods ending Sundays.
.groupby('period')['Data'].mean())
period
2018-12-24/2018-12-30 1.000000
2018-12-31/2019-01-06 1.833333 # (1 * 1 + 2 * 5) / 6 = 1.833
Freq: W-SUN, Name: Data, dtype: float64
Note that there are only six days in the final period in the example above.
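If plain timestamps are preferred over Period labels on the result, the PeriodIndex can be converted back; a short sketch continuing from the df above (the weekly name is just for illustration):

weekly = (df
          .assign(period=pd.PeriodIndex(df['Date'], freq='W-SUN'))
          .groupby('period')['Data'].mean())

# The default how='start' labels each weekly period by its starting Monday.
weekly.index = weekly.index.to_timestamp()
print(weekly)
# 2018-12-24    1.000000
# 2018-12-31    1.833333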
Upvotes: 4