matthewmturner
matthewmturner

Reputation: 606

Pandas rolling mean by year

I have a dataframe with the following structure:

Index: DatetimeIndex Columns: Client, Business, Balances

My data goes across multiple years. I am trying to get the rolling mean of Balances by year. Meaning I want a variable length window which on Jan 1st of the year will be a window of size 1 and on Dec 31st will be window of size 365. Then the following year it starts over at 1 and builds from there. I am also grouping by the Client and Business.

I have tried the following:

df.groupby(["Client", "Business"]).rolling("AS")["Balances"].mean()

I also tried rolling with "A".

I am getting the error:

ValueError: <YearBegin: month=1> is a non-fixed frequency

From the documentation I thought rolling would accept and offset for variable length window but they year ones im trying arent working.

Whats the right way to do this?

Upvotes: 1

Views: 470

Answers (1)

Nilesh Ingle
Nilesh Ingle

Reputation: 1883

The error is likely because the number of days in a year is not fixed. Alternative could be to use 365d instead as in code below using .expanding() window and pd.Grouper() here. Assuming the expected output looks as below as it is not posted in the question.

# Import libraries
import pandas as pd

# Create DataFrame
dt = [
    '2020-05-29',
    '2020-05-29',
    '2020-01-29',
    '2020-01-29',
    '2020-02-29',
    '2020-03-29',
    '2020-06-29',
    '2020-06-29'
]
df = pd.DataFrame({
    'dt':times
})
df['dt'] = pd.to_datetime(df['dt'])
df['Balances'] = np.random.randint(0,100,8)
df['Client'] = ['A','B','C','A']*2
df['Business'] = ['x','y','z','x']*2


# Groupby
df.groupby([pd.Grouper(key='dt', freq='365d'), "Client", "Business"])['Balances'].expanding().mean()

Output

dfg

dt          Client  Business   
2020-06-29  A       x         0    44.0
                              3    51.0
                              4    63.0
                              7    36.0
            B       y         1    54.0
                              5    13.0
            C       z         2    43.0
                              6    48.0

Upvotes: 1

Related Questions