Monthly climatology across several years, repeated for each day in that month over all years

Question

I need to find the monthly climatology of some data that has daily values across several years. The code below sufficiently summarizes what I am trying to do. monthly_mean holds the averages over all years for specific months. I then need to assign that average in a new column for each day in a specific month over all of the years. For whatever reason, my assignment, df['A Climatology'] = group['A Climatology'], is only assigning values to the month of December. How can I make the assignment happen for all months?

import numpy as np
import pandas as pd

data = np.random.randint(5,30,size=(365*3,3))
df = pd.DataFrame(data, columns=['A', 'B', 'C'], index=pd.date_range('2021-01-01', periods=365*3))
df['A Climatology'] = np.nan

monthly_mean = df['A'].groupby(df.index.month).mean()
for month, group in df.groupby(df.index.month):
    group['A Climatology'] = monthly_mean.loc[month]
    # Problem line: this replaces the whole column on every iteration
    df['A Climatology'] = group['A Climatology']

df

YoungTim · Accepted Answer

Inside the loop, df['A Climatology'] = group['A Climatology'] overwrites the entire column on every iteration. pandas aligns the assignment on the index, so every row outside the current group becomes NaN. After the loop finishes, only the last group, December, still holds values.
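As a rough illustration of that alignment behavior, here is a minimal sketch on a four-row frame. The names demo and feb_only and the values in them are made up for the example and do not come from the question.

```python
import pandas as pd

# Hypothetical frame: two days in January, two in February.
idx = pd.to_datetime(['2021-01-01', '2021-01-02', '2021-02-01', '2021-02-02'])
demo = pd.DataFrame({'A': [10, 20, 30, 50]}, index=idx)

# Pretend an earlier loop iteration had already filled the column.
demo['A Climatology'] = 15.0

# A Series that only covers the February rows, like `group` in the loop.
feb_only = pd.Series(40.0, index=idx[2:])

# Whole-column assignment aligns on the index: rows missing from
# feb_only become NaN, so the January values set earlier are lost.
demo['A Climatology'] = feb_only

print(demo['A Climatology'].isna().tolist())  # [True, True, False, False]
```

The same thing happens twelve times in the question's loop, with each month wiping out the one before it.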

monthly_mean = df['A'].groupby(df.index.month).mean()
for month, mean in monthly_mean.items():
    # Assign only to the rows in this month; all other rows are left untouched
    df.loc[df.index.month == month, 'A Climatology'] = mean

Instead, assign directly into df, restricted to the rows whose month equals the current loop month, as the loop above does.
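If the explicit loop is not required, the same column can probably be built in one step with groupby(...).transform('mean'), which returns one value per original row rather than one per group. This is a different technique from the loop in the answer, sketched here on the question's own setup. The seeded generator rng is an addition so the example is repeatable.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, but seeded (assumption) for repeatability.
rng = np.random.default_rng(0)
data = rng.integers(5, 30, size=(365 * 3, 3))
df = pd.DataFrame(data, columns=['A', 'B', 'C'],
                  index=pd.date_range('2021-01-01', periods=365 * 3))

# transform('mean') broadcasts each month's mean back onto every row of
# that month, so the result lines up with df's index and no loop is needed.
df['A Climatology'] = df['A'].groupby(df.index.month).transform('mean')

# Every day in any January (2021, 2022, 2023) now carries the same value.
print(df.loc[df.index.month == 1, 'A Climatology'].nunique())  # 1
```

Because the result already has the same index as df, there is no alignment step that could blank out other months.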
