Dmitry Sokolov

Reputation: 1383

Reset index after groupby operation

by_month = df_omsk_last_year.groupby(df_omsk_last_year.index.month, as_index=False).agg({'T': ['mean', 'min', 'max']})
by_month = by_month.reset_index()
by_month = by_month.rename(columns={'mean':'mean__'})
by_month.info()
by_month['mean__']

I get a KeyError on the last line, of course.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 4 columns):
(index, )      12 non-null int64
(T, mean__)    12 non-null float64
(T, min)       12 non-null float64
(T, max)       12 non-null float64
dtypes: float64(3), int64(1)
memory usage: 464.0 bytes

What should I do? I have tried many approaches.

The index is datetime and T is float.

Upvotes: 0

Views: 798

Answers (1)

jezrael

Reputation: 862641

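The problem is that the columns are a MultiIndex with T as the first level, so the rename produces ('T', 'mean__') rather than a flat column named mean__, and by_month['mean__'] raises a KeyError. You can prevent this by selecting the column after groupby, before aggregating: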

import pandas as pd

df_omsk_last_year = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'T': [7, 8, 9, 4, 2, 3],
}, index=pd.date_range('2015-01-01', periods=6, freq='10D'))

# Select 'T' before aggregating so the result has flat column names.
by_month = (df_omsk_last_year.groupby(df_omsk_last_year.index.month.rename('month'))['T']
                             .agg(['mean', 'min', 'max'])
                             .rename(columns={'mean': 'mean__'})
                             .reset_index())
print(by_month)
   month  mean__  min  max
0      1     7.0    4    9
1      2     2.5    2    3

Or use named aggregation (available since pandas 0.25), which sets the output column names directly:

by_month = (df_omsk_last_year.groupby(df_omsk_last_year.index.month)
                             .agg(mean__=('T', 'mean'),
                                  min__=('T', 'min'),
                                  max__=('T', 'max'))
                             .reset_index())
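
If you want to keep the original dict-style agg call, here is a minimal sketch of why the lookup fails and one possible way to flatten the columns afterwards. It uses the same toy data as above; the names multi and flat are only illustrative and do not come from the question.

```python
import pandas as pd

# Same toy data as in the answer above (only the 'T' column is needed here).
df = pd.DataFrame(
    {"T": [7, 8, 9, 4, 2, 3]},
    index=pd.date_range("2015-01-01", periods=6, freq="10D"),
)

# A dict-of-lists aggregation yields two-level columns: ('T', 'mean'), ...
multi = df.groupby(df.index.month).agg({"T": ["mean", "min", "max"]})
is_multi = isinstance(multi.columns, pd.MultiIndex)

# With two-level columns, a single column is addressed by its full tuple.
mean_by_tuple = multi[("T", "mean")]

# Dropping the outer 'T' level leaves flat labels that rename() can target.
flat = multi.droplevel(0, axis=1).rename(columns={"mean": "mean__"})
flat_columns = list(flat.columns)
```

After the flattening step, flat['mean__'] works as a normal column lookup. Selecting ['T'] up front, as shown earlier, avoids the extra level in the first place and is usually simpler.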

Upvotes: 2
