Amir
Amir

Reputation: 1087

Multi index series into pandas dataframe

I have a pandas series with two indices:

df_agg=df.groupby(['yearID','teamID']).sum()['Salary']
df_agg.head()

yearID  teamID
1985    ATL       14807000
        BAL       11560712
        BOS       10897560
        CAL       14427894
        CHA        9846178

I want to convert int into a pandas data frame like

yearID     teamID    Salary
1985        ATL       14807000
1985        BAL       11560712
1985        BOS       10897560
1985        CAL       14427894
1985        CHA        9846178

I tired to use:

df_new=df_agg.reset_index(inplace=True)

but I got the following error:


TypeError Traceback (most recent call last) in () ----> 1 df_new=df_agg.reset_index(inplace=True)

C:\Users\ameimand\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\series.py in reset_index(self, level, drop, name, inplace) 966

    index=new_index).__finalize__(self)
    967         elif inplace:
--> 968             raise TypeError('Cannot reset_index inplace on a Series '
    969                             'to create a DataFrame')
    970         else:

TypeError: Cannot reset_index inplace on a Series to create a DataFrame

Upvotes: 6

Views: 14232

Answers (2)

Nikolai Zaitsev
Nikolai Zaitsev

Reputation: 323

You need to convert Series to DataFrame, first. I think, this will work:

df_new = df_agg.to_frame().reset_index()

Also, inplace=True flag is used to modify DataFrame without assignment, like:

df_new = df_agg.to_frame()
df_new.reset_index(inplace=True)

Upvotes: 0

jezrael
jezrael

Reputation: 862911

I think there are 2 nice solutions parameter as_index=False:

df_new = df.groupby(['yearID','teamID'], as_index=False)['Salary'].sum()

Or reset_index without inplace=True:

df_new = df.groupby(['yearID','teamID'])['Salary'].sum().reset_index()

Notice:

Better is specify column for aggregation after groupby in [] like ['Salary']:

df.groupby(['yearID','teamID'], as_index=False)['Salary']

as:

df.groupby(['yearID','teamID']).sum()['Salary']

because this aggregate all columns and then select only Salary.

Upvotes: 9

Related Questions