Adding a column in a multi-indexed dataframe

Question

I have a multi-indexed dataframe, where the left-most index is NBA Player, and the second-level index is NBA Season (e.g. 2018-19). I'd like to add a column that numbers each player's seasons. For example, in the head of the dataframe below, I'd like to add a column next to Season that lists A.J. Guyton's 2000-01 season as '1' and his 2001-02 season as '2'. The numbering would then restart for the next player, and so on throughout the dataframe.

                     Age   Tm  OBPM  BPM  DBPM
Player      Season                            
A.J. Guyton 2000-01   22  CHI -0.57 -2.8  -2.1
            2001-02   23  CHI -0.80 -3.4  -2.4
A.J. Price  2009-10   23  IND -0.75 -2.2  -1.1
            2010-11   24  IND -1.51 -3.1  -1.0
            2011-12   25  IND -0.35 -2.2  -1.4

I'm new to pandas and relatively new to Python altogether, so this is likely a simple question but I'm not sure how to even approach it since every player's start year is different.

David Nehme · Accepted Answer

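You can use the split/apply/combine pattern with groupby and cumcount. cumcount behaves like a transform: it returns a Series with the same index as the original dataframe, in contrast to an aggregation (like mean), which returns one value per group. Note that cumcount numbers the rows of each group starting from 0, so add 1 to the result if you want each player's first season to be 1.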

df['career_year'] = df.groupby(level='Player').cumcount()

With your data, this will give

                     Age   Tm  OBPM  BPM  DBPM  career_year
Player      Season                                         
A.J. Guyton 2000-01   22  CHI -0.57 -2.8  -2.1            0
            2001-02   23  CHI -0.80 -3.4  -2.4            1
A.J. Price  2009-10   23  IND -0.75 -2.2  -1.1            0
            2010-11   24  IND -1.51 -3.1  -1.0            1
            2011-12   25  IND -0.35 -2.2  -1.4            2
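If you want the numbering to start at 1, as the question asks, one option is to add 1 to the cumcount result. The sketch below rebuilds the five sample rows from the question so it can be run on its own. It assumes the seasons should be counted in index order, so it sorts the index first; that step may be unnecessary if your data is already ordered. The name seasons_per_player is only illustrative and shows the contrast with an aggregation.

```python
import pandas as pd

# Rebuild the sample rows from the question (values copied from the post).
df = pd.DataFrame(
    {
        "Player": ["A.J. Guyton", "A.J. Guyton", "A.J. Price", "A.J. Price", "A.J. Price"],
        "Season": ["2000-01", "2001-02", "2009-10", "2010-11", "2011-12"],
        "Age": [22, 23, 23, 24, 25],
        "Tm": ["CHI", "CHI", "IND", "IND", "IND"],
        "OBPM": [-0.57, -0.80, -0.75, -1.51, -0.35],
        "BPM": [-2.8, -3.4, -2.2, -3.1, -2.2],
        "DBPM": [-2.1, -2.4, -1.1, -1.0, -1.4],
    }
).set_index(["Player", "Season"])

# Assumption: seasons should be numbered in chronological order, so sort
# by (Player, Season) first. cumcount numbers rows in the order they appear.
df = df.sort_index()

# cumcount is zero-based; adding 1 makes each player's first season 1.
df["career_year"] = df.groupby(level="Player").cumcount() + 1

# For contrast, an aggregation returns one value per player rather than
# one value per row (illustrative name, not part of the original answer).
seasons_per_player = df.groupby(level="Player").size()

print(df)
print(seasons_per_player)
```

Here career_year has one entry per row of df, while seasons_per_player has one entry per player.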
