Unique IDs for sorted groups of indices

Question

I have something like the following DataFrame where I have data points at 2 locations in 4 seasons in 2 years.

>>> df=pd.DataFrame(index=pd.MultiIndex.from_product([[1,2,3,4],[2011,2012],['A','B']], names=['Season','Year','Location']))
>>> df['Value']=np.random.randint(1,100,len(df))
>>> df
                      Value
Season Year Location       
1      2011 A            40
            B             7
       2012 A            81
            B            84
2      2011 A            37
            B            59
       2012 A            30
            B             6
3      2011 A            71
            B            43
       2012 A             3
            B            65
4      2011 A            45
            B            13
       2012 A            38
            B            70
>>>

I would like to create a new series that represents that number of the season sorted by year. For example, the seasons in the first year would just be 1,2,3,4 and then the seasons in the second year would be 5,6,7,8. The series would look like this:

Season  Year  Location
1       2011  A           1
              B           1
        2012  A           5
              B           5
2       2011  A           2
              B           2
        2012  A           6
              B           6
3       2011  A           3
              B           3
        2012  A           7
              B           7
4       2011  A           4
              B           4
        2012  A           8
              B           8
Name: SeasonNum, dtype: int64
>>>

Any suggestions on the best way to do this?

Stefan · Accepted Answer

You could do:

def seasons(row):
    return row['Year'] % 2011 * 4 + row['Season']

df.reset_index(inplace=True)
df['seasons'] = df.apply(seasons, axis=1)
df.set_index(['Season', 'Year', 'Location'], inplace=True)

Unique IDs for sorted groups of indices

Answers (1)

Related Questions