A User

Reputation: 862

Pandas: get multiindex level as series

I have a dataframe with a multi-level index, e.g.:

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product((['foo', 'bar'], ['one', 'five', 'three', 'four']),
                                 names=['first', 'second'])
df = pd.DataFrame({'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]}, index=idx).dropna().astype(int)

              A     
first second
foo   five     12
      four     11
bar   one      16
      five     12
      three    11

I want to create a new column using the index level titled second, so that I get

              A    B  
first second
foo   five     12   five
      four     11   four
bar   one      16   one
      five     12   five
      three    11   three

I can do this by resetting the index, copying the column, and then setting the index again, but that seems roundabout.

I tried df.index.levels[1], but that returns the sorted unique values of the level rather than one value per row, so the row order is not preserved.

If it were a single index, I would use df.index, but with a MultiIndex that creates a column of tuples.
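
To make the attempts above concrete, here is a minimal sketch on the same data. It assumes the second level was meant to be 'one', 'five', 'three', 'four' (with a comma after 'three'); the variable names level_values, index_tuples and roundabout are only illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),  # assumed comma after 'three'
    names=['first', 'second'],
)
df = pd.DataFrame(
    {'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]}, index=idx
).dropna().astype(int)

# Attempt 1: .levels holds each level's unique values, not one value per row.
level_values = list(df.index.levels[1])

# Attempt 2: the index itself is a sequence of (first, second) tuples.
index_tuples = list(df.index)

# The roundabout route: move the levels into columns, copy one, restore the index.
roundabout = df.reset_index()
roundabout['B'] = roundabout['second']
roundabout = roundabout.set_index(['first', 'second'])
```

Only the last route gives one value per row in row order, which is why a direct accessor would be nicer.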

If this has been answered elsewhere, please share the link; I haven't had any luck searching the Stack Overflow archives.

Upvotes: 16

Views: 11087

Answers (3)

questionto42

Reputation: 9630

If you want to fetch the values of an index level by the level's name (instead of its numeric position), you can borrow this from @AlbertoGarcia-Raboso's answer.

Note that the result is a series, as the question asks for, and that it still carries the MultiIndex. When printed, this can look like a repeated column at first.

df.index.to_frame()['second']

(You can then fetch, for example, the 9th item of that series by position with df.index.to_frame()['second'].iloc[8].)
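
A minimal sketch of this on the question's data, assuming the second level was meant to be 'one', 'five', 'three', 'four'. The names second and third_item are only illustrative. The example frame has five rows, so it looks up the 3rd item rather than the 9th.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),  # assumed comma after 'three'
    names=['first', 'second'],
)
df = pd.DataFrame(
    {'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]}, index=idx
).dropna().astype(int)

# to_frame() turns each index level into a column named after the level;
# selecting one column gives a Series that is still indexed by the MultiIndex.
second = df.index.to_frame()['second']

# Positional lookup of the 3rd item in that Series.
third_item = second.iloc[2]

# Because the Series shares df's index, it can be assigned directly as a column.
df['B'] = second
```

If only the bare values are wanted, without the index attached, second.to_numpy() should do.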

Upvotes: 0

Alexander

Reputation: 109726

df['B'] = df.index.get_level_values(level=1)  # Zero based indexing.
# df['B'] = df.index.get_level_values(level='second')  # This also works.
>>> df
               A      B
first second           
foo   five    12   five
      four    11   four
bar   one     16    one
      five    12   five
      three   11  three

Upvotes: 17

Alicia Garcia-Raboso

Reputation: 13913

df['B'] = idx.to_series().str[1]
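
For context, a sketch of why this one-liner works, assuming the question's data with the second level 'one', 'five', 'three', 'four'. to_series() gives a series of (first, second) tuples, and .str[1] picks element 1 of each tuple. The sketch uses df.index instead of idx so that it does not rely on aligning against the index from before dropna(); that substitution is an assumption of this example, not part of the answer.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product(
    (['foo', 'bar'], ['one', 'five', 'three', 'four']),  # assumed comma after 'three'
    names=['first', 'second'],
)
df = pd.DataFrame(
    {'A': [np.nan, 12, np.nan, 11, 16, 12, 11, np.nan]}, index=idx
).dropna().astype(int)

# Each element of this Series is a (first, second) tuple.
tuples = df.index.to_series()

# .str[1] indexes into every element, here returning the 'second' part.
df['B'] = tuples.str[1]

# Cross-check against the level values taken directly from the index.
same_as_level = (
    df['B'].to_numpy() == df.index.get_level_values('second').to_numpy()
).all()
```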

Upvotes: 4
