Siraj S.

Reputation: 3751

pandas set index with multilevel columns

Consider the pd.DataFrame below:

import numpy as np
import pandas as pd
df_index = pd.MultiIndex.from_product([['foo','bar'],['one','two','three']])
df = pd.DataFrame(np.random.randint(0,10,size=18, dtype='int').reshape((-1,6)), columns=df_index)

print(df)
  foo           bar
  one two three one two three
0   7   3     8   3   6     0
1   2   5     9   4   3     6
2   4   2     6   6   4     5

I want to set 'foo' and all the sub-columns under it as the index. How can I achieve this? I have been trying 'set_index' and pd.IndexSlice but still can't get to a solution.

Upvotes: 5

Views: 3108

Answers (2)

Confounded

Reputation: 522

How about:

df = df.set_index(df[['foo']].columns.to_list())

And if you want to strip 'foo' out of the index names:

df.index.names = list(zip(*df.index.names))[1]
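
Here is a minimal end-to-end sketch of the two steps above. It assumes fixed data from np.arange instead of the random values in the question, and the names foo_cols and result are only illustrative. The list comprehension does the same job as the zip trick, just spelled out:

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([["foo", "bar"], ["one", "two", "three"]])
# Deterministic stand-in for the random data in the question
df = pd.DataFrame(np.arange(18).reshape(-1, 6), columns=columns)

# Selecting with a list keeps both column levels, so these are full tuples
foo_cols = df[["foo"]].columns.to_list()
result = df.set_index(foo_cols)

# The index level names are tuples such as ('foo', 'one');
# keep only the second element of each
result.index.names = [name[1] for name in result.index.names]

print(result.index.names)
```

After this, result keeps only the 'bar' columns and has a three-level row index named after the sub-columns of 'foo'.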

Upvotes: 0

Chris Adams

Reputation: 18647

With MultiIndex columns, each column key has to be passed as a full tuple, with one element per level. So the correct format should be:

df.set_index([('foo', 'one'), ('foo', 'two'), ('foo', 'three')])

If this is cumbersome, you could build the list of keys with a list comprehension like:

idx = [x for x in df.columns if x[0] == 'foo']
print(idx)
#  [('foo', 'one'), ('foo', 'two'), ('foo', 'three')]

df.set_index(idx)

[out]

                                   bar          
                                   one two three
(foo, one) (foo, two) (foo, three)              
1          3          4              4   8     3
5          1          0              4   7     5
0          0          3              9   1     6
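
One thing worth spelling out: set_index returns a new DataFrame and leaves df as it was, so the result has to be assigned to keep it. A rough sketch that wraps the list comprehension in a function follows. The helper name set_index_from_top_level is hypothetical, not a pandas API, and the data is a fixed np.arange stand-in for the random values in the question:

```python
import numpy as np
import pandas as pd


def set_index_from_top_level(frame, key):
    """Hypothetical helper: use every column under top-level `key` as the index."""
    # Collect the full column tuples whose first level matches `key`
    idx = [col for col in frame.columns if col[0] == key]
    # set_index returns a new DataFrame; `frame` itself is left untouched
    return frame.set_index(idx)


columns = pd.MultiIndex.from_product([["foo", "bar"], ["one", "two", "three"]])
# Deterministic stand-in for the random data in the question
df = pd.DataFrame(np.arange(18).reshape(-1, 6), columns=columns)

indexed = set_index_from_top_level(df, "foo")
print(indexed.index.names)  # the level names are the original column tuples
print(df.shape)             # the original frame still has all six columns
```

The same call works for 'bar' or any other top-level key, since nothing in the function is specific to 'foo'.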

Upvotes: 2
