user6435943

Reputation: 199

Appending level of column labels to MultiIndex

I am trying to change a single level of column labels within a MultiIndex.

For example,

import numpy as np
import pandas as pd
test = pd.DataFrame(np.random.random((4,4)))
test.columns = pd.MultiIndex.from_product([['Level1'],['A','B','C','D'],['Level3']])

Out: 
     Level1                              
          A         B         C         D
     Level3    Level3    Level3    Level3
0  0.153388  0.253070  0.338756  0.025598
1  0.818954  0.169352  0.851079  0.823263
2  0.535703  0.432627  0.690446  0.599997
3  0.304654  0.919936  0.095747  0.404449

I would like to change the 'Level3' labels to ['1','2','3','4'], but I cannot find a clean way of doing it.

I tried the following, but it produces a MultiIndex with 16 entries (every combination of the second and third levels), which is not what I need:

test.columns = pd.MultiIndex.from_product([['Level1'],['A','B','C','D'],['1','2','3','4']])

The only workaround I have found is to define each level manually up front and rebuild the MultiIndex, e.g.:

level1 = ['Level1','Level1','Level1','Level1']
level2 = ['A','B','C','D']
level3 = ['1','2','3','4']
test = pd.DataFrame(np.random.random((4,4)),columns=[level1,level2,level3])

Is there a neater solution? I am working with large data sets so the above is very cumbersome.

Upvotes: 3

Views: 487

Answers (2)

michael_j_ward

Reputation: 4559

Alternatively, you could build the tuples yourself and use pd.MultiIndex.from_tuples:

test = pd.DataFrame(np.random.random((4,4)))
# pair each letter with its number; use str(number) if string labels such as '1' are required
index_tuples = [('Level1',letter,number) for letter,number in zip(['A','B','C','D'],range(1,4+1))]
test.columns = pd.MultiIndex.from_tuples(index_tuples)
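If the first two levels already exist on the frame, a variation on the same idea is to reuse them rather than retype them. The following is a minimal sketch, assuming the new third-level labels are supplied one per column in column order; the name new_third is only illustrative and does not come from the question.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame(np.random.random((4, 4)))
test.columns = pd.MultiIndex.from_product([['Level1'], ['A', 'B', 'C', 'D'], ['Level3']])

# Hypothetical replacement labels: one per column, in column order.
new_third = ['1', '2', '3', '4']

# Keep the existing values of levels 0 and 1 and swap in the new third level.
test.columns = pd.MultiIndex.from_arrays([
    test.columns.get_level_values(0),
    test.columns.get_level_values(1),
    new_third,
])
```

Unlike from_product, from_arrays pairs the inputs position by position, so the result still has 4 columns rather than 16.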

Upvotes: 1

EdChum

Reputation: 393903

IIUC, you need to do this in two steps: set the level values, then set the labels:

In [153]:
test.columns = test.columns.set_levels(['0','1','2','3'],level=2)
# note: set_labels was renamed to set_codes in pandas 0.24 and removed in pandas 1.0
test.columns = test.columns.set_labels([0,1,2,3],level=2)
test

Out[153]:
     Level1                              
          A         B         C         D
          0         1         2         3
0  0.122865  0.778640  0.582170  0.695648
1  0.051477  0.479084  0.150539  0.143929
2  0.362087  0.285109  0.465092  0.205157
3  0.963744  0.730001  0.148460  0.474678

This is needed because initially every column has the same label (0) in the third level, since the single value 'Level3' is repeated:

In [155]:
test.columns

Out[155]:
MultiIndex(levels=[['Level1'], ['A', 'B', 'C', 'D'], ['Level3']],
           labels=[[0, 0, 0, 0], [0, 1, 2, 3], [0, 0, 0, 0]])

What you want instead is to replace both the level values and the labels, which gives the following:

In [158]:
test.columns

Out[158]:
MultiIndex(levels=[['Level1'], ['A', 'B', 'C', 'D'], ['0', '1', '2', '3']],
           labels=[[0, 0, 0, 0], [0, 1, 2, 3], [0, 1, 2, 3]])

So you can either rebuild the MultiIndex, as you have already tried, or set the level values and then the label values, as shown above.
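For readers on a recent pandas, the same two-step approach can be sketched with set_codes, which replaced set_labels in pandas 0.24 (the labels attribute is likewise called codes there). This is a sketch under that version assumption, and it uses the '1' to '4' labels from the question rather than '0' to '3'.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame(np.random.random((4, 4)))
test.columns = pd.MultiIndex.from_product([['Level1'], ['A', 'B', 'C', 'D'], ['Level3']])

# Step 1: replace the values available in the third level.
cols = test.columns.set_levels(['1', '2', '3', '4'], level=2)

# Step 2: point each column at its own value instead of all at position 0.
cols = cols.set_codes([0, 1, 2, 3], level=2)

test.columns = cols
```

Both methods return a new MultiIndex rather than modifying the existing one, which is why the result is assigned back to test.columns.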

Upvotes: 1
