Simon

Reputation: 23

Pandas Multi index DataFrame add subindex to each index

I have a multi-index data frame with rows "bar" and "baz", and each of these rows has a row "one" and "two". I now want to add a row "three" to each of the rows "bar" and "baz".

Is there an elegant way to do so?

For example:

import pandas as pd
import numpy as np

arrays = [["bar", "bar", "baz", "baz"],
          ["one", "two", "one", "two"]]

tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=index)
In [38]: df
Out[38]: 
first        bar                 baz          
second       one       two       one       two
A       0.357392 -1.880279  0.099014  1.354570
B       0.474572  0.442074 -1.173530 -1.362059
C      -0.980140 -0.173440 -1.490654 -0.539123

And I want something like this:

first        bar                           baz                    
second       one       two     three       one       two     three
A      -0.096890  0.012150       nan -0.749569 -0.965033       nan
B      -0.854206  0.118473       nan  0.263058 -0.025849       nan
C      -0.688007 -0.258569       nan  0.127305 -0.955044       nan

Upvotes: 2

Views: 692

Answers (2)

Hoori M.

Reputation: 730

For a general answer when you don't necessarily know the names of indices in level 0 and generally want to do this for every level 0 index:

First, create the NaN matrix to inject. It has len(df) rows and one column per level-0 label in the dataframe's columns. Then wrap it in a dataframe with the same index as the original and MultiIndex columns. Note that for this dataframe we only need levels[0] of the original columns, because the second level is always 'three'.

a = np.full((len(df),len(df.columns.levels[0])), np.nan)

inject_df = pd.DataFrame(a, index=df.index, columns=pd.MultiIndex.from_product([df.columns.levels[0], ['three']]))
inject_df

first  bar     baz
       three   three
A      NaN     NaN
B      NaN     NaN
C      NaN     NaN

Finally, we concat the injected df with the original one and sort the column index so that columns sharing the same level-0 label end up next to each other.

result = pd.concat([df, inject_df], axis=1).sort_index(level=0, axis=1)
result

first   bar                         baz
second  one    three    two         one        three    two
A    -0.995944  NaN   -0.437629    -0.629472    NaN    1.919711
B    -0.402886  NaN   0.262420      0.117202    NaN    -1.234542
C    1.281046   NaN   -1.058977     0.447767    NaN    2.374122
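Note that sorting puts 'three' between 'one' and 'two', because the second level is sorted alphabetically. If the order one, two, three from the question matters, one possible alternative is to build the full target column index and reindex against it. This is only a sketch, not part of the approach above: the name new_columns and the hard-coded list of second-level labels are my own assumptions.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

# Target columns: every level-0 label combined with the desired
# second-level order (assumed here to be one, two, three).
new_columns = pd.MultiIndex.from_product(
    [df.columns.get_level_values(0).unique(), ["one", "two", "three"]],
    names=df.columns.names,
)

# reindex keeps the existing columns and fills the missing ones with NaN.
result = df.reindex(columns=new_columns)
print(result)
```

This avoids building the NaN matrix by hand, at the cost of having to spell out the second-level order explicitly.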

Upvotes: 1

r-beginners

Reputation: 35155

I don't know how Pythonic it is, but there are two ways to do this: simple substitution and using insert.

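  1. Substitution
# Assigning to a new key appends the column at the end of the frame.
df[('bar','three')] = np.nan
df[('baz','three')] = np.nan
  2. insert
# insert places the column at the given position, right after each group.
df.insert(2,('bar','three'),np.nan)
df.insert(5,('baz','three'),np.nan)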
first                   bar                     baz
second  one     two     three   one     two     three
A   -0.973338   -0.233507   NaN     0.777288    -2.282688   NaN
B   -0.377486   0.080627    NaN     0.401302    0.355696    NaN
C   0.481056    0.651335    NaN     0.161145    1.001937    NaN
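If there are more than two top-level labels, the hard-coded positions 2 and 5 get awkward. A rough sketch of the same insert idea in a loop could look like the following; the loop variables top and pos are names I made up, and it assumes the columns of each top-level label sit next to each other, as in the example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

# The unique labels are computed once, before any column is inserted.
for top in df.columns.get_level_values(0).unique():
    # Position just past the last existing column under this label.
    pos = max(i for i, col in enumerate(df.columns) if col[0] == top) + 1
    df.insert(pos, (top, "three"), np.nan)

print(df)
```

Like the two insert calls above, this modifies df in place.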

Upvotes: 1
