Reputation: 23
I have a data frame with multi-index columns: the top level has "bar" and "baz", and each of these has the sub-columns "one" and "two". I now want to add a sub-column "three" to both "bar" and "baz".
Is there an elegant way to do so?
For example:
import pandas as pd
import numpy as np
arrays = [["bar", "bar", "baz", "baz"],
["one", "two", "one", "two"]]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=index)
In [38]: df
Out[38]:
first bar baz
second one two one two
A 0.357392 -1.880279 0.099014 1.354570
B 0.474572 0.442074 -1.173530 -1.362059
C -0.980140 -0.173440 -1.490654 -0.539123
And I want something like this:
first bar baz
second one two three one two three
A -0.096890 0.012150 nan -0.749569 -0.965033 nan
B -0.854206 0.118473 nan 0.263058 -0.025849 nan
C -0.688007 -0.258569 nan 0.127305 -0.955044 nan
Upvotes: 2
Views: 692
Reputation: 730
For a general answer, when you don't necessarily know the names of the level-0 labels and want to do this for every one of them:
First, create the NaN matrix we want to inject. It has len(df) rows and one column per level-0 label in the dataframe. Then turn it into a dataframe with the same row index as the original and multi-index columns. For these columns we only need levels[0] of the original dataframe, because the second level is always 'three'.
a = np.full((len(df),len(df.columns.levels[0])), np.nan)
inject_df = pd.DataFrame(a, index=df.index, columns=pd.MultiIndex.from_product([df.columns.levels[0], ['three']]))
inject_df
first bar baz
three three
A NaN NaN
B NaN NaN
C NaN NaN
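Finally, we concatenate the injected df with the original one along the columns and sort the column index so that the columns sharing a level-0 label fall beside each other. Note that the sort is alphabetical, so 'three' lands between 'one' and 'two'.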
result = pd.concat([df, inject_df], axis=1).sort_index(level=0, axis=1)
result
first bar baz
second one three two one three two
A -0.995944 NaN -0.437629 -0.629472 NaN 1.919711
B -0.402886 NaN 0.262420 0.117202 NaN -1.234542
C 1.281046 NaN -1.058977 0.447767 NaN 2.374122
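If the order one, two, three from the question matters, one possible alternative to concat plus sort is to reindex the columns against an explicitly built target index. This is only a sketch: the names second_order and target are made up for illustration, and it assumes every level-0 label should receive the same set of second-level labels.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

# Desired order of the second level (hypothetical name, chosen for this sketch).
second_order = ["one", "two", "three"]

# Build the full target column index: every level-0 label x every second-level label.
target = pd.MultiIndex.from_product(
    [df.columns.get_level_values(0).unique(), second_order],
    names=df.columns.names,
)

# reindex keeps existing columns and fills the missing ones with NaN.
result = df.reindex(columns=target)
print(result)
```

Because the target index is built explicitly, no sorting step is needed and the existing values are carried over unchanged.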
Upvotes: 1
Reputation: 35155
I don't know how Pythonic it is, but there are two ways to do this: plain assignment, or DataFrame.insert if you want to control the column position. Use one or the other, not both.
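# Option 1: plain assignment (the new columns are appended at the end)
df[('bar','three')] = np.nan
df[('baz','three')] = np.nan
# Option 2: insert at explicit positions (raises if the column already exists)
df.insert(2,('bar','three'),np.nan)
df.insert(5,('baz','three'),np.nan)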
first bar baz
second one two three one two three
A -0.973338 -0.233507 NaN 0.777288 -2.282688 NaN
B -0.377486 0.080627 NaN 0.401302 0.355696 NaN
C 0.481056 0.651335 NaN 0.161145 1.001937 NaN
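The hard-coded positions 2 and 5 only fit this particular frame. A minimal sketch of how the insert approach might be generalized, assuming the columns of each level-0 label are contiguous and that 'three' should go right after the last existing column of each group (the variable names here are illustrative):

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

# The list of labels is computed once, before any column is inserted.
for label in df.columns.get_level_values(0).unique():
    # Integer positions of the columns that currently belong to this label.
    positions = np.flatnonzero(df.columns.get_level_values(0) == label)
    # Insert the new NaN column right after the last column of the group.
    df.insert(int(positions[-1]) + 1, (label, "three"), np.nan)

print(df)
```

The positions are recomputed inside the loop because every insert shifts the columns that follow it.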
Upvotes: 1