Reputation:
I would like to add a column to the second level of a multiindex column dataframe.
In [151]: df
Out[151]:
first bar baz
second one two one two
A 0.487880 -0.487661 -1.030176 0.100813
B 0.267913 1.918923 0.132791 0.178503
C 1.550526 -0.312235 -1.177689 -0.081596
The usual trick of direct assignment does not work:
In [152]: df['bar']['three'] = [0, 1, 2]
In [153]: df
Out[153]:
first bar baz
second one two one two
A 0.487880 -0.487661 -1.030176 0.100813
B 0.267913 1.918923 0.132791 0.178503
C 1.550526 -0.312235 -1.177689 -0.081596
How can I add the third row to under "bar"?
Upvotes: 112
Views: 103933
Reputation: 30032
If you want to add multiple columns to a multiindex column dataframe, you can try
df[[("foo", "bar1"), ("foo", "bar2")]] = 2
bar baz foo
one two one two bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 2 2
1 0.267913 1.918923 0.132791 0.178503 2 2
2 1.550526 -0.312235 -1.177689 -0.081596 2 2
df[[("foo", "bar1"), ("foo", "bar2")]] = [2, 3]
bar baz foo
one two one two bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 2 3
1 0.267913 1.918923 0.132791 0.178503 2 3
2 1.550526 -0.312235 -1.177689 -0.081596 2 3
df[[("foo", "bar1"), ("foo", "bar2")]] = np.array([range(len(df)) for _ in range(2)]).T
bar baz foo
one two one two bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 0 0
1 0.267913 1.918923 0.132791 0.178503 1 1
2 1.550526 -0.312235 -1.177689 -0.081596 2 2
df[[("foo", "bar1"), ("foo", "bar2")]] = [[1, 2],
[3, 4],
[5, 6]] # shape is (3, 2) where 3 is index length and 2 is new added column length
bar baz foo
one two one two bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 1 2
1 0.267913 1.918923 0.132791 0.178503 3 4
2 1.550526 -0.312235 -1.177689 -0.081596 5 6
Another usecase is that we have a single index dataframe, and we want to concat it to the multi index dataframe
bar baz
one two one two concat to bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 <--------- 0 1 2
1 0.267913 1.918923 0.132791 0.178503 1 3 4
2 1.550526 -0.312235 -1.177689 -0.081596 2 5 6
df[[("foo", col) for col in single_index_df.columns]] = single_index_df
bar baz foo
one two one two bar1 bar2
0 0.487880 -0.487661 -1.030176 0.100813 1 2
1 0.267913 1.918923 0.132791 0.178503 3 4
2 1.550526 -0.312235 -1.177689 -0.081596 5 6
df = df.join(pd.DataFrame(single_index_df.values,
columns=pd.MultiIndex.from_product([['foo'], single_index_df.columns]),
index=single_index_df.index))
pd.concat({'foo': single_index_df}, axis=1)
df = pd.concat([df, pd.concat({'foo': single_index_df}, axis=1)], axis=1)
# or
df = df.join(pd.concat({'foo': single_index_df}, axis=1))
Upvotes: 5
Reputation: 210842
If we want to add a multi-level column:
Source DF:
In [221]: df
Out[221]:
first bar baz
second one two one two
A -1.089798 2.053026 0.470218 1.440740
B 0.488875 0.428836 1.413451 -0.683677
C -0.243064 -0.069446 -0.911166 0.478370
Option 1: adding result of division: bar / baz
as a new foo
column
In [222]: df = df.join(
...: df[['bar']].div(df['baz']).rename(columns={'bar':'foo'}))
In [223]: df
Out[223]:
first bar baz foo
second one two one two one two
A -1.089798 2.053026 0.470218 1.440740 -2.317647 1.424980
B 0.488875 0.428836 1.413451 -0.683677 0.345873 -0.627250
C -0.243064 -0.069446 -0.911166 0.478370 0.266761 -0.145172
Option 2: adding multi-level column with three "sub-columns":
In [235]: df = df.join(pd.DataFrame(
...: np.random.rand(3,3),
...: columns=pd.MultiIndex.from_product([['new'], ['one','two','three']]),
...: index=df.index))
In [236]: df
Out[236]:
first bar baz new
second one two one two one two three
A -1.089798 2.053026 0.470218 1.440740 0.274291 0.636257 0.091048
B 0.488875 0.428836 1.413451 -0.683677 0.668157 0.456931 0.227568
C -0.243064 -0.069446 -0.911166 0.478370 0.333824 0.363060 0.949672
Upvotes: 28
Reputation: 2636
If you want to insert
(instead of append at the end of the DF) do this:
df.insert(0, ('bar', 'three'), [0, 1, 2])
The second item has to be hashable, so a list will not work.
Upvotes: 1
Reputation: 9676
It's actually pretty simple (FWIW, I originally thought to do it your way):
df['bar', 'three'] = [0, 1, 2]
df = df.sort_index(axis=1)
print(df)
bar baz
one two three one two
A -0.212901 0.503615 0 -1.660945 0.446778
B -0.803926 -0.417570 1 -0.336827 0.989343
C 3.400885 -0.214245 2 0.895745 1.011671
Upvotes: 127