Reputation: 687
This is my original DataFrame (with multiindex column):
In [72]:df
Out[72]:
a b
x y x y
0 1.545293 -0.459270 0.899254 -1.010453
1 0.458760 0.275400 -0.190951 0.169195
2 -0.941817 1.109823 0.077953 -0.247074
3 1.790101 -1.643470 0.979625 -1.704657
4 -2.044814 -0.243726 -0.039724 0.600066
and I have another DataFrame:
In [77]:df2
Out[77]:
x y
0 -1.085869 -0.952949
1 0.601585 0.570050
2 0.328601 0.802610
3 -0.415952 -0.090088
4 0.757545 -0.736933
how can I add df2
's columns to df
to get a new DataFrame like this:
In [83]:df3
Out[83]:
a b c
x y x y x y
0 1.545293 -0.459270 0.899254 -1.010453 -1.085869 -0.952949
1 0.458760 0.275400 -0.190951 0.169195 0.601585 0.570050
2 -0.941817 1.109823 0.077953 -0.247074 0.328601 0.802610
3 1.790101 -1.643470 0.979625 -1.704657 -0.415952 -0.090088
4 -2.044814 -0.243726 -0.039724 0.600066 0.757545 -0.736933
My current approach is to use a for loop:
for col in df2.columns:
df['c', col] = df2[col]
is there any method to avoid the loop?
Upvotes: 7
Views: 4534
Reputation: 4855
First, add a column level ("c") to df2
. Then, concatenate both dataframes.
df2.columns = pd.MultiIndex.from_product(["c"], df.columns])
df = pd.concat([df, df2], axis=1)
Upvotes: 1
Reputation: 121
I think stacking is the best solution here:
# First stack df and df2
df, df2 = df.stack(), df2.stack()
# Then add df2 to df as a column
df2['c'] = df2
# Now unstack
df2 = df2.unstack()
Upvotes: 0
Reputation: 2666
I discovered another way to do this in the general case (running Python 3.6), without having to explicitly deconstruct the DataFrame
. You can use pd.concat
with the dictionary argument,
df3 = pd.concat({**df1, **{('c',nm):val for nm,val in df2.items()})
**
expansion on DataFrame
objects seems to return a dictionary of Series objects with "names" equal to the column name string/value, or if the columns are MultiIndexed, the tuple containing the hieararchy of column string/values. Then, when read back into pd.concat
as a dictionary, Pandas
re-constructs the MultiIndexed columns from the tuples.
Note this is much less efficient than the direct assignment you were doing! Since it has to deconstruct each column and MultiIndex of the dataframe, then re-combine.
Upvotes: 1
Reputation: 8683
Try pd.concat
:
pieces = {'a' : df1['a'],
'b' : df1['b'],
'c' : df2}
df3 = pd.concat(pieces, axis=1)
Upvotes: 1