Reputation: 1934
I have two dataframes looking like
import pandas as pd
df1 = pd.DataFrame([2.1,4.2,6.3,8.4,10.5], index=[2,4,6,8,10])
df1.index.name = 't'
df2 = pd.DataFrame(index=pd.MultiIndex.from_tuples([('A','a',1),('A','a',4),
('A','b',5),('A','b',6),('B','c',7),
('B','c',9),('B','d',10),('B','d',11),
], names=('big', 'small', 't')))
I am searching for an efficient way to combine them such that I get
0
big small t
A a 1 NaN
2 2.1
4 4.2
b 5 NaN
6 6.3
B c 7 NaN
8 8.4
9 NaN
d 10 10.5
11 NaN
I.e. I want to get the index levels 0
and 1
of df2
as index levels 0
and 1
in df1
.
Of course a loop over the dataframe would work as well, though not feasible for large dataframes.
EDIT:
It appears from comments below that I should add, the indices big
and small
should be inferred on t
in df1
based on the ordering of t
.
Upvotes: 0
Views: 63
Reputation: 59529
Assuming that you want the unknown index levels to be inferred based on the ordering of 't'
, we can use an other merge, sort the values and then re-create the MultiIndex using ffill
logic (need a Series for this).
res = (df2.reset_index()
.merge(df1, on='t', how='outer')
.set_index(df2.index.names)
.sort_index(level='t'))
res.index = pd.MultiIndex.from_arrays(
[pd.Series(res.index.get_level_values(i)).ffill()
for i in range(res.index.nlevels)],
names=res.index.names)
print(res)
0
big small t
A a 1 NaN
2 2.1
4 4.2
b 5 NaN
6 6.3
B c 7 NaN
8 8.4
9 NaN
d 10 10.5
11 NaN
Upvotes: 1
Reputation: 150735
Try extracting the level values and reindex
:
df2['0'] = df1.reindex(df2.index.get_level_values('t'))[0].values
Output:
0
big small t
A a 1 NaN
4 4.2
b 5 NaN
6 6.3
B c 7 NaN
9 NaN
d 10 10.5
11 NaN
For more columns in df1
, we can just merge
:
(df2.reset_index()
.merge(df1, on='t', how='left')
.set_index(df2.index.names)
)
Upvotes: 1