Reputation: 3009
I am wondering what the best way is to slice a DataFrame with a MultiIndex using another index, where the other index covers a subset of the main MultiIndex's levels.
import numpy as np
import pandas as pd

np.random.seed(1)
dict_data_russian = {'alpha':[1,2,3,4,5,6,7,8,9],'beta':['a','b','c','d','e','f','g','h','i'],'gamma':['r','s','t','u','v','w','x','y','z'],'value_r': np.random.rand(9)}
dict_data_doll = {'beta':['d','e','f'],'gamma':['u','v','w'],'dont_care': list('PQR')}
df_russian = pd.DataFrame(data=dict_data_russian)
df_russian.set_index(['alpha','beta','gamma'],inplace=True)
df_doll = pd.DataFrame(data=dict_data_doll)
df_doll.set_index(['beta','gamma'],inplace=True)
print(df_russian)
print(df_doll.head())
Which yields:
                  value_r
alpha beta gamma
1     a    r       0.4170
2     b    s       0.7203
3     c    t       0.0001
4     d    u       0.3023
5     e    v       0.1468
6     f    w       0.0923
7     g    x       0.1863
8     h    y       0.3456
9     i    z       0.3968

            dont_care
beta gamma
d    u              P
e    v              Q
f    w              R
How can I best use the index of df_doll to slice df_russian, on levels beta and gamma, in order to get the following output?
                  value_r
alpha beta gamma
4     d    u       0.3023
5     e    v       0.1468
6     f    w       0.0923
Upvotes: 0
Views: 46
Reputation: 29
You could strip off the index, join the frames on the shared levels, then add the index back:
result = df_doll.reset_index().merge(df_russian.reset_index(), on=['beta', 'gamma'], how='inner')
result.set_index(['alpha', 'beta', 'gamma'], inplace=True)
# drop() returns a new frame, so assign the result back
result = result.drop(columns='dont_care')
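For reference, here is a self-contained sketch of the same merge round trip, rebuilt from the question's data. It is only one way to arrange it: it merges against just the key columns of df_doll (via `MultiIndex.to_frame`), so there is no extra column to drop afterwards. The names `keys`, `merged` and `result_merge` are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question.
np.random.seed(1)
df_russian = pd.DataFrame({
    'alpha': list(range(1, 10)),
    'beta': list('abcdefghi'),
    'gamma': list('rstuvwxyz'),
    'value_r': np.random.rand(9),
}).set_index(['alpha', 'beta', 'gamma'])
df_doll = pd.DataFrame({
    'beta': list('def'),
    'gamma': list('uvw'),
    'dont_care': list('PQR'),
}).set_index(['beta', 'gamma'])

# Keep only the (beta, gamma) key columns of df_doll as ordinary columns.
keys = df_doll.index.to_frame(index=False)

# Inner merge on the shared levels, then restore the original three-level index.
merged = df_russian.reset_index().merge(keys, on=['beta', 'gamma'], how='inner')
result_merge = merged.set_index(['alpha', 'beta', 'gamma'])

print(result_merge)
```

One thing to keep in mind: an inner merge repeats rows of df_russian if df_doll contains duplicate (beta, gamma) pairs, so this assumes the keys in df_doll are unique.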
Upvotes: 0
Reputation: 8906
You can do this with a boolean mask:
In [1131]: df_russian[df_russian.reset_index(0).index.isin(df_doll.index)]
Out[1131]:
                   value_r
alpha beta gamma
4     d    u      0.302333
5     e    v      0.146756
6     f    w      0.092339
This uses a boolean mask derived by resetting the outer level (alpha) of the main index and checking, for each row, whether the remaining levels (beta, gamma) are in the index of df_doll.
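A slightly different way to build the same mask, sketched here as an alternative rather than as what the answer itself does, is to drop the outer level from the index directly with `MultiIndex.droplevel` instead of going through `reset_index`. The names `inner`, `mask` and `result_mask` are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question.
np.random.seed(1)
df_russian = pd.DataFrame({
    'alpha': list(range(1, 10)),
    'beta': list('abcdefghi'),
    'gamma': list('rstuvwxyz'),
    'value_r': np.random.rand(9),
}).set_index(['alpha', 'beta', 'gamma'])
df_doll = pd.DataFrame({
    'beta': list('def'),
    'gamma': list('uvw'),
    'dont_care': list('PQR'),
}).set_index(['beta', 'gamma'])

# Drop the outer level so the remaining (beta, gamma) levels line up with df_doll's index.
inner = df_russian.index.droplevel('alpha')

# One boolean per row of df_russian: True where (beta, gamma) occurs in df_doll's index.
mask = inner.isin(df_doll.index)

# Boolean indexing keeps the full three-level index on the selected rows.
result_mask = df_russian[mask]

print(result_mask)
```

Because this only filters rows of df_russian, the original row order and index are preserved, and duplicate keys in df_doll do not produce duplicate rows.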
Upvotes: 2