Dickster

Reputation: 3009

MultiIndex Slice | slice with another index across a subset of levels

What is the best way to slice a DataFrame with a MultiIndex using another index whose levels are a subset of the main MultiIndex's levels?

import numpy as np
import pandas as pd

np.random.seed(1)
dict_data_russian = {'alpha':[1,2,3,4,5,6,7,8,9],'beta':['a','b','c','d','e','f','g','h','i'],'gamma':['r','s','t','u','v','w','x','y','z'],'value_r': np.random.rand(9)}
dict_data_doll = {'beta':['d','e','f'],'gamma':['u','v','w'],'dont_care': list('PQR')}
# Main frame, indexed on three levels
df_russian = pd.DataFrame(data=dict_data_russian)
df_russian.set_index(['alpha','beta','gamma'],inplace=True)
# Smaller frame, indexed on two of those levels
df_doll = pd.DataFrame(data=dict_data_doll)
df_doll.set_index(['beta','gamma'],inplace=True)

print(df_russian)
print(df_doll.head())

Which yields:

                  value_r
alpha beta gamma         
1     a    r       0.4170
2     b    s       0.7203
3     c    t       0.0001
4     d    u       0.3023
5     e    v       0.1468
6     f    w       0.0923
7     g    x       0.1863
8     h    y       0.3456
9     i    z       0.3968


           dont_care
beta gamma          
d    u             P
e    v             Q
f    w             R

How can I best use the index of df_doll to slice df_russian on levels beta and gamma, in order to get the following output?

                  value_r
alpha beta gamma         
4     d    u       0.3023
5     e    v       0.1468
6     f    w       0.0923

Upvotes: 0

Views: 46

Answers (2)

Grant Cavanaugh

Reputation: 29

You could strip off the index, join the frames on the shared levels, then add the index back:

result = df_doll.reset_index().merge(df_russian.reset_index(), on=['beta', 'gamma'], how='inner')
result.set_index(['alpha', 'beta', 'gamma'], inplace=True)
result = result.drop(columns='dont_care')

Upvotes: 0

JoeCondron

Reputation: 8906

You can do:

In [1131]: df_russian[df_russian.reset_index(0).index.isin(df_doll.index)]
Out[1131]:
                   value_r
alpha beta gamma
4     d    u      0.302333
5     e    v      0.146756
6     f    w      0.092339

This uses a boolean key: reset_index(0) moves the outer level (alpha) out of the index, leaving a (beta, gamma) MultiIndex, and isin then checks, for each row, whether that (beta, gamma) pair appears in the index of df_doll.
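
For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. Note that it swaps reset_index(0) for index.droplevel('alpha'), which should give the same (beta, gamma) MultiIndex without building an intermediate frame; the names inner_pairs, mask and subset are only illustrative and are not from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question
np.random.seed(1)
df_russian = pd.DataFrame({
    'alpha': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'beta': list('abcdefghi'),
    'gamma': list('rstuvwxyz'),
    'value_r': np.random.rand(9),
}).set_index(['alpha', 'beta', 'gamma'])
df_doll = pd.DataFrame({
    'beta': list('def'),
    'gamma': list('uvw'),
    'dont_care': list('PQR'),
}).set_index(['beta', 'gamma'])

# Drop the outer level so only the (beta, gamma) pairs remain, in row order
# (assumption: droplevel is used here instead of reset_index(0))
inner_pairs = df_russian.index.droplevel('alpha')

# True where a row's (beta, gamma) pair also occurs in df_doll's index
mask = inner_pairs.isin(df_doll.index)

# Boolean indexing keeps the full three-level index on the result
subset = df_russian[mask]
print(subset)
```

Because the mask has one entry per row of df_russian, the original three-level index is preserved, so there is no need to set it again afterwards.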

Upvotes: 2
