Arthur
Arthur

Reputation: 125

How to convert a dataframe into a pandas IndexSlice to index another MultiIndex dataframe

I have the following 2 dataframes df1 and df2:

import pandas as pd

m_idx = pd.MultiIndex.from_product([range(3), range(3, 6), range(6, 8)])
m_idx.names = ['a', 'b', 'c']
df1 = pd.DataFrame(None, index=m_idx, columns=['x1', 'x2', 'x3'])
df1.loc[:, 'x1'] = m_idx.get_level_values('a') + m_idx.get_level_values('b') + m_idx.get_level_values('c')
df1.loc[:, 'x2'] = df1.loc[:, 'x1'] * 2
df1.loc[:, 'x3'] = df1.loc[:, 'x1'] * 3

df2 = pd.DataFrame({'a': [0, 2], 'c': [6, 6]})

df1:

       x1  x2  x3
a b c            
0 3 6   9  18  27
    7  10  20  30
  4 6  10  20  30
    7  11  22  33
  5 6  11  22  33
    7  12  24  36
1 3 6  10  20  30
    7  11  22  33
  4 6  11  22  33
    7  12  24  36
  5 6  12  24  36
    7  13  26  39
2 3 6  11  22  33
    7  12  24  36
  4 6  12  24  36
    7  13  26  39
  5 6  13  26  39
    7  14  28  42

df2:

   a  c
0  0  6
1  2  6

How can I convert df2 into something I can use to look up the index of df1 where the column names of df2 are the levels and in each row you have the combination of keys you are looking to get out of the df1 index.

Or in other words how can I convert df2 into something that does the equivalent of

df1.loc[pd.IndexSlice[[0, 2], :, [6, 6]], :]

which would return:

       x1  x2  x3
a b c            
0 3 6   9  18  27
  4 6  10  20  30
  5 6  11  22  33
2 3 6  11  22  33
  4 6  12  24  36
  5 6  13  26  39

This is a very simplified and small scale version of what I am actually trying to solve. So really looking to create the pd.IndexSlice on the fly.

I see a separate question that suggested this and I have done something similar in my code BUT it takes too long for my purposes.

df_list = [df1.loc[(v[0], slice(None), v[1]), :] for r, v in df2.iterrows()]
df_sliced = pd.concat(df_list)

So am hoping that using pd.IndexSlice or another alternative instead could be much quicker.

MANY THANKS!

Upvotes: 1

Views: 119

Answers (1)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Convert the index of df1 to dataframe then use isin + all on the matching levels to replicate the behaviour of index slice

d = df2.to_dict('list')
df1[df1.index.to_frame()[d].isin(d).all(1)]

       x1  x2  x3
a b c            
0 3 6   9  18  27
  4 6  10  20  30
  5 6  11  22  33
2 3 6  11  22  33
  4 6  12  24  36
  5 6  13  26  39

Upvotes: 1

Related Questions