Reputation: 946
Say you have a multiindex DataFrame
x y z
a 1 0 1 2
2 3 4 5
b 1 0 1 2
2 3 4 5
3 6 7 8
c 1 0 1 2
2 0 4 6
Now you have another DataFrame which is
col1 col2
0 a 1
1 b 1
2 b 3
3 c 1
4 c 2
How do you split the multiindex DataFrame based on the one above?
Upvotes: 2
Views: 423
Reputation: 323276
Simply using isin
df[df.index.isin(list(zip(df2['col1'],df2['col2'])))]
Out[342]:
0 1 2 3
index1 index2
a 1 1 0 1 2
b 1 1 0 1 2
3 3 6 7 8
c 1 1 0 1 2
2 2 0 4 6
Upvotes: 1
Reputation: 333
You can also do this using the MultiIndex reindex method https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html
## Recreate your dataframes
tuples = [('a', 1), ('a', 2),
('b', 1), ('b', 2),
('b', 3), ('c', 1),
('c', 2)]
data = [[1, 0, 1, 2],
[2, 3, 4, 5],
[1, 0, 1, 2],
[2, 3, 4, 5],
[3, 6, 7, 8],
[1, 0, 1, 2],
[2, 0, 4, 6]]
idx = pd.MultiIndex.from_tuples(tuples, names=['index1','index2'])
df= pd.DataFrame(data=data, index=idx)
df2 = pd.DataFrame([['a', 1],
['b', 1],
['b', 3],
['c', 1],
['c', 2]])
# Answer Question
idx_subset = pd.MultiIndex.from_tuples([(a, b) for a, b in df2.values], names=['index1', 'index2'])
out = df.reindex(idx_subset)
print(out)
0 1 2 3
index1 index2
a 1 1 0 1 2
b 1 1 0 1 2
3 3 6 7 8
c 1 1 0 1 2
2 2 0 4 6
Upvotes: 0
Reputation: 862741
Use loc
by tuple
s:
df = df1.loc[df2.set_index(['col1','col2']).index.tolist()]
print (df)
x y z
a 1 0 1 2
b 1 0 1 2
3 6 7 8
c 1 0 1 2
2 0 4 6
df = df1.loc[[tuple(x) for x in df2.values.tolist()]]
print (df)
x y z
a 1 0 1 2
b 1 0 1 2
3 6 7 8
c 1 0 1 2
2 0 4 6
Or join
:
df = df2.join(df1, on=['col1','col2']).set_index(['col1','col2'])
print (df)
x y z
col1 col2
a 1 0 1 2
b 1 0 1 2
3 6 7 8
c 1 0 1 2
2 0 4 6
Upvotes: 2