user3556757

Reputation: 3619

iloc'ing one level of a multiindex

I have a MultiIndex DataFrame, something like:

import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_product([['mike', 'matt', 'dave', 'frank', 'larry'], range(10)]))
df['foo'] = "bar"
df.index.names = ['people', 'socket']

What I'd like to do is iloc-slice all the rows associated with the first three people in the index, i.e. retrieve all the rows where people is mike, matt, or dave.

As far as I can tell, though, this is not supported by pandas. I saw some gross levels-related hacks, but they didn't even work: get_level_values(0) doesn't give distinct level values, and index.levels returns a FrozenList whose order does not match the order in which the people appear in the index.

edit: I should have said that .loc-based solutions won't work for me.

Upvotes: 1

Views: 1096

Answers (4)

Paul Rougieux

Reputation: 11409

You can also use df.xs()

"This method takes a key argument to select data at a particular level of a MultiIndex."

Reusing your example:

import pandas as pd
# The level names are set directly in from_product, so no separate df.index.names assignment is needed
df = pd.DataFrame(index=pd.MultiIndex.from_product([['mike', 'matt', 'dave', 'frank', 'larry'], range(10)], names=['people', 'socket']))
df['foo'] = "bar"


In [60]: df.xs("mike", level="people")
Out[60]:
        foo
socket
0       bar
1       bar
2       bar
3       bar
4       bar
5       bar
6       bar
7       bar
8       bar
9       bar

In [61]: df.xs(7, level="socket")
Out[61]:
        foo
people
mike    bar
matt    bar
dave    bar
frank   bar
larry   bar
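
Note that xs takes one key per level, so selecting several people means one call per person. Below is a minimal sketch of that, assuming the example frame above. The `wanted` list and the use of `drop_level=False` (to keep the people level in the result) are my own additions for illustration:

```python
import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_product(
    [['mike', 'matt', 'dave', 'frank', 'larry'], range(10)],
    names=['people', 'socket']))
df['foo'] = "bar"

# Hypothetical list of people to keep
wanted = ['mike', 'matt', 'dave']

# One cross-section per person; drop_level=False keeps the 'people' level
parts = [df.xs(name, level='people', drop_level=False) for name in wanted]

# Stack the cross-sections back into a single frame
subset = pd.concat(parts)
```

For more than a handful of keys, a boolean mask on the level values is probably simpler than concatenating cross-sections.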

Upvotes: 0

Quang Hoang

Reputation: 150785

Another option:

df[df.index.get_level_values(0)
     .isin({'matt','mike','dave'})]
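
If the names should not be hard-coded, the same mask can be built from the first three distinct people in order of appearance. This is a sketch assuming the example frame from the question; `people` and `first_three` are names I introduce here:

```python
import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_product(
    [['mike', 'matt', 'dave', 'frank', 'larry'], range(10)],
    names=['people', 'socket']))
df['foo'] = "bar"

# Level values repeated for every row
people = df.index.get_level_values('people')

# unique() keeps order of first appearance, so this is a positional pick
first_three = people.unique()[:3]

# Boolean mask over the rows; no .loc involved
subset = df[people.isin(first_three)]
```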

Upvotes: 0

Zaraki Kenpachi

Reputation: 5740

Here you go:

import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_product([['mike', 'matt', 'dave', 'frank', 'larry'], range(10)], names=['people', 'socket']))
df['foo'] = "bar"
# get rows
select_rows = df.loc[['mike', 'matt', 'dave']]

Output:

               foo
people socket
mike   0       bar
       1       bar
       2       bar
       3       bar
       4       bar
       5       bar
       6       bar
       7       bar
       8       bar
       9       bar
matt   0       bar
       1       bar
       2       bar
       3       bar
       4       bar
       5       bar
       6       bar
       7       bar
       8       bar
       9       bar
dave   0       bar
       1       bar
       2       bar
       3       bar
       4       bar
       5       bar
       6       bar
       7       bar
       8       bar
       9       bar

Upvotes: 2

jezrael

Reputation: 863216

One idea is to get the unique values of the first level, take the first three by position, and select them with loc:

df = df.loc[df.index.get_level_values(0).unique()[:3]]

Detail:

print (df.index.get_level_values(0).unique()[:3])
Index(['mike', 'matt', 'dave'], dtype='object', name='people')
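
Since the question rules out .loc, here is a purely positional sketch of the same idea. It assumes the example frame from the question and uses pd.factorize to number the people in order of first appearance; `codes`, `uniques` and `positions` are names I introduce for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(index=pd.MultiIndex.from_product(
    [['mike', 'matt', 'dave', 'frank', 'larry'], range(10)],
    names=['people', 'socket']))
df['foo'] = "bar"

# factorize numbers each person by order of first appearance (mike=0, matt=1, ...)
codes, uniques = pd.factorize(df.index.get_level_values('people'))

# Integer row positions belonging to the first three people
positions = np.flatnonzero(codes < 3)

# Plain positional selection
subset = df.iloc[positions]
```

This should also work when the rows of one person are not contiguous, because the positions are computed from the codes rather than from a fixed slice.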

Upvotes: 0
