Reputation: 1042

Filtering values in pandas Dataframe by condition on index column

Given that I have the following pandas DataFrame:

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
          np.array([0.01, 0.2, 0.3, -0.5, 0.6, -0.7, -0.8, 0.9])]

tuples = list(zip(*arrays))
df_index = pd.MultiIndex.from_tuples(tuples, names=['A', 'B', 'measure'])

df = pd.DataFrame(np.random.randn(8, 4), index=df_index)
print(df)

How can I filter all the values where for example the measure column (which is part of the index) is greater than 0.2?

I have tried:

df.loc[:,:,0.1:0.9]

(and other variations of this, but I get the error "IndexingError: Too many indexers"

Thanks, Gerald

Upvotes: 2

Answers (4)

Phik

Reputation: 434

In line with your initial approach, you could use IndexSlice

df.sort_index().loc[pd.IndexSlice[:, :, 0.2:], :]

Upvotes: 0

zipa

Reputation: 27879

This does the trick:

df.iloc[df.index.get_level_values(2) >= 0.2]

Or if you like:

df.iloc[df.index.get_level_values('measure') >= 0.2]

Upvotes: 2

BENY

Reputation: 323316

Something like get_level_values

df[df.index.get_level_values(2)>0.2]
Out[35]: 
                        0         1         2         3
A   B   measure                                        
baz one 0.3     -0.235196  0.183122 -1.620810  0.912996
foo one 0.6     -1.456278 -1.144081 -0.872170  0.547008
qux two 0.9      0.942656 -0.435219 -0.161408 -0.451456

Upvotes: 4

MaxU - stand with Ukraine

Reputation: 210882

In [3]: df.query("measure > 0.2")
Out[3]:
                        0         1         2         3
A   B   measure
baz one 0.3      0.623507  0.602585 -0.792142  2.066095
foo one 0.6      0.138192 -0.159108 -1.796944  1.668463
qux two 0.9     -0.162210 -2.293951  0.602990  1.622783

In [6]: df.loc[pd.IndexSlice[:,:,0.200001:], :]
Out[6]:
                        0         1         2         3
A   B   measure
baz one 0.3      0.623507  0.602585 -0.792142  2.066095
foo one 0.6      0.138192 -0.159108 -1.796944  1.668463
qux two 0.9     -0.162210 -2.293951  0.602990  1.622783

Upvotes: 6

Filtering values in pandas Dataframe by condition on index column

Answers (4)

Related Questions