shantanuo

Reputation: 32316

Get index of series where value is True

How do I select only True values?

import pandas as pd

myindex = ['a', 'b', 'c', 'd']
myseries = pd.Series([True, True, False, True], index=myindex)

a     True
b     True
c    False
d     True
dtype: bool

What I have tried:

myseries.where(myseries == True)

The result still includes "c" (as NaN), but I need a list containing only a, b and d.

Upvotes: 6

Views: 10540

Answers (5)

jezrael

Reputation: 862661

Filter the index values with the boolean Series itself:

print (myseries.index[myseries].tolist())
['a', 'b', 'd']

If performance is important, convert both the index and the values to NumPy arrays and then filter:

print (myseries.index.values[myseries.values].tolist())
['a', 'b', 'd']
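As a quick sanity check, the two approaches above can be compared side by side. This is a minimal sketch on the question's four-element series; the variable names via_index and via_numpy are only illustrative.

```python
import pandas as pd

myseries = pd.Series([True, True, False, True], index=['a', 'b', 'c', 'd'])

# Boolean-mask the pandas Index directly with the Series
via_index = myseries.index[myseries].tolist()

# Apply the same mask to the underlying NumPy arrays
via_numpy = myseries.index.values[myseries.values].tolist()

print(via_index)
print(via_numpy)
```

Both variants should return the same labels; the NumPy one just skips some pandas overhead.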

Performance:

np.random.seed(456)

myindex=np.random.randint(100, size=10000).astype(str)
myseries=pd.Series(np.random.choice([True, False], size=10000), index=myindex)
print (myseries)

In [7]: %timeit (myseries.index[myseries].tolist())
178 µs ± 5.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [8]: %timeit (myseries.index.values[myseries.values].tolist())
113 µs ± 762 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Other answers:

In [9]: %timeit myseries[myseries].index.tolist()
456 µs ± 28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [10]: %timeit myseries.where(myseries).dropna().index
1.14 ms ± 28.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [11]: %timeit list(myseries[myseries].index)
886 µs ± 54.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [12]: %timeit [i for i,j in myseries.items() if j==True]
2.13 ms ± 8.36 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
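The %timeit lines above only work in IPython. A rough way to repeat the comparison in a plain script is the standard-library timeit module, sketched below for three of the variants. The candidates dictionary and its labels are made up for this example, and absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(456)
myindex = np.random.randint(100, size=10000).astype(str)
myseries = pd.Series(np.random.choice([True, False], size=10000), index=myindex)

# Hypothetical labels mapped to the expressions timed above
candidates = {
    "index[mask]": lambda: myseries.index[myseries].tolist(),
    "values[mask]": lambda: myseries.index.values[myseries.values].tolist(),
    "series[mask].index": lambda: myseries[myseries].index.tolist(),
}

# Run each variant once to confirm they all agree
results = {name: fn() for name, fn in candidates.items()}

# Average wall-clock time per call over 100 runs
timings = {name: timeit.timeit(fn, number=100) / 100 for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```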

Upvotes: 4

BENY

Reputation: 323236

To fix your code, drop the NaN values that where leaves behind and then take the index:

myseries.where(myseries).dropna().index
Index(['a', 'b', 'd'], dtype='object')
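To see why the dropna() step matters, here is a minimal sketch of the intermediate result. where keeps every label and replaces the entries that fail the condition with NaN, so "c" stays in the index until it is dropped. The names masked, labels_before and labels_after are only illustrative.

```python
import pandas as pd

myseries = pd.Series([True, True, False, True], index=['a', 'b', 'c', 'd'])

# where() keeps the original shape; entries failing the condition become NaN
masked = myseries.where(myseries)
labels_before = masked.index.tolist()  # 'c' is still present here

# dropna() removes the NaN row, leaving only the labels that were True
labels_after = masked.dropna().index.tolist()

print(labels_before)
print(labels_after)
```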

Upvotes: 4

Mohit Motwani

Reputation: 4792

If you just want the index labels of the True values (a, b and d in your case), filter the series with itself and use the index attribute:

myindex=['a', 'b', 'c' , 'd']
myseries=pd.Series([True, True, False, True], index=myindex)

a     True
b     True
c    False
d     True
dtype: bool

myseries[myseries].index
>> Index(['a', 'b', 'd'], dtype='object')

If you want it as a list:

myseries[myseries].index.tolist()
>> ['a', 'b', 'd']

Upvotes: 12

Sociopath

Reputation: 13401

You can use a list comprehension for it:

import pandas as pd 

myindex=['a', 'b', 'c' , 'd']
myseries=pd.Series([True, True, False, True], index=myindex)
vals = [i for i, j in myseries.items() if j]
print(vals)

Output:

['a', 'b', 'd']

Upvotes: 2

kevins_1

Reputation: 1306

The code myseries[myseries] returns

a    True
b    True
d    True
dtype: bool

If you specifically want the list ['a', 'b', 'd'], you can get it with list(myseries[myseries].index).

Upvotes: 4
