Reputation: 21260
I have the following DataFrame:
   A  B
0  1  5
1  2  3
2  3  2
3  4  0
4  5  1
How can I get the values of column A that satisfy a condition?
For example, all values that are greater than 3 and less than 6.
Upvotes: 2
Views: 86
Reputation: 42875
You can use boolean indexing, either with conditions for the endpoints of your interval:
df[(df.A > 3) & (df.A < 6)]
or the convenience method .between(), which behind the scenes translates to the above (and hence is very slightly slower). Note that its limits are inclusive by default:
df[df.A.between(4, 5)] # uses inclusive limits
to get:
   A  B
3  4  0
4  5  1
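
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the DataFrame from the question and checks that the two approaches select the same rows. The variable names (mask, by_mask, by_between) are only illustrative and not from the original post.

```python
import pandas as pd

# DataFrame from the question
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [5, 3, 2, 0, 1]})

# Boolean indexing: each comparison yields a boolean Series,
# and & combines them element-wise (the parentheses are required)
mask = (df["A"] > 3) & (df["A"] < 6)
by_mask = df[mask]

# between() includes both limits by default, so 4..5 covers 3 < A < 6
# for this integer column
by_between = df[df["A"].between(4, 5)]

print(by_mask)
print(by_between.equals(by_mask))
```

Note that the two forms only agree here because column A holds integers; for floats, between(4, 5) would miss a value such as 3.5 that the strict comparison keeps.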
Upvotes: 0
Reputation: 862641
Use between (the parameter inclusive=False makes both limits exclusive; since pandas 1.3 this is written inclusive="neither") with boolean indexing:
print (df[df.A.between(4,5)])
Sample:
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6],
                   'B': [5, 3, 2, 0, 2, 1]})
print(df)
   A  B
0  1  5
1  2  3
2  3  2
3  4  0
4  5  2
5  6  1
print(df[df.A.between(4, 5)])  # default: both limits inclusive
   A  B
3  4  0
4  5  2
print(df[df.A.between(3, 6, inclusive="neither")])  # pandas >= 1.3; older versions: inclusive=False
   A  B
3  4  0
4  5  2
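
As a rough sketch of how the inclusive parameter changes the result, the snippet below runs the same limits with three of the string options. It assumes pandas 1.3 or newer, where inclusive accepts "both", "neither", "left" and "right"; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [5, 3, 2, 0, 2, 1]})

# Assumes pandas >= 1.3 (string values for `inclusive`)
open_interval = df[df["A"].between(3, 6, inclusive="neither")]  # 3 <  A <  6
left_closed = df[df["A"].between(3, 6, inclusive="left")]       # 3 <= A <  6
right_closed = df[df["A"].between(3, 6, inclusive="right")]     # 3 <  A <= 6

print(open_interval["A"].tolist())
print(left_closed["A"].tolist())
print(right_closed["A"].tolist())
```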
The timings are essentially the same:
df = pd.concat([df]*10000).reset_index(drop=True)
In [427]: %timeit (df[df.A.between(3,6, inclusive=False)])
The slowest run took 4.72 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.32 ms per loop
In [428]: %timeit (df[(df.A>3) & (df.A<6)])
1000 loops, best of 3: 1.31 ms per loop
Upvotes: 0