Python Pandas Filter Column by Multiple Criteria

Question

I have a dataframe like this:

 A     B        D
 d     151     M
 d     152     a
 d     153     M
 d     161     z
 d     164     M
 d     169     g
 d     178     g
 d     173     e
 d     176     j
 d     605     j
 d     603     d
 d     607     e
 d     905     j
 d     903     d
 d     907     e

I want to filter it based on the values of column B. Specifically i want the values of column B that begins with 15, 16 and 60. My dataframe would be reduced to:

 A     B        D
 d     151     M
 d     152     a
 d     153     M
 d     161     z
 d     164     M
 d     169     g
 d     605     j
 d     603     d
 d     607     e

So far i've been unsuccessful to achieve this result. Which would be the best way to achieve this?

jezrael · Accepted Answer

You need convert to str first and select first 2 values which are compared with isin and filter with boolean indexing:

print (df.B.astype(str).str[:2])
0     15
1     15
2     15
3     16
4     16
5     16
6     17
7     17
8     17
9     60
10    60
11    60
12    90
13    90
14    90
Name: B, dtype: object

print (df[df.B.astype(str).str[:2].isin(['15','16','60'])])
    A    B  D
0   d  151  M
1   d  152  a
2   d  153  M
3   d  161  z
4   d  164  M
5   d  169  g
9   d  605  j
10  d  603  d
11  d  607  e

Another solution is use contains with custom regex - ^ means start of string and | or:

L = [15,16,90]
val = '|'.join(['^' + str(x) for x in L])
print (val)
^15|^16|^90
print (df[df.B.astype(str).str.contains(val)])
    A    B  D
0   d  151  M
1   d  152  a
2   d  153  M
3   d  161  z
4   d  164  M
5   d  169  g
12  d  905  j
13  d  903  d
14  d  907  e

Python Pandas Filter Column by Multiple Criteria

Answers (2)

Related Questions