0004

Reputation: 1260

Pandas: Delete Row if cell contains specific text

This pandas code does not work as intended. I want it to delete a row if the column contains any of the text or numbers provided. Currently I can only get it to work if the cell matches the exact text passed in my code: it only deletes rows whose cell says Fin*, not Finance or Finly.

df2 = df[df.Team != 'Fin*']

Upvotes: 15

Views: 49607

Answers (4)

Tiago Peres

Reputation: 15622

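The != operator compares whole cell values, so the * is not treated as a wildcard or as a regex; the comparison only drops rows whose Team is literally Fin*. Escaping the * with \ does not change that. To drop every row whose Team contains Fin, test for the substring instead:

df2 = df[~df.Team.str.contains('Fin')]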
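
To see the difference between an exact comparison and a substring test, here is a minimal sketch. The sample rows are made up for illustration; only the Team column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the Team column name comes from the question.
df = pd.DataFrame({"Team": ["Finance", "Finly", "Fin*", "Sales"]})

# != compares whole cell values, so only the literal "Fin*" row is dropped.
exact = df[df.Team != "Fin*"]

# str.contains tests for a substring; regex=False treats "Fin" as plain text.
substring = df[~df.Team.str.contains("Fin", regex=False)]

print(exact.Team.tolist())      # ['Finance', 'Finly', 'Sales']
print(substring.Team.tolist())  # ['Sales']
```

Passing regex=False is optional here, but it avoids surprises when the search text contains characters such as * or ( that have a meaning in regular expressions.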

Upvotes: 2

min2bro

Reputation: 4638

import pandas as pd
df = pd.DataFrame(dict(A=[1,2,3,4], C=["abc","def","abcdef", "lmn"]))

df:

    A   C
0   1   abc
1   2   def
2   3   abcdef
3   4   lmn

df[df.C.str.contains("abc") == False]

Or, as suggested by @RafaelC:

df[~df.C.str.contains("abc")]

Output:

    A   C
1   2   def
3   4   lmn
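
If the column can hold missing values, str.contains may return a missing value for those cells, and the result might not be usable as a boolean mask. A possible sketch, assuming you want to keep rows with missing text, is to pass na=False. The None entry below is an assumption added for illustration.

```python
import pandas as pd

# Same shape as the example above, with one missing value added for illustration.
df = pd.DataFrame({"A": [1, 2, 3, 4], "C": ["abc", None, "abcdef", "lmn"]})

# na=False makes missing cells count as "does not contain", giving a clean boolean mask.
mask = df.C.str.contains("abc", na=False)

# Keep the rows where the mask is False.
result = df[~mask]
print(result.A.tolist())  # [2, 4]
```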

Upvotes: 11

BENY

Reputation: 323376

You can use startswith:

df[~df.Team.str.startswith('Fin')]

Or

df[~df.Team.str.contains('Fin')]
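
The two calls are not interchangeable: startswith only looks at the beginning of each value, while contains looks anywhere in it. A small sketch with made-up team names shows where they differ.

```python
import pandas as pd

# Hypothetical team names chosen so that "Fin" appears at the start and in the middle.
df = pd.DataFrame({"Team": ["Finance", "Dolphin Fins", "Sales"]})

# Drops only values that begin with "Fin".
prefix_only = df[~df.Team.str.startswith("Fin")]

# Drops values that have "Fin" anywhere (plain-text search, case-sensitive).
anywhere = df[~df.Team.str.contains("Fin", regex=False)]

print(prefix_only.Team.tolist())  # ['Dolphin Fins', 'Sales']
print(anywhere.Team.tolist())     # ['Sales']
```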

Upvotes: 19

DYZ

Reputation: 57105

You need regular expressions for this operation. Here's a synthetic dataframe:

df = pd.DataFrame({'Team': ['Finance', 'Finally', 'Foo']})

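Here's a dataframe that does not (~) have any Fin's. Note that str.match anchors the pattern at the start of the string, and that the regex 'Fin*' would mean "Fi followed by zero or more n", so the pattern is simply 'Fin':

df[~df.Team.str.match('Fin')]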
#  Team
#2  Foo

If you are sure that a string of interest always starts with Fin, you can use a "softer" method:

df[~df.Team.str.startswith('Fin')]
#  Team
#2  Foo
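
The question asks about dropping rows that contain any of several pieces of text or numbers. One way to sketch that with regular expressions is to join the terms with | after escaping them. The terms list and the sample rows below are assumptions made for illustration.

```python
import re

import pandas as pd

# Hypothetical data and search terms, not taken from the question.
df = pd.DataFrame({"Team": ["Finance", "Finly", "HR 2020", "Sales", "R&D (new)"]})
terms = ["Fin", "2020", "(new)"]

# re.escape keeps characters like ( and ) literal; | means "any of these".
pattern = "|".join(re.escape(t) for t in terms)

# Drop every row whose Team contains at least one of the terms.
kept = df[~df.Team.str.contains(pattern, regex=True, na=False)]
print(kept.Team.tolist())  # ['Sales']
```

Escaping matters here: without it, a term such as (new) would be read as a regex group rather than as literal parentheses.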

Upvotes: 5
