Reputation: 1260
This pandas code does not work. I want to delete a row if its column contains any of the given text or numbers. Currently it only works when the cell matches the passed text exactly: it deletes only cells that say Fin*, not Finance or Finly.
df2 = df[df.Team != 'Fin*']
Upvotes: 15
Views: 49607
Reputation: 15622
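The != operator tests for exact equality with the literal string 'Fin*'; the * is not expanded as a wildcard or regex, so only cells that are exactly Fin* are removed. To drop every row whose Team contains Fin, use str.contains instead: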
df2 = df[~df.Team.str.contains('Fin')]
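A minimal sketch contrasting an exact comparison with a substring match. The sample team names are hypothetical and not from the question; regex=False is used so that the pattern is treated as plain text.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
df = pd.DataFrame({"Team": ["Finance", "Finly", "Fin*", "HR"]})

# != is an exact, literal comparison: only the row equal to "Fin*" is dropped.
exact = df[df.Team != "Fin*"]

# str.contains does substring matching; regex=False treats the pattern literally.
substring = df[~df.Team.str.contains("Fin", regex=False)]

print(exact.Team.tolist())      # ['Finance', 'Finly', 'HR']
print(substring.Team.tolist())  # ['HR']
```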
Upvotes: 2
Reputation: 4638
import pandas as pd
df = pd.DataFrame(dict(A=[1,2,3,4], C=["abc","def","abcdef", "lmn"]))
df:
A C
0 1 abc
1 2 def
2 3 abcdef
3 4 lmn
df[df.C.str.contains("abc") == False]
OR as suggested by @RafaelC
df[~df.C.str.contains("abc")]
Output:
A C
1 2 def
3 4 lmn
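If the column can hold missing values, the mask returned by str.contains may itself contain a missing value (depending on the pandas version and dtype), and negating or indexing with it can then fail. A hedged sketch, assuming a hypothetical NaN in column C, that passes na=False so missing cells count as "does not contain":

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in column C.
df = pd.DataFrame({"A": [1, 2, 3, 4], "C": ["abc", "def", np.nan, "lmn"]})

# na=False treats missing values as non-matches, so the mask is purely boolean.
mask = df.C.str.contains("abc", na=False)
result = df[~mask]

print(result.A.tolist())  # [2, 3, 4]
```

Whether rows with missing values should be kept or dropped is a design choice; na=True would drop them instead.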
Upvotes: 11
Reputation: 323376
You can use startswith:
df[~df.Team.str.startswith('Fin')]
Or
df[~df.Team.str.contains('Fin')]
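The two calls differ when Fin appears somewhere other than the start of the string. A small sketch with hypothetical team names (PreFin is made up for illustration):

```python
import pandas as pd

# Hypothetical team names chosen to show where the two methods disagree.
df = pd.DataFrame({"Team": ["Finance", "Finly", "PreFin", "HR"]})

# startswith only removes rows whose value begins with "Fin".
no_prefix = df[~df.Team.str.startswith("Fin")]

# contains removes rows with "Fin" anywhere in the value.
no_substring = df[~df.Team.str.contains("Fin")]

print(no_prefix.Team.tolist())     # ['PreFin', 'HR']
print(no_substring.Team.tolist())  # ['HR']
```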
Upvotes: 19
Reputation: 57105
Regular expressions are one way to do this. Here's a synthetic dataframe:
df = pd.DataFrame({'Team': ['Finance', 'Finally', 'Foo']})
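Here's a dataframe that does not (~) have any Fin's. Note that str.match anchors the pattern at the start of each string, so the regex Fin is enough (Fin* would mean Fi followed by zero or more n, and would also match Fix):
df[~df.Team.str.match('Fin')]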
# Team
#2 Foo
If you are sure that a string of interest always starts with Fin, you can use a "softer" method:
df[~df.Team.str.startswith('Fin')]
# Team
#2 Foo
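The question asks about dropping rows that contain any of several pieces of text or numbers. One possible sketch, assuming a hypothetical list named unwanted and made-up sample rows, joins the escaped items into a single regex alternation:

```python
import re

import pandas as pd

# Hypothetical data and search terms, for illustration only.
df = pd.DataFrame({"Team": ["Finance", "Finally", "Foo", "Ops 42", "HR"]})
unwanted = ["Fin", "42"]

# re.escape keeps characters such as * or . literal inside the regex.
pattern = "|".join(re.escape(s) for s in unwanted)

# Keep only the rows whose Team contains none of the unwanted items.
kept = df[~df.Team.str.contains(pattern, regex=True)]

print(kept.Team.tolist())  # ['Foo', 'HR']
```

If the column holds actual numbers rather than strings, converting it first with astype(str) is one way to make the .str accessor usable.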
Upvotes: 5