Colonel Beauvel
Colonel Beauvel

Reputation: 31161

Remove lines in dataframe using a list in Pandas

It's a generic question about filtering a pandas dataframe using a list. The problem is the following:

For the moment, to retrieve the dataframe without the banned field, I proceed as follows:

for f in ban_field:
    df = df[df.field!=f]

Is there a more pythonic way to proceed (in one line?)?

Upvotes: 5

Views: 3329

Answers (2)

DSM
DSM

Reputation: 352959

Method #1: use isin and a boolean array selector:

In [47]: df = pd.DataFrame({"a": [2]*10, "field": range(10)})

In [48]: ban_field = [3,4,6,7,8]

In [49]: df[~df.field.isin(ban_field)]
Out[49]: 
   a  field
0  2      0
1  2      1
2  2      2
5  2      5
9  2      9

[5 rows x 2 columns]

Method #2: use query:

In [51]: df.query("field not in @ban_field")
Out[51]: 
   a  field
0  2      0
1  2      1
2  2      2
5  2      5
9  2      9

[5 rows x 2 columns]

Upvotes: 9

Shashank Agarwal
Shashank Agarwal

Reputation: 2804

You can remove it by using the isin function and the negation (~) operator.

df[~df.field.isin(ban_field)]

Upvotes: 1

Related Questions