Pandas: slice Dataframe according to values of a column

Question

I have to slice my DataFrame according to values (imported from a txt file) that occur in one of my DataFrame's columns. This is what I have:

>df
col1 col2
 a    1
 b    2
 c    3
 d    4

>'mytxt.txt'
2
3

This is what I need: drop rows whenever value in col2 is not among values in mytxt.txt

Expected result must be:

>df
col1 col2
 b    2
 c    3

I tried:

values = pd.read_csv('mytxt.txt', header=None)
df = df.col2.isin(values)

But it doesn't work. Help would be much appreciated, thanks!

TayTay · Accepted Answer

When you read values, I would do it as a Series, and then convert it to a set, which will be more efficient for lookups:

# squeeze=True was removed from read_csv in pandas 2.0; squeeze the result instead
values = pd.read_csv('mytxt.txt', header=None).squeeze('columns')
values = set(values.tolist())

Then slicing will work:

>>> df[df.col2.isin(values)]
  col1  col2
1    b     2
2    c     3
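
For anyone who wants to try this without creating the file, here is a minimal end-to-end sketch. It assumes mytxt.txt holds one integer per line with no header, and it uses an in-memory io.StringIO buffer as a stand-in for the file; the names `txt`, `allowed` and `result` are only illustrative.

```python
import io

import pandas as pd

# Sample frame from the question
df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [1, 2, 3, 4]})

# Stand-in for 'mytxt.txt' (assumption: one integer per line, no header)
txt = io.StringIO("2\n3\n")

# Read the single column, turn it into a Series, then into a set of Python ints
values = pd.read_csv(txt, header=None).squeeze("columns")
allowed = set(values.tolist())

# Keep only the rows whose col2 value is in the allowed set
result = df[df["col2"].isin(allowed)]
print(result)
```

With a real file, passing the path 'mytxt.txt' to pd.read_csv in place of `txt` should behave the same way.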

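What was happening is that you were reading values in as a DataFrame rather than a Series, so .isin was not comparing col2 against the numbers you expected. Your attempt also assigned the boolean mask returned by .isin back to df; the mask has to be used to index df, as above, to get the filtered rows.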
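
To make the distinction concrete, here is a small sketch under the same assumptions as the question's data (the file contents are simulated with io.StringIO, and `raw`, `column`, `mask` and `filtered` are illustrative names). It shows that read_csv returns a DataFrame, that selecting its only column gives a Series, and that .isin returns a boolean mask which still has to be used to index the frame.

```python
import io

import pandas as pd

df = pd.DataFrame({"col1": list("abcd"), "col2": [1, 2, 3, 4]})

# read_csv returns a DataFrame, even when the file has a single column
raw = pd.read_csv(io.StringIO("2\n3\n"), header=None)
print(type(raw).__name__)

# With header=None the only column is labelled 0; selecting it gives a Series
column = raw[0]

# .isin returns a boolean Series (a mask), not the filtered rows
mask = df["col2"].isin(column)
print(mask.tolist())

# Index the frame with the mask to get the rows you want
filtered = df[mask]
print(filtered)
```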
