Subset a data frame according to pattern matches in a file.txt pandas

Question

I have a data frame such as:

query   subject col1
A   dog ok
B   cat okl
C   cat oklp
D   frog    ok
E   cat ok
F   fox ok

and a file.txt such as:

dog
cat

The idea is to keep only the rows whose subject matches a pattern listed in file.txt. Here I should get:

query   subject col1
A   dog ok
B   cat okl
C   cat oklp
E   cat ok

I tried:

file = open('file.txt').read()

df=[]
for row in tab['subject']:
 if row in file: 
   row.append(df)

but it does not seem to be the right approach. Thank you for your help.

Daniel Labbe · Accepted Answer

Assuming your data frame is called df, you can read file.txt into a second data frame and merge the two on the subject column. This works like an SQL inner join and gives the desired result:

>> import pandas as pd
>> df2 = pd.read_csv('file.txt', header=None, names=['subject'])
>> pd.merge(df, df2, on='subject')

    query   subject col1
0   A       dog     ok
1   B       cat     okl
2   C       cat     oklp
3   E       cat     ok
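
An alternative that is not part of the original answer: if file.txt holds one exact value per line, a boolean mask built with Series.isin should give the same rows without a merge. The sketch below is a minimal example under that assumption. The sample frame is rebuilt from the question, and an in-memory io.StringIO buffer (named patterns_file here for illustration) stands in for open('file.txt') so the snippet runs on its own.

```python
import io

import pandas as pd

# Sample data mirroring the data frame in the question.
df = pd.DataFrame({
    "query": list("ABCDEF"),
    "subject": ["dog", "cat", "cat", "frog", "cat", "fox"],
    "col1": ["ok", "okl", "oklp", "ok", "ok", "ok"],
})

# Stand-in for open('file.txt'); assumed to contain one pattern per line.
patterns_file = io.StringIO("dog\ncat\n")

# Strip whitespace and skip blank lines.
patterns = [line.strip() for line in patterns_file if line.strip()]

# Keep only the rows whose 'subject' is exactly one of the patterns.
subset = df[df["subject"].isin(patterns)]
print(subset)
```

Unlike the merge, this keeps the original index labels (0, 1, 2, 4) and does not duplicate rows if file.txt happens to list the same value twice.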
