khushbu

Reputation: 579

How to check if a dataframe contains a string in python?

My report.txt contains

I am a student from Tribhuwan university

whereas my dictionary.txt contains the list of words and synonyms shown further down.

I am trying to check whether each word in the dictionary appears in the report. However, with the code below, v is True only for the first element. I expected v to be True in every case, since all the words in the dictionary occur in the report. Please help me figure out what I am doing wrong.

My dictionary.txt contains this

words, synonyms
I, me
student, pupil
tribhuwan,
university, school

import pandas as pd

report = pd.read_csv("report.txt", header=None)
dict = pd.read_csv("dictionary.txt")

for report in report[0]:
    v = []
    for word in dict['words']:
        if word in report[0]:
            v.append(True)
        else:
            v.append(False)

Upvotes: 1

Views: 166

Answers (1)

jpp

Reputation: 164843

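You don't need an explicit for loop here. With Pandas, you can split the report line into words, add an extra column to the dictionary dataframe, and fill it using pd.Series.isin. You probably also need to make all letters lowercase on both sides of the comparison, since "Tribhuwan" in the report and "tribhuwan" in the dictionary would otherwise not match. Finally, never shadow a built-in, i.e. don't use dict as a variable name.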

Here's a demo:

import pandas as pd
from io import StringIO

report = StringIO("""I am a student from Tribhuwan university
""")

dictionary = StringIO("""words, synonyms
I, me
student, pupil
tribhuwan,
university, school""")

df_report = pd.read_csv(report, header=None)
df_dict = pd.read_csv(dictionary)

words = df_report[0].str.lower().iat[0].split()

df_dict['check'] = df_dict['words'].str.lower().isin(words)

print(df_dict)

        words  synonyms  check
0           I        me   True
1     student     pupil   True
2   tribhuwan       NaN   True
3  university    school   True
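
As for why the loop in the question gives True only for the first word: as far as I can tell, the loop variable report replaces the dataframe with the line of text itself, so report[0] inside the loop is just the first character of that line. Below is a minimal sketch of that effect in plain Python, assuming the file contents shown in the question. The names line, words, buggy and fixed are mine, purely for illustration.

```python
# Contents assumed from the question: one report line and the 'words' column.
line = "I am a student from Tribhuwan university"
words = ["I", "student", "tribhuwan", "university"]

# In the question, `for report in report[0]` rebinds `report` to the line,
# so `report[0]` becomes the first character of the line, i.e. "I".
first_char = line[0]
buggy = [w in first_char for w in words]

# Comparing lowercase whole words gives the intended membership test.
tokens = set(line.lower().split())
fixed = [w.lower() in tokens for w in words]

print(buggy)  # [True, False, False, False]
print(fixed)  # [True, True, True, True]
```

Note that str.split() only splits on whitespace, so a report containing punctuation (e.g. "university.") would need extra cleaning before the comparison.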

Upvotes: 3
