Reputation: 579
My report.txt contains:
I am a student from Tribhuwan university
I am trying to check whether the words in the dictionary appear in the report. However, with the code below, v is True only for the first element. I expected v to be True in every case, since all the words in the dictionary are contained in the report. Please help me figure out what I am doing wrong.
My dictionary.txt contains:
words, synonyms
I, me
student, pupil
tribhuwan,
university, school
import pandas as pd
report = pd.read_csv("report.txt", header=None)
dict = pd.read_csv("dictionary.txt")
for report in report[0]:
    v = []
    for word in dict['words']:
        if word in report[0]:
            v.append(True)
        else:
            v.append(False)
Upvotes: 1
Views: 166
Reputation: 164843
You don't need an explicit for loop here. With Pandas, you can add an extra column to a dataframe and use pd.Series.isin. You probably also need to make all letters lowercase for comparison. Finally, never shadow a built-in, i.e. don't use dict for a variable name.
Here's a demo:
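import pandas as pd
from io import StringIO

# In-memory stand-ins for report.txt and dictionary.txt
report = StringIO("""I am a student from Tribhuwan university
""")
dictionary = StringIO("""words, synonyms
I, me
student, pupil
tribhuwan,
university, school""")

df_report = pd.read_csv(report, header=None)
df_dict = pd.read_csv(dictionary)

# Lowercased words from the first (and only) line of the report
words = df_report[0].str.lower().iat[0].split()
# True where the lowercased dictionary word occurs among the report words
df_dict['check'] = df_dict['words'].str.lower().isin(words)
print(df_dict)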
        words  synonyms  check
0           I        me   True
1     student     pupil   True
2   tribhuwan       NaN   True
3  university    school   True
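As for why the loop in the question gives True only for the first word, here is a minimal sketch, assuming report.txt holds just the single line shown there. Inside `for report in report[0]`, the name `report` is rebound to that line as a string, so `report[0]` is its first character, "I". The test `word in report[0]` is then a substring check against a one-character string, which only the word "I" passes. The names `line`, `dict_words`, `buggy`, `report_words` and `fixed` are illustrative and do not come from the original code.

```python
# The line from report.txt and the 'words' column from dictionary.txt,
# inlined so the sketch runs without the files.
line = "I am a student from Tribhuwan university"
dict_words = ["I", "student", "tribhuwan", "university"]

# What the original loop effectively computes: line[0] is the character "I",
# so `word in line[0]` is a substring test against a one-character string.
buggy = [word in line[0] for word in dict_words]

# Compare against the lowercased, whitespace-split words of the line instead.
report_words = set(line.lower().split())
fixed = [word.lower() in report_words for word in dict_words]

print(buggy)  # [True, False, False, False]
print(fixed)  # [True, True, True, True]
```

The same idea, lowercasing both sides and testing membership in the split words, is what the `isin` call above does in one step.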
Upvotes: 3