Reputation: 5
I have an Excel sheet with 2 columns: the 1st is ingredients and the 2nd is tag. I also have a sentence. I want to compare each value in the ingredients column against the words in the sentence and, if a value matches, add it to a new list. My code is:
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
import pandas as pd

setence = "I like carrot Apple wine"
word = word_tokenize(setence.lower())
l = []
data = pd.read_excel('items.xlsx')
print(data["ingredients"])
for item in data["ingredients"]:
    if item in word:
        print(item)
        l.append(item)
print(l)
The Excel sheet contains these values:
0 apple
1 wine
2 carrot
3 egg
4 CUP
The output of the code is:
['apple', 'carrot']
Why didn't it match wine?
Upvotes: 0
Views: 380
Reputation: 1083
When I ran your code with the same sample data, I noticed that 'wine ' in data['ingredients'] has trailing whitespace, while 'wine' in word does not, so the two strings are not equal.
To avoid this, I suggest calling strip() on each item before the comparison: if item.strip() in word:
Code:
#...
for item in data["ingredients"]:
    if item.strip() in word:
        print(item)
#...
Output:
apple
wine
carrot
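As a slightly more defensive variant, here is a minimal sketch that normalizes both whitespace and case before comparing. It is an illustration rather than your exact setup: match_ingredients is a hypothetical helper name, a plain list stands in for data["ingredients"], and str.split() replaces word_tokenize so the snippet runs without NLTK. It also skips non-string cells, on the assumption that empty Excel cells may be read as NaN.

```python
def match_ingredients(ingredients, words):
    """Return ingredients found in words, ignoring case and surrounding whitespace."""
    word_set = set(words)  # set lookup instead of scanning the list each time
    matched = []
    for item in ingredients:
        if not isinstance(item, str):
            continue  # assumption: empty cells arrive as NaN (a float), so skip them
        cleaned = item.strip().lower()
        if cleaned in word_set:
            matched.append(cleaned)
    return matched


# Stand-in for data["ingredients"]; note the trailing space in "wine ".
ingredients = ["apple", "wine ", "carrot", "egg", "CUP"]
# str.split() is used here instead of word_tokenize to avoid the NLTK dependency.
words = "I like carrot Apple wine".lower().split()

print(match_ingredients(ingredients, words))  # ['apple', 'wine', 'carrot']
```

With your real data you should be able to pass data["ingredients"] and the output of word_tokenize directly, since the helper only iterates over its arguments. Lowercasing the ingredient also means an entry like 'CUP' would match "cup" in a lowercased sentence, which the original comparison would miss.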
Upvotes: 1