Reputation: 591
As training data, I have restaurant reviews in XML. Each opinion is annotated with the target expression the sentiment is directed at, a category (a discrete label the target belongs to), and the polarity expressed toward the target:
<text>With the great variety on the menu , I eat here often and never get bored .</text>
<Opinions>
<Opinion target="menu" category="FOOD#STYLE_OPTIONS" polarity="positive" from="30" to="34"/>
</Opinions>
I have used the TextBlob NB classifier to learn a mapping from target terms to their associated categories.
For test data, my aim is to predict the target expression, given a sentence and the category. I have first extracted nouns and noun phrases from the sentence, assuming the expression will be a subset of these. For the sentence:
"what may be interesting to most is the worst sevice attitude come from the owner of this establishment", these are ['sevice attitude', 'owner', 'establishment'].
I would like to know which of these is most likely given the category, which in this case is SERVICE#GENERAL. How could I go about this?
Upvotes: 1
Views: 409
Reputation: 964
By default, TextBlob's NB classifier extracts text features as a bag of words. So you can simply join the extracted nouns and noun phrases into one string, append the category, and use the result as the training text, with the target as the training label.
Because a bag of words treats each word independently, you should turn each multi-word noun phrase into a single token. For example, you can put a '-' in place of the space ('sevice attitude' would become 'sevice-attitude').
Example:
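from textblob.classifiers import NaiveBayesClassifier

# Each training item is (text, label):
#   text  = hyphenated candidate noun phrases followed by the category
#   label = the annotated target expression
train = [('sevice-attitude owner establishment SERVICE#GENERAL', 'owner'),
         ('menu variety FOOD#STYLE_OPTIONS', 'menu')]
cl = NaiveBayesClassifier(train)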
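To avoid writing the training strings by hand, they could be built from the extracted candidates with a small helper. This is only a sketch: `build_text` is a hypothetical name, not part of TextBlob, and it assumes the candidates arrive as a list of strings like the one in the question.

```python
def build_text(noun_phrases, category):
    """Join candidate noun phrases and the category into one string."""
    # Hyphenate multi-word phrases so each phrase stays a single token.
    tokens = ["-".join(phrase.split()) for phrase in noun_phrases]
    return " ".join(tokens + [category])


candidates = ["sevice attitude", "owner", "establishment"]
text = build_text(candidates, "SERVICE#GENERAL")
print(text)
```

The same helper can be used for test sentences: build the string from the test candidates and category, then pass it to `cl.classify(text)` to get the predicted target.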
If you want, you can also customize the feature extraction: https://textblob.readthedocs.io/en/dev/classifiers.html#feature-extractors
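As one possible customization, a feature extractor could split on whitespace only, so that hyphenated phrases and category labels are kept as whole tokens. I believe the default extractor tokenizes with NLTK, which may break a label such as SERVICE#GENERAL at the '#', but treat that as an assumption to check. `whitespace_extractor` is a hypothetical name; the sketch below is plain Python and does not import TextBlob.

```python
def whitespace_extractor(document):
    """Return one boolean feature per whitespace-separated token."""
    return {"contains({0})".format(token): True for token in document.split()}


features = whitespace_extractor(
    "sevice-attitude owner establishment SERVICE#GENERAL"
)
print(features)

# With TextBlob installed, the extractor would be passed like this:
#   cl = NaiveBayesClassifier(train, feature_extractor=whitespace_extractor)
```

The single-argument form follows the pattern shown in the TextBlob documentation linked above.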
Upvotes: 0