Reputation: 181
I am working on a sentiment analysis tool using SentiWordNet and the Apache NLP library. The problem is that when I tag a sentence with the NLP library, I get output such as:
Test_NNP Tweet_NNP is_VBZ ready_JJ now_RB for_IN the_DT change._NN
but SentiWordNet uses POS tags like a, v, n, etc. How do I convert NNP, VBZ, JJ to n, v, or a with Apache NLP?
Should I use a different library for tagging instead?
Upvotes: 2
Views: 982
Reputation: 621
The tags you are getting from Apache NLP are Penn Treebank tags, so you have to convert them to SentiWordNet-compatible tags. The following function maps Treebank tags to WordNet part-of-speech names by looking at the first letter of the tag:
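def get_wordnet_pos(treebank_tag):
    # Penn Treebank tags are grouped by their first letter.
    if treebank_tag.startswith('J'):    # JJ, JJR, JJS -> adjective
        return 'a'
    elif treebank_tag.startswith('V'):  # VB, VBD, VBG, VBN, VBP, VBZ -> verb
        return 'v'
    elif treebank_tag.startswith('N'):  # NN, NNS, NNP, NNPS -> noun
        return 'n'
    elif treebank_tag.startswith('R'):  # RB, RBR, RBS -> adverb
        return 'r'
    else:
        # No WordNet equivalent (determiners, prepositions, ...).
        return ''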
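To show how this fits the tagged sentence from the question, here is a minimal sketch that splits each word_TAG token and applies the same prefix mapping. The helper name convert_tagged_sentence is my own, and I am assuming the tagger output is whitespace-separated tokens with the tag after the last underscore, as in the example above.

```python
def get_wordnet_pos(treebank_tag):
    """Map a Penn Treebank tag to a WordNet POS letter ('' if none applies)."""
    for prefix, wn_pos in (('J', 'a'), ('V', 'v'), ('N', 'n'), ('R', 'r')):
        if treebank_tag.startswith(prefix):
            return wn_pos
    return ''


def convert_tagged_sentence(tagged):
    """Turn 'word_TAG word_TAG ...' into a list of (word, wordnet_pos) pairs.

    Hypothetical helper: assumes the tag follows the LAST underscore in
    each whitespace-separated token.
    """
    pairs = []
    for token in tagged.split():
        word, sep, tag = token.rpartition('_')
        if not sep:
            # Token carries no tag at all; keep the word, leave POS empty.
            word, tag = token, ''
        pairs.append((word, get_wordnet_pos(tag)))
    return pairs


tagged = "Test_NNP Tweet_NNP is_VBZ ready_JJ now_RB for_IN the_DT change._NN"
result = convert_tagged_sentence(tagged)
print(result)
```

Words that come back with an empty tag (here "for" and "the") have no WordNet part of speech, so you would typically skip them when looking up sentiment scores. The same first-letter check is easy to port to Java with String.startsWith if you want to stay in the same codebase as the tagger.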
Upvotes: 1