Reputation: 147
What is the best way to calculate the distance between words in terms of semantic meaning? For example, assume we are searching for the word "fraud" in a document associated with two nouns, "PersonA" and "PersonB". The text is something like this:
......"PersonA".....fraud.............."PersonB".........................................................................."fraud"
The conclusion would be that the noun "PersonA" is more likely to be described by "fraud", since "fraud" is nearer to "PersonA" than to "PersonB". Is there a good algorithm or statistical model to measure this for text mining?
Upvotes: 1
Views: 211
Reputation: 4749
First of all, it seems that the measure you're trying to obtain isn't an ordinary semantic distance or semantic similarity. It is more likely an association measure.
So, if you have a lot of occurrences of the words to be processed, look at PMI (pointwise mutual information) or other distributional similarity measures (e.g. the 8 week lectures of the Natural Language Processing course).
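As a rough illustration of PMI, here is a minimal sketch that estimates it from sentence-level co-occurrence counts. The function name `pmi`, the toy sentences, and the whitespace tokenization are all assumptions made for this example; a real corpus would need proper tokenization and far more data for the estimates to mean anything.

```python
import math

def pmi(sentences, x, y):
    """PMI of words x and y, treating each sentence as one co-occurrence unit.

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), with probabilities
    estimated as the fraction of sentences containing the word(s).
    """
    n = len(sentences)
    # Naive lowercase whitespace tokenization (an assumption of this sketch).
    token_sets = [set(s.lower().split()) for s in sentences]
    n_x = sum(x in t for t in token_sets)
    n_y = sum(y in t for t in token_sets)
    n_xy = sum(x in t and y in t for t in token_sets)
    if n_xy == 0:
        # The words never co-occur, so the ratio is zero and the log diverges.
        return float("-inf")
    return math.log2(n_xy * n / (n_x * n_y))

# Toy corpus, invented purely for illustration.
sentences = [
    "persona was charged with fraud",
    "the fraud was traced to persona",
    "personb attended the meeting",
    "personb signed the report",
]

score_a = pmi(sentences, "persona", "fraud")
score_b = pmi(sentences, "personb", "fraud")
```

A higher PMI means the two words appear together more often than their individual frequencies would predict, so in this toy corpus "persona" is the one associated with "fraud".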
If you have only a few occurrences, then I'd suggest performing syntactic parsing and measuring the ordinary path distance in the parse tree.
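To make the parse-tree idea concrete, here is a small sketch that counts the edges on the path between two tokens in a dependency tree. The tree is written by hand as a list of head indices, which is an assumption of this example; in practice it would come from whatever parser you use. The sentence and the helper names `path_to_root` and `tree_distance` are hypothetical as well.

```python
def path_to_root(heads, i):
    """Return the token indices from token i up to the root (head == -1)."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path

def tree_distance(heads, a, b):
    """Number of edges on the path between tokens a and b in the tree."""
    # Steps needed to reach each ancestor of a (including a itself).
    steps_from_a = {node: d for d, node in enumerate(path_to_root(heads, a))}
    # Walk up from b until we hit the first ancestor shared with a.
    for steps_from_b, node in enumerate(path_to_root(heads, b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("tokens are not in the same tree")

# Hand-written dependency parse (an assumption, not parser output):
# "PersonA committed fraud while PersonB watched"
tokens = ["PersonA", "committed", "fraud", "while", "PersonB", "watched"]
# heads[i] is the index of token i's head; -1 marks the root ("committed").
heads = [1, -1, 1, 5, 5, 1]

fraud = tokens.index("fraud")
dist_a = tree_distance(heads, fraud, tokens.index("PersonA"))
dist_b = tree_distance(heads, fraud, tokens.index("PersonB"))
```

In this sentence "fraud" is two tokens away from both names in the raw text, but in the tree it is 2 edges from "PersonA" and 3 from "PersonB", which is why tree distance can separate cases that plain word offsets cannot.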
Upvotes: 4