sset

Reputation: 147

NLP text distances

What is the best way to calculate the distance between words in terms of semantic meaning? For example, assume we are searching for the word "fraud" in a document associated with two nouns, "PersonA" and "PersonB". The text is something like this: ......"PersonA".....fraud.............."PersonB".........................................................................."fraud" The conclusion would be that the noun "PersonA" is more likely to be described by "fraud", since "fraud" is nearer to "PersonA" than to "PersonB". Is there a good algorithm or statistical model to measure this for text mining?

Upvotes: 1

Views: 211

Answers (1)

Nikita Astrakhantsev

Reputation: 4749

First of all, it seems that the measure you're trying to obtain isn't an ordinary 'semantic meaning' distance, i.e. semantic similarity. It's more likely an association measure.

So, if you have a lot of occurrences of the words to be processed, then look at PMI (pointwise mutual information) or other distributional similarity measures (see e.g. the week 8 lectures of the Natural Language Processing course).
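As a rough illustration of the PMI idea, here is a minimal sketch that estimates PMI from sentence-level co-occurrence counts. The function names (`pmi_table`, `pmi`) and the toy sentences are made up for this example, and counting co-occurrence per sentence is only one of several possible choices (a sliding token window is another).

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(sentences):
    """Return {(w1, w2): PMI} for every word pair that shares a sentence.

    Probabilities are estimated as the fraction of sentences that contain
    the word (or both words). This is an assumption of the sketch, not the
    only way to estimate them.
    """
    n = len(sentences)
    word_df = Counter()  # number of sentences containing each word
    pair_df = Counter()  # number of sentences containing both words
    for tokens in sentences:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return {
        (a, b): math.log2((c / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), c in pair_df.items()
    }

def pmi(table, a, b):
    """Look up PMI regardless of argument order; -inf if never co-occurring."""
    return table.get(tuple(sorted((a, b))), float("-inf"))

# Hypothetical, already tokenized and lowercased corpus.
sentences = [
    ["persona", "committed", "fraud"],
    ["persona", "fraud", "investigation"],
    ["personb", "attended", "meeting"],
    ["personb", "reported", "fraud"],
]
table = pmi_table(sentences)
print(round(pmi(table, "persona", "fraud"), 3))  # 0.415
print(round(pmi(table, "personb", "fraud"), 3))  # -0.585
```

On this toy corpus "fraud" is more strongly associated with "persona" than with "personb". With real data you would need many more occurrences for the estimates to be meaningful, which is why this route only makes sense for frequent words.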

If you have just a few occurrences, then I'd suggest running a syntactic parser and measuring the ordinary path distance between the words in the parse tree.
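The parse-tree distance can be sketched as follows. The dependency parse here is written by hand as a list of head indices purely for illustration; in practice it would come from whatever parser you use, and the sentence, the `heads` encoding, and the function names are all assumptions of this example.

```python
def path_to_root(heads, i):
    """Token indices from token i up to the root (whose head is -1)."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path

def tree_distance(heads, a, b):
    """Number of edges on the path between tokens a and b in the tree."""
    up_a = path_to_root(heads, a)
    up_b = path_to_root(heads, b)
    depth_in_a = {node: d for d, node in enumerate(up_a)}
    # The first ancestor of b that is also an ancestor of a is their
    # lowest common ancestor; the distance is the sum of both climbs.
    for d_b, node in enumerate(up_b):
        if node in depth_in_a:
            return depth_in_a[node] + d_b
    raise ValueError("tokens are not in the same tree")

# Hand-written (hypothetical) dependency parse of:
#   PersonA committed fraud while PersonB watched
tokens = ["PersonA", "committed", "fraud", "while", "PersonB", "watched"]
heads = [1, -1, 1, 5, 5, 1]  # heads[i] is the index of token i's head

fraud = tokens.index("fraud")
print(tree_distance(heads, fraud, tokens.index("PersonA")))  # 2
print(tree_distance(heads, fraud, tokens.index("PersonB")))  # 3
```

Note that in plain token positions "fraud" is equally far from both names (two tokens each way), while the tree distance attaches it more closely to "PersonA", which is the kind of distinction the parse-based measure is meant to capture.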

Upvotes: 4
