Benben

Reputation: 1455

How to detect compound words / multiple words as one term.

I am currently trying to detect nouns in texts, and I would like to treat compound words / multiword expressions as a single term. For example, I would like to detect "stock market" as one term, rather than "stock" and "market" separately.

If you know of any tools, related papers, and so on, please let me know.

Upvotes: 2

Views: 2925

Answers (1)

Pierre

Reputation: 1246

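What you are looking for are collocations: word combinations that co-occur more often than chance would predict. Hypothesis testing is a good way to start: you test whether two words appear together more often than they would if they were independent. It will also give you useful insight from a statistical point of view.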

Just follow the recipe here: http://nlp.stanford.edu/fsnlp/promo/colloc.pdf
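
To give a rough idea of what the hypothesis-testing approach looks like in practice, here is a minimal sketch of a t-test over adjacent word pairs. It assumes the text is already tokenized, uses the common approximation that the sample variance is about equal to the observed bigram probability, and the names `bigram_t_scores` and `min_count` are made up for this illustration. Treat it as a starting point rather than a reference implementation.

```python
import math
from collections import Counter


def bigram_t_scores(tokens, min_count=2):
    """Return a t-score for each adjacent word pair seen at least min_count times.

    Null hypothesis: the two words occur independently of each other.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), c12 in bigrams.items():
        if c12 < min_count:
            continue  # very rare pairs give unreliable scores
        x_bar = c12 / n  # observed probability of the pair
        mu = (unigrams[w1] / n) * (unigrams[w2] / n)  # expected under independence
        # Assumption: variance of the Bernoulli trial approximated by x_bar
        scores[(w1, w2)] = (x_bar - mu) / math.sqrt(x_bar / n)
    return scores


if __name__ == "__main__":
    text = ("the stock market fell as the stock market reacted "
            "to the news about the stock market")
    scores = bigram_t_scores(text.split())
    # Print candidate pairs from highest to lowest score
    for pair, t in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(pair, round(t, 3))
```

On a real corpus you would compare each score against a critical value (2.576 is the usual threshold for a significance level of 0.005) and keep only pairs above it. Since you only care about nouns, you would probably also filter the candidates by part-of-speech pattern (e.g. noun + noun, adjective + noun), otherwise pairs with function words such as "the stock" will score well too.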

There are also rule-based, symbolic approaches, which you should be able to find easily on your own.

Good luck.

Upvotes: 3
