Reputation: 734
Is there any algorithm or other way you can think of to determine the least important word to the meaning of a sentence? More generally, is there any way to assign a number to each word based on its importance in a sentence? By "importance" I mean that removing the word from the sentence would have either little effect on the meaning (low importance) or a large effect on the meaning (high importance).
Upvotes: 0
Views: 588
Reputation: 9081
This is a very vague question. From what I understand, you want to do something like keyword extraction.
POS tagging is a good start. It labels each word in a sentence with its part of speech (noun, verb, adjective, etc.); see POS tagging in NLTK. You can then write your own rules to extract just the parts of speech that interest you.
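To make the "write your own rules" idea concrete, here is a minimal sketch that turns POS tags into a per-word score. The weights and the function names `score_by_pos` and `least_important` are my own assumptions for illustration, not anything NLTK provides, and the input is hand-tagged in the `(word, tag)` format that `nltk.pos_tag` returns, so the snippet runs without NLTK installed.

```python
# Hypothetical weights per Penn Treebank tag prefix; tune these for your data.
TAG_WEIGHTS = {
    "NN": 1.0,  # nouns
    "VB": 0.8,  # verbs
    "JJ": 0.6,  # adjectives
    "RB": 0.4,  # adverbs
}
DEFAULT_WEIGHT = 0.1  # everything else: determiners, prepositions, ...


def score_by_pos(tagged_tokens):
    """Return (word, score) pairs for a list of (word, tag) pairs."""
    scored = []
    for word, tag in tagged_tokens:
        weight = DEFAULT_WEIGHT
        for prefix, prefix_weight in TAG_WEIGHTS.items():
            if tag.startswith(prefix):
                weight = prefix_weight
                break
        scored.append((word, weight))
    return scored


def least_important(tagged_tokens):
    """Return the first word that has the lowest score."""
    return min(score_by_pos(tagged_tokens), key=lambda pair: pair[1])[0]


# Hand-tagged example sentence (tags assigned by hand, not by a tagger).
tagged = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("high", "RB")]

for word, score in score_by_pos(tagged):
    print(word, score)
print("least important:", least_important(tagged))
```

With real text you would get `tagged` from `nltk.pos_tag(nltk.word_tokenize(sentence))`, which needs the NLTK tokenizer and tagger data to be downloaded first. This is only a crude proxy for importance: a negation like "not" is tagged as an adverb and would score low even though removing it flips the meaning.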
Stopword removal is another option.
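A rough sketch of stopword removal used as a binary importance score follows. The stopword set here is a tiny hand-picked list that I made up for the example; a real list, such as the one in NLTK's stopwords corpus, is much longer. The function names are hypothetical as well.

```python
# Tiny, hand-picked stopword list for illustration only (an assumption,
# not a standard list).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "it"}


def remove_stopwords(sentence):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [token for token in sentence.lower().split()
            if token not in STOPWORDS]


def importance(sentence):
    """Score each word 0 if it is a stopword, 1 otherwise."""
    return [(token, 0 if token.lower() in STOPWORDS else 1)
            for token in sentence.split()]


print(remove_stopwords("The cat is on the mat"))
print(importance("The cat is on the mat"))
```

Splitting on whitespace leaves punctuation attached to words, so a proper tokenizer would be the first thing to swap in for real input.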
Keyword extraction covers a number of techniques that you can read about, with examples:
chunking
chinking
named entity recognition
building CFGs and parse trees
relation extraction
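To give a feel for the first item in the list, here is a simplified stand-in for noun-phrase chunking written in plain Python. It is not NLTK's chunker: it just groups consecutive determiner, adjective, and noun tags and keeps the groups that contain a noun. The function name `np_chunks` and the tagged sentence are assumptions made up for the example.

```python
def np_chunks(tagged_tokens):
    """Group runs of DT/JJ*/NN* tags into candidate noun phrases."""
    runs, current = [], []
    for word, tag in tagged_tokens:
        if tag == "DT" or tag.startswith("JJ") or tag.startswith("NN"):
            current.append((word, tag))
        elif current:
            # Any other tag closes the run collected so far.
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    # Keep only runs that actually contain a noun.
    return [" ".join(word for word, _ in run)
            for run in runs
            if any(tag.startswith("NN") for _, tag in run)]


# Hand-tagged example in the (word, tag) format that nltk.pos_tag returns.
tagged = [("The", "DT"), ("little", "JJ"), ("dog", "NN"), ("barked", "VBD"),
          ("at", "IN"), ("the", "DT"), ("cat", "NN")]
print(np_chunks(tagged))
```

In NLTK itself you would more likely describe the chunk pattern as a tag grammar and pass it to `nltk.RegexpParser`. The words that fall outside every chunk are reasonable candidates for "less important" in the sense of the question.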
I think reading this chapter will give you the perspective and the code snippets you need to get started.
Upvotes: 2