Reputation: 1999
I want to parse a file that is between 300 and 2,000 words and create lists of words in groups of 1 to n words long. For example, if I had this file:
The fat cat sat on a mat.
The output for 1-2 would be:
# group of words, 1 word length
['The', 'fat', 'cat', 'sat', 'on', 'a', 'mat']
# group of words, 2 word length
[['The', 'fat'], ['fat', 'cat'], ['cat', 'sat'], ['sat', 'on'], ['on', 'a'], ['a', 'mat']]
I'm sure I could write some very inefficient code to do this, but I'm wondering if there is an NLP (or other) library that can do it for me.
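For reference, a minimal pure-Python sketch of the sliding-window approach this would take (word_groups is just an illustrative name, not a library function):
def word_groups(words, n):
    # every run of n consecutive words, as a list of lists
    return [words[i:i + n] for i in range(len(words) - n + 1)]

words = 'The fat cat sat on a mat.'.rstrip('.').split()
print(word_groups(words, 1))  # [['The'], ['fat'], ['cat'], ...]
print(word_groups(words, 2))  # [['The', 'fat'], ['fat', 'cat'], ...]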
Upvotes: 1
Views: 43
Reputation: 5802
In computational linguistics, we call them unigrams, bigrams, trigrams, etc., or n-grams in general. There is an ngrams() function in NLTK.
The first thing you need to do is tokenize, e.g., like this:
from nltk.tokenize import word_tokenize
words = word_tokenize('The fat cat sat on a mat.')
# ['The', 'fat', 'cat', 'sat', 'on', 'a', 'mat', '.']
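If the tokenizer models aren't installed yet, word_tokenize raises a LookupError; a one-time download fixes that (on newer NLTK versions the resource may be named 'punkt_tab' instead):
import nltk
nltk.download('punkt')  # one-time download of the Punkt tokenizer models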
Then you can get the n-grams like so:
from nltk import ngrams
bigrams = list(ngrams(words, 2))
which will give you
[('The', 'fat'), ('fat', 'cat'), ('cat', 'sat'), ('sat', 'on'), ('on', 'a'), ('a', 'mat'), ('mat', '.')]
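Since you want every group size from 1 up to n, NLTK also has everygrams() (in nltk.util), which produces all of them in one pass:
from nltk.util import everygrams

# every 1-gram and 2-gram of the token list, as tuples
grams = list(everygrams(words, min_len=1, max_len=2))
# contains ('The',), ('The', 'fat'), ('fat',), ('fat', 'cat'), ...
The grams come back as tuples; map them through list() if you need your exact list-of-lists format.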
Upvotes: 4