Reputation: 599
I'm using NLTK to find a word in a text. I need to save the result of the concordance function into a list. The question has already been asked here, but I cannot see the changes. I tried to find the type of the function's return value with:
type(text.concordance('myword'))
the result was :
<class 'NoneType'>
Upvotes: 2
Views: 3404
Reputation: 4371
The Text class now has a concordance_list function. For example:
from nltk.corpus import gutenberg
from nltk.text import Text
corpus = gutenberg.words('melville-moby_dick.txt')
text = Text(corpus)
con_list = text.concordance_list("monstrous")
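As a rough sketch of what can be done with the result: in recent NLTK releases (this assumes roughly 3.4 or later), each entry returned by concordance_list is a ConcordanceLine named tuple with fields such as left, query, right, offset and line. The toy token list below is an assumption made only so the example runs without downloading a corpus; with the Gutenberg text above, the same attribute access should apply.

```python
from nltk.text import Text

# Toy token list (an illustrative assumption, not the Gutenberg corpus),
# so the sketch runs without downloading any NLTK data.
tokens = "the monstrous whale met another Monstrous whale".split()
text = Text(tokens)

con_list = text.concordance_list("monstrous")

# Each entry is a ConcordanceLine named tuple; `line` holds the formatted
# string that concordance() would otherwise print.
lines_as_strings = [entry.line for entry in con_list]
offsets = [entry.offset for entry in con_list]    # token positions of the matches
queries = [entry.query for entry in con_list]     # matched tokens, original casing
```

The lookup is case-insensitive, so both "monstrous" and "Monstrous" are matched here.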
Upvotes: 3
Reputation: 2571
By inspecting the source of ConcordanceIndex, we can see that the results are printed to stdout. If redirecting stdout to a file is not an option, you have to reimplement ConcordanceIndex.print_concordance so that it returns the results rather than printing them to stdout.
Code:
def concordance(ci, word, width=75, lines=25):
    """
    Rewrite of nltk.text.ConcordanceIndex.print_concordance that returns results
    instead of printing them.
    See:
    http://www.nltk.org/api/nltk.html#nltk.text.ConcordanceIndex.print_concordance
    """
    half_width = (width - len(word) - 2) // 2
    context = width // 4  # approx number of words of context
    results = []
    offsets = ci.offsets(word)
    if offsets:
        lines = min(lines, len(offsets))
        for i in offsets:
            if lines <= 0:
                break
            # Clamp the start index so a match near the beginning of the
            # text does not produce a negative (wrap-around) slice start.
            left = (' ' * half_width +
                    ' '.join(ci._tokens[max(0, i - context):i]))
            right = ' '.join(ci._tokens[i + 1:i + context])
            # Trim both sides to the available display width.
            left = left[-half_width:]
            right = right[:half_width]
            results.append('%s %s %s' % (left, ci._tokens[i], right))
            lines -= 1
    return results
Usage:
from nltk.book import text1
from nltk.text import ConcordanceIndex
ci = ConcordanceIndex(text1.tokens)
results = concordance(ci, 'circumstances')
print(type(results))
<class 'list'>
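If redirecting stdout is acceptable after all, it does not have to go to a file: the standard library can capture the printed text in memory. The following is only a minimal sketch. capture_printed_lines and fake_concordance are hypothetical names introduced for illustration, and fake_concordance merely stands in for a function that prints its results, so the example runs without NLTK.

```python
import contextlib
import io


def capture_printed_lines(func, *args, **kwargs):
    """Call func and return whatever it printed to stdout as a list of lines."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue().splitlines()


def fake_concordance(word):
    # Hypothetical stand-in for a function that only prints its results
    # (as Text.concordance does); it is not part of NLTK.
    print("Displaying 2 of 2 matches:")
    print("first line with %s in it" % word)
    print("second line with %s in it" % word)


captured = capture_printed_lines(fake_concordance, "monstrous")
# Assumption: with NLTK installed, the helper could be used the same way,
# e.g. capture_printed_lines(text1.concordance, "circumstances").
```

The captured list includes every printed line, so a header line such as the match count would have to be dropped separately.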
Upvotes: 3
Reputation: 308
To use text concordance, you need to instantiate an NLTK Text() object and then call the concordance() method on that object:
import nltk.corpus
from nltk.text import Text
moby = Text(nltk.corpus.gutenberg.words('melville-moby_dick.txt'))
Here we instantiate a Text object on the text file melville-moby_dick.txt and then we are able to use the method:
moby.concordance("monster")
If you get a NoneType here, it seems to be because you did not create a Text object, so your variable text is None.
Upvotes: 0