NLTK Concordance not working

Question

I am trying to run the following code using NLTK's Text.concordance, but it is not giving any results. Can someone please tell me what I am doing wrong?

import nltk.corpus  
from nltk.text import Text 

sent = '''China is an emerging FinTech hotbed thanks to its expanding middle class, rapid digitization and electronic payments adoption. But a new report from Citi found that, while China may be the market to watch for FinTech investments, the U.S. continues to thrive at the top of the B2B FinTech mountain.
According to Digital Disruption — Revisited: What FinTech VC Investments Tells Us About A Changing Industry, Citi expects an influx in venture capital across the FinTech startup scape. But not all markets are created equal. China saw more than half of the world’s FinTech investments in the first nine months of 2016, the bank noted.'''

content = sent.decode('utf-8')  # otherwise it throws an error
textList = Text(content)
textList.concordance('FinTech')

I am getting the following output:

No matches

TIA for the help

user2390182 · Accepted Answer

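You must create the Text instance from a sequence of tokens (strings), not from a single string. If you pass a raw string, Text treats each character as a separate token, so a whole word like 'FinTech' can never match. Use a tokenizer from nltk.tokenize to split your text first: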

>>> t = nltk.tokenize.WhitespaceTokenizer()  # or any other tokenizer
>>> c = Text(t.tokenize(content))
>>> c.concordance(u'FinTech')
Displaying 6 of 6 matches:
                                    FinTech hotbed thanks to its expanding midd
hina may be the market to watch for FinTech investments, the U.S. continues to 
ues to thrive at the top of the B2B FinTech mountain. According to Digital Disr
igital Disruption — Revisited: What FinTech VC Investments Tells Us About A Cha
nflux in venture capital across the FinTech startup scape. But not all markets 
a saw more than half of the world’s FinTech investments in the first nine month
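
As a rough illustration of why the raw string gives "No matches", here is a plain-Python sketch (no NLTK needed). The variable names are made up for this example, and str.split() only stands in for a whitespace tokenizer; the point is just that iterating over a string yields characters, while a tokenized list yields words.

```python
# Illustration only: plain Python, no NLTK required.
# `text`, `char_tokens` and `word_tokens` are hypothetical names for this sketch.
text = "China is an emerging FinTech hotbed"

# Iterating over a str yields single characters. This is what a
# token-sequence consumer sees when it is handed a raw string.
char_tokens = list(text)

# Splitting on whitespace yields whole words, similar in spirit to
# what a whitespace tokenizer returns.
word_tokens = text.split()

print(char_tokens[:5])           # ['C', 'h', 'i', 'n', 'a']
print("FinTech" in char_tokens)  # False: no single character equals 'FinTech'
print("FinTech" in word_tokens)  # True: the word is one token
```

So the search word has to be compared against word-sized tokens, which is why tokenizing before building the Text makes the concordance work.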
