Reputation: 382
I'm relatively new to NLP and am working on tagging sentences that contain foreign words using NLTK's PerceptronTagger (in Python), but it keeps tagging the tokenized foreign word according to its position in the syntax rather than as 'FW'.
Does the whole sentence have to be in that language (with the appropriate language pickle file loaded) for the 'FW' tag to work, as the NLTK documentation suggests? Is there a way to detect a foreign word within an English sentence?
On the flip side of that coin, are foreign words that have been absorbed into English (e.g. entrepreneur, siesta, zeitgeist) tagged as English?
Upvotes: 3
Views: 428
Reputation: 428
In spaCy, the 'FW' tag means "foreign word". It may be the same in NLTK.
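As for detecting a foreign word inside an English sentence, one possible workaround is to post-process the tagger's output and relabel alphabetic tokens that are missing from an English word list. The following is only a rough sketch under assumptions, not something NLTK does for you: `retag_foreign` and `ENGLISH_VOCAB` are hypothetical names, the vocabulary is a toy set, and the tags are written by hand to stand in for tagger output. With NLTK installed, the tagged pairs could come from `nltk.pos_tag(tokens)` and the vocabulary from `nltk.corpus.words.words()`.

```python
def retag_foreign(tagged, english_vocab):
    """Return a copy of `tagged` where alphabetic tokens missing
    from `english_vocab` are relabelled 'FW'."""
    result = []
    for word, tag in tagged:
        is_word = word.isalpha()                   # ignore punctuation and numbers
        known = word.lower() in english_vocab      # case-insensitive lookup
        is_proper_noun = tag.startswith("NNP")     # names are rarely in word lists
        if is_word and not known and not is_proper_noun:
            result.append((word, "FW"))
        else:
            result.append((word, tag))
    return result


# Hypothetical toy vocabulary; a real one would be much larger.
# 'zeitgeist' is included to mimic a loanword already absorbed into English.
ENGLISH_VOCAB = {"the", "zeitgeist", "had", "a", "certain", "about", "it"}

# Hand-written (word, tag) pairs standing in for tagger output.
tagged = [
    ("The", "DT"), ("zeitgeist", "NN"), ("had", "VBD"), ("a", "DT"),
    ("certain", "JJ"), ("gemütlichkeit", "NN"), ("about", "IN"),
    ("it", "PRP"), (".", "."),
]

retagged = retag_foreign(tagged, ENGLISH_VOCAB)
print(retagged)
```

With this approach, whether a loanword such as "zeitgeist" counts as English depends entirely on the word list you pick. Inflected forms (plurals, past tenses) are often absent from plain word lists, so lemmatizing before the lookup would probably reduce false 'FW' tags.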
Upvotes: 0