Reputation: 71
I have to identify cities in a document (has only characters), I do not want to maintain an entire vocabulary as it is not a practical solution. I also do not have Azure text analytics api account.
I have already tried using Spacy, I did ner and identified geolocation and that output is passed to spellchecker() to train the model. But the issue with this is that ner requires sentences and my input has words.
I am relatively new to this field.
Upvotes: 3
Views: 704
Reputation: 6639
You can check out the geotext library.
text = "The capital of Belarus is Minsk. Minsk is not so far away from Kiev or Moscow. Russians and Belarussians are nice people."
from geotext import GeoText
places = GeoText(text)
print(places.cities)
Output:
['Minsk', 'Minsk', 'Kiev', 'Moscow']
wordList = ['London', 'cricket', 'biryani', 'Vilnius', 'Delhi']
for i in range(len(wordList)):
places = GeoText(wordList[i])
if places.cities:
print(places.cities)
Output:
['London']
['Vilnius']
['Delhi']
Upvotes: 4