Anindita Bhowmik

Reputation: 874

Unable to import city_database dataset from NLTK data in Anaconda Spyder Windows Platform

I downloaded the NLTK data sets using nltk.download() into a specific folder on the D drive, say D:\ABC\xyz. I set this path in the environment variable NLTK_DATA, and I also added it in the Python program using nltk.data.path.append("D:\ABC\xyz").

Now, when I include the statement from nltk.corpus import city_database, I get the error: cannot import name 'city_database'

The dataset is located at D:\ABC\xyz\nltk_data\corpora\city_database, but I am unable to find the right syntax to import it.

Upvotes: 1

Views: 902

Answers (1)

jaboja

Reputation: 2237

city_database is not a corpus but a database used by the nltk.sem.chat80 module, which is why it cannot be imported from nltk.corpus.

You can access it by calling nltk.sem.chat80.sql_demo(). It is just a demo, so it prints the database rows instead of returning them as a list. If you need to do anything more sophisticated, look at the docs at

http://www.nltk.org/api/nltk.sem.html?highlight=chat80#module-nltk.sem.chat80
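
One thing worth checking first: resource names such as corpora/city_database/city.db are looked up relative to each entry of nltk.data.path, so the entry has to be the folder that directly contains corpora. Going by the paths in the question, that would presumably be D:\ABC\xyz\nltk_data rather than D:\ABC\xyz. Here is a minimal standard-library sketch for checking this. The helper name find_data_root is hypothetical and not part of NLTK, and it only looks at plain directories.

```python
from pathlib import Path

# Resource name that nltk.sem.chat80 uses for the city database.
RESOURCE = "corpora/city_database/city.db"


def find_data_root(candidates, resource=RESOURCE):
    """Return the first candidate directory that directly contains `resource`.

    Hypothetical helper (not part of NLTK): it only checks plain
    directories and returns None when no candidate matches.
    """
    for candidate in candidates:
        root = Path(candidate)
        if (root / resource).is_file():
            return root
    return None
```

Calling find_data_root([r"D:\ABC\xyz", r"D:\ABC\xyz\nltk_data"]) returns whichever of the two folders holds the database file, and that folder is the one to put in NLTK_DATA or append to nltk.data.path. The raw strings (r"...") keep the backslashes in the Windows paths literal.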

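For instance, to get it as a dict, you can query the database this way:

import nltk.sem.chat80  # import the submodule explicitly

# Map each country to a city listed for it in the chat80 city database.
capitals = {
    country: city for city, country in nltk.sem.chat80.sql_query(
        "corpora/city_database/city.db",
        "SELECT City, Country FROM city_table"
    )
}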

In practice, though, it is more an example of how to use the library than a useful dataset, as it is small and outdated (for instance, Moscow is listed as the capital of the Soviet Union).
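
Since city.db is an SQLite file, the same kind of dictionary can also be built with the standard sqlite3 module. The sketch below is only an illustration under stated assumptions: it uses an in-memory stand-in table holding just the two columns queried above, with one row taken from the Moscow example and one made-up placeholder row, so it runs without the NLTK data. The function name city_country_map is hypothetical.

```python
import sqlite3


def city_country_map(connection):
    """Build a {country: city} dict from a table shaped like city_table."""
    rows = connection.execute("SELECT City, Country FROM city_table")
    # A country that appears in several rows keeps only its last city.
    return {country: city for city, country in rows}


# Stand-in for city.db: illustrative rows only, not the real contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city_table (City TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO city_table VALUES (?, ?)",
    [("moscow", "soviet_union"), ("example_city", "example_country")],
)

capitals = city_country_map(conn)
print(capitals["soviet_union"])  # prints: moscow
```

To run it against the real file, it should be enough to pass the full path of city.db to sqlite3.connect() instead of ":memory:", assuming the file is a regular SQLite database on disk.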

Upvotes: 2
