jmm

Reputation: 23

Conflicting default and custom paths for nltk_data

I understand that there should be two different directories for nltk_data: one for the default download and another one for the user's custom files.

On my macOS setup I have manually checked that all the default data packages are in /usr/local/share/nltk_data, and that is also what next(p for p in nltk.data.path if os.path.exists(p)) returns.

However, when I try to download another default package, it does not go to that directory but to /Users/macbook/nltk_data, where I understand only my custom files should be. Testing the installation with the default nltk.corpus.brown.words() also fails, because it looks for the corpus in my custom path: No such file or directory: '/Users/macbook/nltk_data/corpora/brown/ca01'

I am using Python 3.6.3 and conda 4.4.8, and the output of print(nltk.data.path) is:

['/Users/macbook/nltk_data', '/usr/share/nltk_data', '/usr/local/share/nltk_data', '/usr/lib/nltk_data', '/usr/local/lib/nltk_data', '/Users/macbook/anaconda3/nltk_data', '/Users/macbook/anaconda3/lib/nltk_data', '/usr/local/share/nltk_data']
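To see why a particular entry of that list wins, it can help to check each directory in search order. The sketch below uses only the standard library and takes the list as an argument; the helper names inspect_search_path and first_writable are made up for illustration. It assumes that the downloader's default target is the first entry that exists and is writable (with ~/nltk_data as the fallback), which is worth verifying against the installed NLTK version.

```python
import os


def inspect_search_path(paths):
    """Return (path, exists, writable) for each entry, in search order."""
    report = []
    for p in paths:
        exists = os.path.isdir(p)
        # Writability only makes sense for directories that exist.
        writable = exists and os.access(p, os.W_OK)
        report.append((p, exists, writable))
    return report


def first_writable(paths):
    """Return the first existing, writable directory, or None if there is none."""
    for p, _exists, writable in inspect_search_path(paths):
        if writable:
            return p
    return None
```

Calling inspect_search_path(nltk.data.path) on the list above would show whether /Users/macbook/nltk_data now exists and is writable while /usr/local/share/nltk_data is not writable for the current user, which would be consistent with the behaviour described.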

Upvotes: 2

Views: 1023

Answers (1)

scripter

Reputation: 356

You can download a package into a directory of your choice by passing download_dir:

nltk.download('treebank', download_dir='/home/username/data/treebank')

And you can tell nltk to look in a custom directory with this line:

nltk.data.path.append("path_to_custom_directory")
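Since nltk.data.path is an ordinary list that is searched in order, appending puts the custom directory last, so an incomplete copy earlier in the list (such as a partial corpora/brown under the home directory) may still be found first. A minimal sketch of a helper that moves a directory to the front instead; register_data_dir is a hypothetical name, not part of NLTK, and it works on any list of paths:

```python
import os


def register_data_dir(search_path, directory, first=True):
    """Add `directory` to a list of search paths without creating duplicates.

    With first=True the directory is moved to the front so it is searched
    before every other entry; otherwise it is placed at the end.
    """
    # Strip stray whitespace and normalise, so "dir " and "dir" are the same entry.
    directory = os.path.abspath(os.path.expanduser(directory.strip()))
    if directory in search_path:
        search_path.remove(directory)
    if first:
        search_path.insert(0, directory)
    else:
        search_path.append(directory)
    return search_path
```

With NLTK this would be used as register_data_dir(nltk.data.path, '/usr/local/share/nltk_data'), followed by nltk.download('brown', download_dir='/usr/local/share/nltk_data') if the package should land in the same place. The download needs write permission on that directory.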

Upvotes: 3
