Reputation: 11
I'm trying to load english.pickle for sentence tokenization on Windows 7 with Python 3.4.
The file exists at the path tokenizers/punkt/PY3/english.pickle.
Here is the code:
import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/PY3/english.pickle')
Here is the error:
OSError: No such file or directory: 'C:\\Python\\nltk_data\\tokenizers\\punkt\\PY3\\PY3\\english.pickle'
How can I fix this?
Upvotes: 1
Views: 1822
Reputation: 1677
The problem is that \\PY3 is doubled in your path.
The nltk.data.load() function adds /PY3 to the path when it is called from Python 3.
So it should work if you simply load the tokenizer without /PY3 in the string:
import nltk
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
NLTK does this so that the same program can run under both Python 2 and Python 3.
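To see how the doubled segment comes about, here is a minimal sketch of the kind of path rewriting described above. The helper add_py3_segment is hypothetical and written only for illustration; it is not NLTK's actual implementation, it just assumes the loader inserts PY3 before the file name.

```python
def add_py3_segment(resource, is_py3=True):
    """Hypothetical helper (not part of NLTK): insert 'PY3' before the file name."""
    if not is_py3:
        return resource
    # Split 'tokenizers/punkt/english.pickle' into directory and file name.
    directory, _, filename = resource.rpartition("/")
    return "/".join([directory, "PY3", filename])


# Passing a path that already contains PY3 yields the doubled segment from the error.
doubled = add_py3_segment("tokenizers/punkt/PY3/english.pickle")

# Passing the version-neutral path yields the location where the file actually lives.
correct = add_py3_segment("tokenizers/punkt/english.pickle")

print(doubled)
print(correct)
```

Under that assumption, the first call reproduces the PY3/PY3 path from the OSError, while the second resolves to the file that exists on disk.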
Upvotes: 5