Reputation: 159
I'm new with Python and NLTK When I test the following lines in the Python console
import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
text ="toto. titi. tutu"
tokens = tokenizer.tokenize(text)
print(tokens)
I get what I expect. But when I execute these lines from a file, for example with the command line > python tokenize.py
, I get errors:
C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\python.exe C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py
Traceback (most recent call last):
File "C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py", line 1, in <module>
import nltk.data
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\__init__.py", line 89, in <module>
from nltk.internals import config_java
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\internals.py", line 11, in <module>
import subprocess
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\subprocess.py", line 395, in <module>
import threading
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\threading.py", line 10, in <module>
from traceback import format_exc as _format_exc
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\traceback.py", line 3, in <module>
import linecache
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\linecache.py", line 10, in <module>
import tokenize
File "C:\Documents\Dvpt\SemanticAndOpenData\scholar\scholar.py\tokenize.py", line 2, in <module>
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\data.py", line 786, in load
resource_val = pickle.load(opened_resource)
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\__init__.py", line 63, in <module>
from nltk.tokenize.simple import (SpaceTokenizer, TabTokenizer, LineTokenizer,
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\simple.py", line 38, in <module>
from nltk.tokenize.api import TokenizerI, StringTokenizer
File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\api.py", line 13, in <module>
from nltk.internals import overridden
ImportError: cannot import name 'overridden'
Process finished with exit code 1
And I'm stuck on the problem and I can't find a way to solve it. Thanks in advance for any useful proposal.
Upvotes: 1
Views: 558
Reputation: 130
The problem here is that you have named your script as tokenize.py. Try renaming the file to something like my_tokenizer.py. Actually what is happening is that when you are using
import tokenize
What it is doing is trying to import the current file itself and thus you are getting the errors.
Upvotes: 1