Jean-Claude Moissinac

Reputation: 159

ImportError in a simple NLTK example

I'm new to Python and NLTK. When I run the following lines in the Python console:

import nltk.data
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
text = "toto. titi. tutu"
tokens = tokenizer.tokenize(text)
print(tokens)

I get what I expect. But when I run the same lines from a file named tokenize.py, for example with the command python tokenize.py, I get errors:

C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\python.exe C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py
Traceback (most recent call last):
  File "C:/Documents/Dvpt/SemanticAndOpenData/scholar/scholar.py/tokenize.py", line 1, in <module>
    import nltk.data
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\__init__.py", line 89, in <module>
    from nltk.internals import config_java
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\internals.py", line 11, in <module>
    import subprocess
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\subprocess.py", line 395, in <module>
    import threading
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\threading.py", line 10, in <module>
    from traceback import format_exc as _format_exc
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\traceback.py", line 3, in <module>
    import linecache
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\linecache.py", line 10, in <module>
    import tokenize
  File "C:\Documents\Dvpt\SemanticAndOpenData\scholar\scholar.py\tokenize.py", line 2, in <module>
    tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\data.py", line 786, in load
    resource_val = pickle.load(opened_resource)
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\__init__.py", line 63, in <module>
    from nltk.tokenize.simple   import (SpaceTokenizer, TabTokenizer, LineTokenizer,
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\simple.py", line 38, in <module>
    from nltk.tokenize.api import TokenizerI, StringTokenizer
  File "C:\outils\Python\WinPython-64bit-3.4.3.5\python-3.4.3.amd64\lib\site-packages\nltk\tokenize\api.py", line 13, in <module>
    from nltk.internals import overridden
ImportError: cannot import name 'overridden'
Process finished with exit code 1

I'm stuck on this problem and can't find a way to solve it. Thanks in advance for any suggestions.

Upvotes: 1

Views: 558

Answers (2)

Ankit

Reputation: 130

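The problem is that you have named your script tokenize.py, which shadows the standard library's tokenize module. Try renaming the file to something like my_tokenizer.py. Your own code never imports tokenize, but the traceback shows that the standard library's linecache module runs

import tokenize

Because the script's directory comes first on the module search path, that import picks up your file instead of the standard library module. Your script then starts running a second time while nltk is still only partly imported, and that is why the import of 'overridden' fails.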
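As a rough illustration, a script name can be checked against the standard library before it causes this kind of clash. This is only a sketch: shadows_stdlib is a hypothetical helper name, not part of NLTK or Python, and the fallback branch assumes a conventional standard library layout on disk.

```python
import os
import sys
import sysconfig


def shadows_stdlib(script_path):
    """Return True if the script's file name matches a standard library module.

    Hypothetical helper for illustration only.
    """
    name = os.path.splitext(os.path.basename(script_path))[0]

    # sys.stdlib_module_names is available on Python 3.10 and later.
    known = getattr(sys, "stdlib_module_names", None)
    if known is not None:
        return name in known

    # Fallback for older versions (assumption: the stdlib is a plain directory).
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    return (
        name in sys.builtin_module_names
        or os.path.isfile(os.path.join(stdlib_dir, name + ".py"))
        or os.path.isdir(os.path.join(stdlib_dir, name))
    )


for candidate in ("tokenize.py", "my_tokenizer.py"):
    print(candidate, shadows_stdlib(candidate))
# tokenize.py True
# my_tokenizer.py False
```

Only the file name matters for this check, so the file does not need to exist. A name that comes back True is best avoided for your own scripts.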

Upvotes: 1

Andrew Coover

Reputation: 41

You need to name the script something other than tokenize.py.

Upvotes: 4
