Reputation: 989
NLTK version 3.4.5. Python 3.7.4. OSX version 10.14.5.
I'm upgrading a codebase from Python 2.7 and just started running into this issue. I've done a fresh, no-cache reinstall of all packages and extensions in a fresh virtualenv. I'm mystified as to how this could be happening only to me; I can't find anyone else reporting the same error online.
(venv3) gmoss$ python
Python 3.7.4 (default, Sep 7 2019, 18:27:02)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/__init__.py", line 150, in <module>
    from nltk.translate import *
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/translate/__init__.py", line 23, in <module>
    from nltk.translate.meteor_score import meteor_score as meteor
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/translate/meteor_score.py", line 10, in <module>
    from nltk.stem.porter import PorterStemmer
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/__init__.py", line 29, in <module>
    from nltk.stem.snowball import SnowballStemmer
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/snowball.py", line 314, in <module>
    class ArabicStemmer(_StandardStemmer):
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/site-packages/nltk/stem/snowball.py", line 326, in ArabicStemmer
    r'[\u064b-\u064c-\u064d-\u064e-\u064f-\u0650-\u0651-\u0652]'
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/re.py", line 234, in compile
    return _compile(pattern, flags)
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/re.py", line 286, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 930, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 426, in _parse_sub
    not nested and not items))
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 536, in _parse
    code1 = _class_escape(source, this)
  File "/Users/gmoss/Documents/constructor/autocomplete/venv3/lib/python3.7/sre_parse.py", line 337, in _class_escape
    raise source.error('bad escape %s' % escape, len(escape))
re.error: bad escape \u at position 1
Upvotes: 8
Views: 181
Reputation: 989
To follow up: this was a false alarm. An errant cleanup script was deleting NLTK's shared object file inside my virtual environment, and I guess it was falling back to some other version.
Upvotes: 0
Reputation: 989
In case anyone else runs into this, downgrading to 3.4.2 fixes the issue, since that release predates the introduction of ArabicStemmer into the relevant file. I've opened an issue with nltk and hopefully it gets resolved.
Upvotes: 0
Reputation: 41625
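As the error message says, the regular expression parser rejected the \u escape. That is unexpected on Python 3.7: the re module has accepted \uXXXX escapes in str patterns since Python 3.3 and treats them as an error only in bytes patterns.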
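To see how the running interpreter treats this escape, a minimal sketch like the following may help. The pattern here is an illustrative range of Arabic diacritics, not the exact expression from NLTK's snowball.py, and the variable names are made up for the example.

```python
import re

# Illustrative pattern (not NLTK's own): a range of Arabic diacritics.
# In a str pattern the regex parser resolves \uXXXX itself.
text_pattern = re.compile(r'[\u064b-\u0652]')
str_ok = text_pattern.search('\u064e') is not None
print("str pattern matched:", str_ok)

# In a bytes pattern the same escape is not recognized and raises re.error.
try:
    re.compile(rb'[\u064b-\u0652]')
    bytes_error = None
except re.error as exc:
    bytes_error = str(exc)
print("bytes pattern error:", bytes_error)
```

If the first compile fails on an interpreter that reports itself as 3.7, that would suggest the re machinery being loaded is not the one that shipped with that interpreter.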
It's strange, though, that the error comes from the nltk package. The authors of that package surely know how to write regular expressions. Did you accidentally pick up the Python 2.7 version of the nltk package, even though it is installed in your 3.7 directory?
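One way to check which copies are actually being picked up is to print the paths involved. This is only a sketch: describe_environment is a hypothetical helper written for this post, and it uses importlib.util.find_spec so that the package is located without running the import that fails.

```python
import importlib.util
import re
import sys


def describe_environment(package="nltk"):
    """Report which interpreter, stdlib re module, and package copy are in use."""
    # find_spec locates a top-level package without importing it,
    # and returns None if the package cannot be found.
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.version.split()[0],
        "prefix": sys.prefix,
        "re_module": re.__file__,
        "package_path": spec.origin if spec is not None else None,
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Run inside the activated virtualenv, every path should point into the same environment; a path from a different Python installation would be a hint that files are being mixed.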
I expect that the nltk package has unit tests for all its code. I'd file a bug report against that package.
Upvotes: 1