Reputation: 325
Similar questions were posted here and here, and my question is actually based on what was suggested in answers to those questions.
I am trying to parse some German texts using the Stanford Parser and NLTK.
from nltk.parse.stanford import StanfordParser
import os
os.environ['STANFORD_PARSER'] ='C:\PretestKorpus\stanford-parser-full-2018-10-17'
os.environ['STANFORD_MODELS'] = 'C:\PretestKorpus\stanford-parser-full-2018-10-17'
parser=StanfordParser(model_path="C:\PretestKorpus\germanPCFG.ser.gz")
new=list(parser.raw_parse("Es war einmal ein Bauer"))
Then, of course, I get the "NLTK was unable to find the java file!" error.
So I configured the Java path like this:
nltk.internals.config_java('C:\Program Files (x86)\Java\jre1.8.0_251\bin\java.exe')
but it returns
NLTK was unable to find the C:\Program Files (x86)\Java\jre1.8.0_251in\java.exe file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
So, somehow Python reduces the path \\jre1.8.0_251\bin\java.exe to \\jre1.8.0_251in\java.exe
Looks like this:
Setting the environment variable does not help either (it returns the same "NLTK was unable to find the java file!" error). Obviously, Python does not read the path correctly. But why, and how can I fix it? Any help will be appreciated.
Upvotes: 0
Views: 125
Reputation: 651
In Python, \b inside a string literal is interpreted as a backspace character. That is why you see the white BS in the picture: the console is trying to represent this special character (BS for backspace).
What you need to do is escape every \ inside your string, like so:
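To see the effect in isolation, here is a minimal sketch that uses only the path segment from the question. The variable name `segment` is purely illustrative, and `repr()` is used to make the otherwise invisible character show up:

```python
# Unescaped on purpose: Python reads \b as ONE backspace character (code point 8).
segment = 'jre1.8.0_251\bin'

# repr() shows the hidden character as \x08 instead of sending it to the console.
print(repr(segment))

# The backslash and the letter "b" are gone; a single control character took their place.
print('\x08' in segment)   # the backspace character is in the string
print('\\' in segment)     # no literal backslash is left
```

When such a string is printed in a console or embedded in an error message, the backspace is either drawn as a BS symbol or swallowed, which would explain why the message shows 251in instead of 251\bin.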
nltk.internals.config_java('C:\\Program Files (x86)\\Java\\jre1.8.0_251\\bin\\java.exe')
It is good practice to always escape backslash characters in string literals, so you can be sure that problems like this one never occur.
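As an alternative to doubling every backslash, a raw string literal (prefix r) keeps backslashes exactly as written. The sketch below only compares the two spellings of the path from the question. The variable names are illustrative, and the commented-out call assumes NLTK is installed and that Java really lives at that location:

```python
# Doubled backslashes: each \\ in the source is one backslash in the resulting string.
escaped = 'C:\\Program Files (x86)\\Java\\jre1.8.0_251\\bin\\java.exe'

# Raw string: no escape processing, so \b stays a backslash followed by "b".
raw = r'C:\Program Files (x86)\Java\jre1.8.0_251\bin\java.exe'

# Both literals produce the same string, with no backspace character in it.
print(escaped == raw)
print('\x08' in raw)

# Either value could then be passed on (assumes NLTK is installed):
# import nltk
# nltk.internals.config_java(raw)
```

One caveat: a raw string literal cannot end in a single backslash, so a directory path with a trailing \ still needs the doubled form.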
Upvotes: 2