Reputation: 302
I am trying to run the following lines of code:
import os
os.environ['JAVAHOME'] = 'path/to/java.exe'
os.environ['STANFORD_PARSER'] = 'path/to/stanford-parser.jar'
os.environ['STANFORD_MODELS'] = 'path/to/stanford-parser-3.8.0-models.jar'
from nltk.parse.stanford import StanfordDependencyParser
dep_parser = StanfordDependencyParser(model_path="path/to/englishPCFG.ser.gz")
sentence = "sample sentence ..."
# Dependency Parsing:
print("Dependency Parsing:")
print([parse.tree() for parse in dep_parser.raw_parse(sentence)])
and at the line:
print([parse.tree() for parse in dep_parser.raw_parse(sentence)])
I get the following issues:
Traceback (most recent call last):
  File "C:/Users/Norbert/PycharmProjects/untitled/StanfordDependencyParser.py", line 21, in <module>
    print([parse.tree() for parse in dep_parser.raw_parse(sentence)])
  File "C:\Users\Norbert\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\parse\stanford.py", line 134, in raw_parse
    return next(self.raw_parse_sents([sentence], verbose))
  File "C:\Users\Norbert\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\parse\stanford.py", line 152, in raw_parse_sents
    return self._parse_trees_output(self._execute(cmd, '\n'.join(sentences), verbose))
  File "C:\Users\Norbert\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\parse\stanford.py", line 218, in _execute
    stdout=PIPE, stderr=PIPE)
  File "C:\Users\Norbert\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\internals.py", line 135, in java
    print(_decode_stdoutdata(stderr))
  File "C:\Users\Norbert\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\internals.py", line 737, in _decode_stdoutdata
    return stdoutdata.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 3097: invalid start byte
Any idea what could be wrong? I am not even dealing with any non-UTF-8 text.
Upvotes: 1
Views: 621
Reputation: 654
I can print a few things by doing this; it may not be what you wanted, but it is a start.
print("Dependency Parsing:")
# raw_parse() returns an iterator of dependency graphs; take the first one
result = dep_parser.raw_parse(sentence)
dep = next(result)
# print(dep)
print(list(dep.triples()))
Uncomment the print(dep) line if you want to see the entire output.
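As for the error itself: judging by the traceback, the byte that fails to decode comes from the Java process's stderr (the failing call is _decode_stdoutdata(stderr)), not from your sentence. One plausible cause, and this is only a guess, is that Java printed a message in the Windows console code page rather than UTF-8. Here is a minimal sketch of why a lone 0xac byte breaks a strict UTF-8 decode and how a lenient decode gets past it. The names raw_stderr and decode_leniently are made up for illustration and are not part of NLTK, and the byte string is a stand-in, since the real stderr content is unknown:

```python
# Hypothetical stand-in for what Java wrote to stderr. Only the offending
# byte 0xac is taken from the traceback; the surrounding text is invented.
raw_stderr = b"Loading parser ... \xac done"


def decode_leniently(data, encoding="utf-8"):
    """Decode bytes, replacing undecodable bytes with U+FFFD."""
    return data.decode(encoding, errors="replace")


try:
    # Strict decoding (the default) raises on the first invalid byte.
    raw_stderr.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)

# Lenient decoding keeps the readable part of the message.
print(decode_leniently(raw_stderr))
```

This does not fix the parser call, but reading the stderr text this way should at least reveal what Java was actually complaining about.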
Upvotes: 1