Reputation: 315
I am just getting started with output-parsers and I'm impressed with their usefulness when they work properly. I have, however, run into a case where every now and then, a chain returns an error that seems to be related to the JsonOutputParser that I use, as indicated by the following (condensed) error message:
JSONDecodeError
JsonOutputParser.parse_result(self, result, partial)
156 # Parse the JSON string into a Python dictionary
--> 157 parsed = parser(json_str)
159 return parsed
122 # If we got here, we ran out of characters to remove
123 # and still couldn't parse the string as JSON, so return the parse error
124 # for the original string.
--> 125 return json.loads(s, strict=strict)
According to this post here this could be related to there not being "enough tokens left to fully generate my output", which seems to be in line with the error message above:
122 # If we got here, we ran out of characters to remove
although I am not fully sure what that means or how it can be fixed.
Has anybody encountered this problem before and could offer some guidance? I must admit that I'm feeling kind of stumped, especially since the error can't be reproduced reliably and only occurs every other time I run my script.
Upvotes: 0
Views: 575
Reputation: 2621
Examining the unit tests for JsonOutputParser.parse_partial_json()
will help you understand what's going on.
TEST_CASES_PARTIAL = [
('{"foo": "bar", "bar": "foo"}', '{"foo": "bar", "bar": "foo"}'),
('{"foo": "bar", "bar": "foo', '{"foo": "bar", "bar": "foo"}'),
('{"foo": "bar", "bar": "foo}', '{"foo": "bar", "bar": "foo}"}'),
('{"foo": "bar", "bar": "foo[', '{"foo": "bar", "bar": "foo["}'),
('{"foo": "bar", "bar": "foo\\"', '{"foo": "bar", "bar": "foo\\""}'),
('{"foo": "bar", "bar":', '{"foo": "bar"}'),
('{"foo": "bar", "bar"', '{"foo": "bar"}'),
('{"foo": "bar", ', '{"foo": "bar"}'),
]
@pytest.mark.parametrize("json_strings", TEST_CASES_PARTIAL)
def test_parse_partial_json(json_strings: Tuple[str, str]) -> None:
case, expected = json_strings
parsed = parse_partial_json(case)
assert parsed == json.loads(expected)
parse_partial_json()
attempts to parse invalid JSON strings by truncating characters from the string until it's able to parse a valid JSON string.
{"foo": "bar", "bar" # original LLM response
is truncated to...
{"foo": "bar"} # valid JSON response
This is a best-effort approach to parsing invalid JSON responses from LLMs. As we know, LLMs are not guaranteed to return valid JSON. When the function is not able to parse the LLM response, the original parsing error is returned, which is the error you received.
parse_partial_json()
.JsonOutputParser.parse_partial_json()
(GitHub)JsonOutputParser.parse_partial_json()
unit test (GitHub)Upvotes: 1