Reputation: 31
I am using LangChain and applying create_csv_agent to a small CSV dataset to see how well google/flan-t5-xxl can answer questions about tabular data. At the moment I am running into this error:
OutputParserException: Could not parse LLM output: `0`
> Entering new AgentExecutor chain...
---------------------------------------------------------------------------
OutputParserException Traceback (most recent call last)
<ipython-input-13-f86336065d8e> in <cell line: 1>()
----> 1 agent.run('how many rows are there?')
7 frames
/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in run(self, callbacks, tags, metadata, *args, **kwargs)
473 if len(args) != 1:
474 raise ValueError("`run` supports only one positional argument.")
--> 475 return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
476 _output_key
477 ]
/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in __call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
280 except (KeyboardInterrupt, Exception) as e:
281 run_manager.on_chain_error(e)
--> 282 raise e
283 run_manager.on_chain_end(outputs)
284 final_outputs: Dict[str, Any] = self.prep_outputs(
/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in __call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
274 try:
275 outputs = (
--> 276 self._call(inputs, run_manager=run_manager)
277 if new_arg_supported
278 else self._call(inputs)
/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _call(self, inputs, run_manager)
1034 # We now enter the agent loop (until it returns something).
1035 while self._should_continue(iterations, time_elapsed):
-> 1036 next_step_output = self._take_next_step(
1037 name_to_tool_map,
1038 color_mapping,
/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)
842 raise_error = False
843 if raise_error:
--> 844 raise e
845 text = str(e)
846 if isinstance(self.handle_parsing_errors, bool):
/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)
831
832 # Call the LLM to see what to do.
--> 833 output = self.agent.plan(
834 intermediate_steps,
835 callbacks=run_manager.get_child() if run_manager else None,
/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in plan(self, intermediate_steps, callbacks, **kwargs)
455 full_inputs = self.get_full_inputs(intermediate_steps, **kwargs)
456 full_output = self.llm_chain.predict(callbacks=callbacks, **full_inputs)
--> 457 return self.output_parser.parse(full_output)
458
459 async def aplan(
/usr/local/lib/python3.10/dist-packages/langchain/agents/mrkl/output_parser.py in parse(self, text)
50
51 if not re.search(r"Action\s*\d*\s*:[\s]*(.*?)", text, re.DOTALL):
---> 52 raise OutputParserException(
53 f"Could not parse LLM output: `{text}`",
54 observation=MISSING_ACTION_AFTER_THOUGHT_ERROR_MESSAGE,
OutputParserException: Could not parse LLM output: `0`
I am not sure why this happens, since the prompt template seems to spell out clearly what the agent is supposed to do. Below is my code:
import os
from langchain import PromptTemplate, HuggingFaceHub, LLMChain, OpenAI, SQLDatabase, HuggingFacePipeline
from langchain.agents import create_csv_agent
from langchain.chains.sql_database.base import SQLDatabaseChain
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, AutoConfig, pipeline
import transformers
model_id = 'google/flan-t5-xxl'
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, config=config)
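# Build a local text2text-generation pipeline around the model
pipe = pipeline('text2text-generation',
                model=model,
                tokenizer=tokenizer,
                max_length=1024
                )
# Wrap the pipeline so LangChain can use it as an LLM
local_llm = HuggingFacePipeline(pipeline=pipe)
agent = create_csv_agent(llm=local_llm, path="dummy_data.csv", verbose=True)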
agent.run('how many unique status are there?')
I also experimented with lighter versions of Flan-T5 and with OpenAI. With OpenAI, however, I keep hitting the rate limit even when I run only a single query. There is also not much documentation on create_csv_agent beyond the OpenAI examples.
Upvotes: 3
Views: 2806
Reputation: 44
The best way to solve this error is to pass model_kwargs when creating the model, like this:
llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", huggingfacehub_api_token= '**************', model_kwargs={"temperature":0.1, "max_length":512})
Upvotes: 1
Reputation: 2816
The correct solution to this problem is to write your own custom output parser.
Since you are using an agent, the parameter "handle_parsing_errors=True" does not have any effect.
Another option would be to chain a second LLM that parses this output.
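To make the custom-parser idea more concrete: the traceback shows that the default MRKL parser raises when the text contains no "Action:" line, which is what happens when flan-t5 replies with just `0`. Below is a minimal sketch of more lenient parsing logic in plain Python. It is not LangChain's implementation; `parse_agent_text` and its tuple return values are hypothetical names used only for illustration. In an actual parser, I would expect this logic to live in the `parse` method of an `AgentOutputParser` subclass and to return `AgentAction` / `AgentFinish` objects instead of tuples, depending on your LangChain version.

```python
import re

FINAL_ANSWER = "Final Answer:"
# Matches "Action: <tool>" followed by "Action Input: <input>".
ACTION_RE = re.compile(
    r"Action\s*\d*\s*:\s*(.*?)\s*Action\s*\d*\s*Input\s*\d*\s*:\s*(.*)",
    re.DOTALL,
)


def parse_agent_text(text):
    """Hypothetical helper: return ("finish", answer) or ("action", tool, tool_input)."""
    if FINAL_ANSWER in text:
        # The model followed the expected format and gave a final answer.
        return ("finish", text.split(FINAL_ANSWER)[-1].strip())
    match = ACTION_RE.search(text)
    if match:
        tool = match.group(1).strip()
        tool_input = match.group(2).strip().strip('"')
        return ("action", tool, tool_input)
    # Fallback (an assumption of this sketch): treat bare text such as "0"
    # as the final answer instead of raising a parsing error.
    return ("finish", text.strip())
```

Keep in mind that with this fallback a bare reply like `0` is accepted as the answer even though the model never called the Python tool, so the value is likely a guess rather than something computed from the CSV.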
A temporary workaround would be the following:
try:
    response = agent.run("how many unique statuses are there?")
except Exception as e:
    response = str(e)
    if response.startswith("Could not parse LLM output: `"):
        response = response.removeprefix("Could not parse LLM output: `").removesuffix("`")
print(response)
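If you need this in more than one place, the same workaround can be wrapped in a small helper. This is only a sketch: `run_with_fallback` is a hypothetical name, `run` is assumed to be any callable such as `agent.run`, and `str.removeprefix` / `str.removesuffix` require Python 3.9 or newer. Unlike the snippet above, it re-raises errors that are not parsing failures instead of returning their message.

```python
PREFIX = "Could not parse LLM output: `"


def run_with_fallback(run, query):
    """Hypothetical helper: call run(query) and recover the raw LLM text on a parse failure."""
    try:
        return run(query)
    except Exception as e:
        message = str(e)
        if not message.startswith(PREFIX):
            # Not a parsing failure, so let the original error propagate.
            raise
        # Strip the fixed prefix and the closing backtick to get the raw output.
        return message.removeprefix(PREFIX).removesuffix("`")
```

Usage would then be along the lines of `response = run_with_fallback(agent.run, "how many unique statuses are there?")`.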
Upvotes: 2