newuser

Reputation: 31

create_csv_agent with HuggingFace: could not parse LLM output

I am using LangChain and applying create_csv_agent to a small CSV dataset to see how well google/flan-t5-xxl can answer queries over tabular data. At the moment I am running into the following error:

OutputParserException: Could not parse LLM output: `0`

> Entering new AgentExecutor chain...
---------------------------------------------------------------------------
OutputParserException                     Traceback (most recent call last)
<ipython-input-13-f86336065d8e> in <cell line: 1>()
----> 1 agent.run('how many rows are there?')

7 frames
/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in run(self, callbacks, tags, metadata, *args, **kwargs)
    473             if len(args) != 1:
    474                 raise ValueError("`run` supports only one positional argument.")
--> 475             return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
    476                 _output_key
    477             ]

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in __call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    280         except (KeyboardInterrupt, Exception) as e:
    281             run_manager.on_chain_error(e)
--> 282             raise e
    283         run_manager.on_chain_end(outputs)
    284         final_outputs: Dict[str, Any] = self.prep_outputs(

/usr/local/lib/python3.10/dist-packages/langchain/chains/base.py in __call__(self, inputs, return_only_outputs, callbacks, tags, metadata, include_run_info)
    274         try:
    275             outputs = (
--> 276                 self._call(inputs, run_manager=run_manager)
    277                 if new_arg_supported
    278                 else self._call(inputs)

/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _call(self, inputs, run_manager)
   1034         # We now enter the agent loop (until it returns something).
   1035         while self._should_continue(iterations, time_elapsed):
-> 1036             next_step_output = self._take_next_step(
   1037                 name_to_tool_map,
   1038                 color_mapping,

/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)
    842                 raise_error = False
    843             if raise_error:
--> 844                 raise e
    845             text = str(e)
    846             if isinstance(self.handle_parsing_errors, bool):

/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in _take_next_step(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)
    831 
    832             # Call the LLM to see what to do.
--> 833             output = self.agent.plan(
    834                 intermediate_steps,
    835                 callbacks=run_manager.get_child() if run_manager else None,

/usr/local/lib/python3.10/dist-packages/langchain/agents/agent.py in plan(self, intermediate_steps, callbacks, **kwargs)
    455         full_inputs = self.get_full_inputs(intermediate_steps, **kwargs)
    456         full_output = self.llm_chain.predict(callbacks=callbacks, **full_inputs)
--> 457         return self.output_parser.parse(full_output)
    458 
    459     async def aplan(

/usr/local/lib/python3.10/dist-packages/langchain/agents/mrkl/output_parser.py in parse(self, text)
     50 
     51         if not re.search(r"Action\s*\d*\s*:[\s]*(.*?)", text, re.DOTALL):
---> 52             raise OutputParserException(
     53                 f"Could not parse LLM output: `{text}`",
     54                 observation=MISSING_ACTION_AFTER_THOUGHT_ERROR_MESSAGE,

OutputParserException: Could not parse LLM output: `0`

I am not sure why this happens, since the prompt template appears to spell out clearly what the agent is supposed to do. Below is my code:

from langchain import HuggingFacePipeline
from langchain.agents import create_csv_agent
# `pipeline` must be imported explicitly; it was missing from the original imports
from transformers import AutoConfig, AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

# Load the model and tokenizer locally
model_id = 'google/flan-t5-xxl'
config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, config=config)

# Build a text2text-generation pipeline and wrap it so LangChain can use it as an LLM
pipe = pipeline('text2text-generation',
                model=model,
                tokenizer=tokenizer,
                max_length=1024
                )
local_llm = HuggingFacePipeline(pipeline=pipe)

# Pass the wrapped LLM defined above (the original referenced an undefined `hf_llm`)
agent = create_csv_agent(llm=local_llm, path="dummy_data.csv", verbose=True)
agent.run('how many unique status are there?')

I also experimented with lighter versions of Flan-T5 and with OpenAI. With OpenAI, however, I keep hitting the rate limit even when I run only one query, and there is not much documentation on create_csv_agent beyond its use with OpenAI.

Upvotes: 3

Views: 2806

Answers (2)

aadil gani

Reputation: 44

The best way to solve this error is to pass model_kwargs when setting up the LLM, like this:

from langchain import HuggingFaceHub

# Set up the LLM
llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", huggingfacehub_api_token='**************', model_kwargs={"temperature": 0.1, "max_length": 512})

Upvotes: 1

ZKS

Reputation: 2816

  1. The correct solution to this problem is to write your own custom output parser.

  2. Since you are using an agent, the parameter "handle_parsing_errors=True" does not have any effect.

  3. Another option would be to chain a second LLM that parses this output.

  4. A temporary workaround is shown below:

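     # Note: str.removeprefix / str.removesuffix require Python 3.9+
     prefix = "Could not parse LLM output: `"
     try:
         response = agent.run("how many unique statuses are there?")
     except Exception as e:
         response = str(e)
         if not response.startswith(prefix):
             raise  # not a parsing error, so do not swallow it
         # Recover the raw LLM text embedded in the error message
         response = response.removeprefix(prefix).removesuffix("`")
     print(response)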
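
As a rough illustration of point 1, here is a minimal, framework-free sketch of what a more lenient parser could do. It starts from the check shown in the traceback (the `Action\s*\d*\s*:` regex that fails on `0`) and adds a fallback that treats any unparseable text as the final answer. The names `Action`, `Finish`, `lenient_parse`, and `ACTION_PATTERN` are hypothetical and not part of LangChain, and the "Final Answer:" and "Action Input:" handling is an assumption about the prompt format. In a real setup this logic would have to be wired into the agent's output parser, which is left out here because it depends on the installed LangChain version.

```python
import re
from dataclasses import dataclass


# Hypothetical stand-ins for the agent's "run a tool" and "finish" results.
@dataclass
class Action:
    tool: str
    tool_input: str


@dataclass
class Finish:
    output: str


# The check from the traceback (output_parser.py, line 51).
STRICT_CHECK = r"Action\s*\d*\s*:[\s]*(.*?)"

# Assumed format: "Action: <tool>" followed by "Action Input: <input>".
ACTION_PATTERN = re.compile(
    r"Action\s*\d*\s*:[\s]*(.*?)[\s]*Action\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)",
    re.DOTALL,
)


def lenient_parse(text: str):
    """Parse LLM output, falling back to a final answer instead of raising."""
    if "Final Answer:" in text:
        return Finish(text.split("Final Answer:")[-1].strip())
    match = ACTION_PATTERN.search(text)
    if match:
        return Action(match.group(1).strip(), match.group(2).strip())
    # Fallback: the model skipped the expected format, so return its raw text.
    return Finish(text.strip())


# A bare "0" contains no "Action:", which is why the strict parser raises.
print(re.search(STRICT_CHECK, "0", re.DOTALL))
print(lenient_parse("0"))
print(lenient_parse("Thought: count rows\nAction: python_repl_ast\nAction Input: len(df)"))
```

Keep in mind that the fallback only hides the symptom: an output like `0` with no action means the model answered without running any code against the CSV, so the returned value may simply be a guess.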
    

Upvotes: 2
