Inconsistent Tool Calling Behavior with LLaMA 3.1 70B Model on AWS Bedrock

Question

I am using the LLaMA 3.1 70B Instruct model via AWS Bedrock with LangChain for agent-based function calling. While testing, I observed the following issues:

Inconsistent Tool Calling: The model often selects incorrect tools for similar queries, even with detailed function descriptions provided. Tool selection varies unpredictably.

Limited Function Calls: The model does not call more than three tools, even when queries require more.

Sequence Issues: It fails to follow logical function call sequences, such as retrieving dependent data first.

```
from langchain_aws import ChatBedrock

# The assignment line was missing from the pasted snippet; `llm` is a placeholder name.
llm = ChatBedrock(
    model_id="meta.llama3-1-70b-instruct-v1:0",
    verbose=True,
    beta_use_converse_api=True,
    model_kwargs={
        "temperature": 0.05,
        "stopSequences": ["Observation:", "Action:"]
    },
    disable_streaming='tool_calling'
)
```

```
def get_item_id(item_name):
    """Retrieve item ID based on name."""
    if "abc" in item_name.lower():
        return 1
    elif "xyz" in item_name.lower():
        return 2
    return 0
```

```
def get_item_details(item_id):
    """Retrieve details of an item."""
    if item_id == 1:
        return {"name": "ABC Item", "price": 100}
    elif item_id == 2:
        return {"name": "XYZ Item", "price": 200}
    return {}
```

```
def get_item_reviews(item_id):
    """Retrieve reviews of an item."""
    if item_id == 1:
        return ["Good quality", "Value for money"]
    elif item_id == 2:
        return ["Highly recommended", "Durable"]
    return []
```

Expected Behavior:

The model should:

Call `get_item_id` to retrieve the ID for "XYZ".
Use the returned ID to call `get_item_details` and `get_item_reviews`.
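
To make the expected order concrete, here is a minimal sketch of the call chain I would like the agent to produce for "XYZ", written as plain Python with no model involved. The tool bodies are condensed copies of the functions above so the block runs on its own; `expected_call_sequence` is a hypothetical helper used only for illustration, not part of LangChain or Bedrock.

```python
# Condensed stand-ins for the three tools defined in the question.
def get_item_id(item_name):
    name = item_name.lower()
    if "abc" in name:
        return 1
    if "xyz" in name:
        return 2
    return 0


def get_item_details(item_id):
    details = {
        1: {"name": "ABC Item", "price": 100},
        2: {"name": "XYZ Item", "price": 200},
    }
    return details.get(item_id, {})


def get_item_reviews(item_id):
    reviews = {
        1: ["Good quality", "Value for money"],
        2: ["Highly recommended", "Durable"],
    }
    return reviews.get(item_id, [])


def expected_call_sequence(item_name):
    """Hypothetical helper: run the tools in dependency order and record each call."""
    calls = []

    # Step 1: resolve the name to an ID before anything else.
    item_id = get_item_id(item_name)
    calls.append(("get_item_id", {"item_name": item_name}, item_id))

    # Step 2: both dependent tools receive the integer ID from step 1.
    calls.append(("get_item_details", {"item_id": item_id}, get_item_details(item_id)))
    calls.append(("get_item_reviews", {"item_id": item_id}, get_item_reviews(item_id)))
    return calls


trace = expected_call_sequence("XYZ")
for tool_name, args, result in trace:
    print(tool_name, args, result)
```

The key point is that `get_item_details` and `get_item_reviews` should only ever receive the integer returned by `get_item_id`, never the item name or a tool name.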

Actual Behavior:

For some queries, the model correctly calls `get_item_id` followed by `get_item_details` and `get_item_reviews`. In other cases, however, it skips `get_item_id` and calls `get_item_details` directly without a valid ID, passing arguments such as `{'item_id': 'get_item_id', 'item_name': 'XYZ'}`. It also limits itself to only two tool calls, even when more are required to fully answer the query. Although the system prompt clearly defines the flow, with detailed descriptions and few-shot examples, the model still struggles to handle multiple function calls in complex queries.
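
To show why the malformed call is invalid, here is a minimal sketch of an argument check that could run before a tool is executed. This is an illustration only, not a confirmed fix and not an existing LangChain or Bedrock API: `validate_item_id_call` is a hypothetical helper, and the idea of returning its error message to the model as the tool result is an assumption I have not verified with this setup.

```python
def validate_item_id_call(tool_name, args):
    """Hypothetical check: return an error string for a malformed call, else None."""
    item_id = args.get("item_id")

    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(item_id, int) or isinstance(item_id, bool):
        return (
            f"{tool_name}: 'item_id' must be the integer returned by "
            f"get_item_id, got {item_id!r}"
        )

    # get_item_details and get_item_reviews accept only 'item_id'.
    unexpected = sorted(set(args) - {"item_id"})
    if unexpected:
        return f"{tool_name}: unexpected argument(s) {unexpected}"

    return None


# The malformed arguments observed in the question.
bad_args = {"item_id": "get_item_id", "item_name": "XYZ"}
# What a correct call would look like after get_item_id("XYZ").
good_args = {"item_id": 2}

print(validate_item_id_call("get_item_details", bad_args))
print(validate_item_id_call("get_item_details", good_args))
```

With a check like this, the bad call is rejected because `item_id` holds a tool name instead of an integer, which at least makes the skipped `get_item_id` step explicit instead of silently returning `{}`.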

These issues limit the model's capability for complex workflows. I need guidance to resolve them.

