How to Reduce Snippet Length and Minimize Token Usage in Cohere API Webhook Responses?

Question

Problem Description: I am integrating the Cohere API with Zapier to run web searches and generate concise summaries. The snippets returned in the webhook responses are extremely long, which drives up token consumption and makes the integration slower and more expensive than it should be.

1. Adjusting Parameters:

I set max_tokens to a lower value.
I experimented with different values for top_p and top_k.
I set return_snippets: false in the webhook configuration.

2. Configuration Example:

{
  "model": "command-r-plus",
  "message": "Please provide a concise summary of what was happening in New York on 15 May 2024 using web search.",
  "temperature": 0.1,
  "max_tokens": 150,
  "top_p": 0.3,
  "top_k": 3,
  "frequency_penalty": 0.5,
  "connectors": [
    {
      "id": "web-search",
      "options": {
        "num_results": 1,
        "return_snippets": false
      }
    }
  ],
  "prompt_truncation": "AUTO",
  "citation_quality": "fast"
}

Despite these attempts, the snippets remain long and token usage is still high. The responses are not as concise as needed, and the costs are excessive.

Expected Outcome: I am looking for a way to completely exclude or significantly reduce the length of snippets in the Cohere API webhook responses to ensure concise summaries and efficient token usage.

Endpoint: I am using the https://api.cohere.com/v1/chat endpoint for these requests.
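
To illustrate the outcome I am after, here is a minimal Python sketch of the kind of post-processing I could run in a Zapier code step if nothing can be done on the request side. It assumes the chat response JSON has a top-level "documents" list whose items carry a "snippet" string; the trim_snippets helper and the max_chars limit are hypothetical names of my own, not part of the Cohere API. This only shrinks the payload passed to later steps. It would not reduce the tokens Cohere itself processes for the request, which is why I would prefer a request-side setting.

```python
import copy


def trim_snippets(response: dict, max_chars: int = 200) -> dict:
    """Return a copy of a chat response with long document snippets shortened.

    Assumption: the response has a top-level "documents" list whose items
    may contain a "snippet" string. Anything else is left untouched.
    """
    trimmed = copy.deepcopy(response)  # do not mutate the caller's data
    for doc in trimmed.get("documents") or []:
        snippet = doc.get("snippet")
        if isinstance(snippet, str) and len(snippet) > max_chars:
            # Cut the snippet and mark that it was shortened.
            doc["snippet"] = snippet[:max_chars].rstrip() + "..."
    return trimmed
```

Dropping the "documents" key entirely would be the simplest variant if the snippets are not needed downstream at all.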

Answers (0)
