Anton Cosenco

Reputation: 11

How to Reduce Snippet Length and Minimize Token Usage in Cohere API Webhook Responses?

Problem Description: I am integrating the Cohere API with Zapier to perform web searches and generate concise summaries. However, the snippets in the webhook responses are extremely long, which drives up token consumption and slows processing. This makes the integration less efficient and more costly than it should be.

1. Adjusting Parameters:

2. Configuration Example:

{
  "model": "command-r-plus",
  "message": "Please provide a concise summary of what was happening in New York on 15 May 2024 using web search.",
  "temperature": 0.1,
  "max_tokens": 150,
  "top_p": 0.3,
  "top_k": 3,
  "frequency_penalty": 0.5,
  "connectors": [
    {
      "id": "web-search",
      "options": {
        "num_results": 1,
        "return_snippets": false
      }
    }
  ],
  "prompt_truncation": "AUTO",
  "citation_quality": "fast"
}

Despite these attempts, the snippets remain long and token usage is still high. The responses are not as concise as needed, and the costs are excessive.

Expected Outcome: I am looking for a way to completely exclude or significantly reduce the length of snippets in the Cohere API webhook responses to ensure concise summaries and efficient token usage.
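
As a fallback, one workaround would be to trim the response after it arrives, for example in a code step between the webhook and the rest of the Zap. The sketch below assumes the chat response carries a top-level "documents" list whose entries each have a "snippet" string. The helper name trim_snippets and the max_chars parameter are hypothetical and not part of the Cohere API. This only shrinks what is passed downstream; it would not change the tokens Cohere itself processes or bills.

```python
from typing import Any


def trim_snippets(response: dict[str, Any], max_chars: int = 0) -> dict[str, Any]:
    """Return a copy of a chat response with document snippets cut or dropped.

    Assumption: the response has a top-level "documents" list and each entry
    may contain a "snippet" string. With max_chars == 0 the snippet key is
    removed entirely; otherwise it is truncated to max_chars characters.
    """
    trimmed = dict(response)  # shallow copy so the caller's dict is untouched
    if "documents" not in response:
        return trimmed

    docs = []
    for doc in response["documents"]:
        doc = dict(doc)  # copy each document before modifying it
        if max_chars > 0:
            doc["snippet"] = doc.get("snippet", "")[:max_chars]
        else:
            doc.pop("snippet", None)
        docs.append(doc)

    trimmed["documents"] = docs
    return trimmed


# Made-up sample shaped like the assumed response, for illustration only.
sample = {
    "text": "Short summary.",
    "documents": [
        {"id": "doc_0", "title": "Example", "url": "https://example.com",
         "snippet": "x" * 5000},
    ],
}
short = trim_snippets(sample, max_chars=80)   # keeps the first 80 characters
bare = trim_snippets(sample)                  # drops the snippet key
```

Truncating keeps a little context for citations, while dropping the key entirely gives the smallest payload.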

Endpoint: I am using the https://api.cohere.com/v1/chat endpoint for these requests.

Upvotes: 1

Views: 39
