Matthew Gerges

Reputation: 21

How To Execute Custom Actions with ChatGPT Assistants API

I am trying to build a GPT chatbot for restaurants that asks customers for their contact info and reservation time. Once the chatbot is sure the customer has provided all of these details, I want to run what I believe is called an "action": use another API to send the customer an email saying we will contact them shortly. For now, though, I just want to do a console.log that says "confirmed", so I know the AI understood it has all the details and can proceed to the next step (just an intermediary step). However, I'm struggling with how to move from my current code, which just chats with the user via a specific assistant, to actually executing actions. Here is my code (running on Node in the backend; it receives messages from a frontend and sends responses back):

const express = require('express');
const { OpenAI } = require('openai');
const cors = require('cors');
require('dotenv').config();

const app = express();
app.use(cors());
app.use(express.json());

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY }); // the v4 SDK takes an options object, not a bare key string

app.post('/get-response', async (req, res) => {
    const userMessage = req.body.message;
    let threadId = req.body.threadId; // Receive threadId from the client
    const assistantId = 'MYASSISTANTID'; // Replace with your actual assistant ID

    // If no threadId or it's a new session, create a new thread
    if (!threadId) {
        const thread = await openai.beta.threads.create();
        threadId = thread.id;
    }

    await openai.beta.threads.messages.create(threadId, {
        role: "user",
        content: userMessage,
    });


    // Use runs to wait for the assistant response and then retrieve it
    const run = await openai.beta.threads.runs.create(threadId, {
        assistant_id: assistantId,
    });

    let runStatus = await openai.beta.threads.runs.retrieve(
        threadId,
        run.id
      );

      // Polling mechanism to see if runStatus is completed
      // This should be made more robust.
      while (runStatus.status !== "completed") {
        await new Promise((resolve) => setTimeout(resolve, 2000));
        runStatus = await openai.beta.threads.runs.retrieve(threadId, run.id);
      }


    // CHECKING FOR TABLE RESERVATION:
    // If the model output includes a function call
    // if (runStatus.status === 'requires_action') {
    //     // You might receive an array of actions, iterate over it
    //     for (const action of runStatus.required_action.submit_tool_outputs.tool_calls) {
    //         const functionName = action.function.name;
    //         const arguments = JSON.parse(action.function.arguments);
    //
    //         // Check if the function name matches 'table_reservation'
    //         if (functionName === 'table_reservation') {
    //             handleTableReservation(arguments);
    //             // Respond back to the model that the action has been handled
    //             await openai.beta.threads.runs.submit_tool_outputs(threadId, run.id, {
    //                 tool_outputs: [{
    //                     tool_call_id: action.id,
    //                     output: { success: true } // You can include more details if needed
    //                 }]
    //             });
    //         }
    //     }
    // }


      // Get the last assistant message from the messages array
      const messages = await openai.beta.threads.messages.list(threadId);

      // Find the last message for the current run
      const lastMessageForRun = messages.data
        .filter(
          (message) => message.run_id === run.id && message.role === "assistant"
        )
        .pop();

      // If an assistant message is found, console.log() it
      let assistantMessage = "";
      if (lastMessageForRun) {
        assistantMessage = lastMessageForRun.content[0].text.value;
        console.log(`${assistantMessage} \n`);
      }

    res.json({ message: assistantMessage, threadId: threadId });
});

const PORT = 3001;
app.listen(PORT, () => console.log(`Server listening on port ${PORT}`));

If you look at my code above, you'll see that I tried doing what I'm asking about and then commented it out because it did not work.
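
From reading the docs, I think the fix would be to restructure the polling loop so it services tool calls itself instead of only waiting for "completed" (as written, my loop never exits while the run is stuck in "requires_action"). Below is my best guess, assuming the v4 Node SDK's `submitToolOutputs` method and a function named `table_reservation` configured on the assistant; `handleTableReservation` is just a stand-in. I haven't been able to verify it:

```javascript
// Hypothetical local handler: for now it just logs "confirmed".
function handleTableReservation(args) {
  console.log("confirmed", args);
  return { success: true };
}

// Poll the run until it finishes, servicing tool calls along the way.
// `openai` is the same client as above; threadId/runId come from the run.
async function waitForRun(openai, threadId, runId) {
  while (true) {
    const runStatus = await openai.beta.threads.runs.retrieve(threadId, runId);

    if (runStatus.status === "completed") {
      return runStatus;
    }

    if (runStatus.status === "requires_action") {
      const toolOutputs = [];
      for (const call of runStatus.required_action.submit_tool_outputs.tool_calls) {
        // "arguments" is a reserved word in strict mode, so use another name
        const args = JSON.parse(call.function.arguments);
        if (call.function.name === "table_reservation") {
          toolOutputs.push({
            tool_call_id: call.id,
            output: JSON.stringify(handleTableReservation(args)), // output must be a string
          });
        }
      }
      await openai.beta.threads.runs.submitToolOutputs(threadId, runId, {
        tool_outputs: toolOutputs,
      });
    } else if (["failed", "cancelled", "expired"].includes(runStatus.status)) {
      throw new Error(`Run ended with status ${runStatus.status}`);
    }

    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}
```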

For further context, I was trying to understand how actions and tools work, since that may be the way to achieve what I want, and I came up with the following code that I think might be useful. The problem is that I don't know how to combine the two pieces of code, and the code below doesn't use an assistant, which I eventually want to end up using:

require('dotenv').config(); // This should be at the top of your file

const { OpenAI } = require('openai');
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY }); // the v4 SDK takes an options object, not a bare key string


// Example dummy function hard coded to return the same weather
// In production, this could be your backend API or an external API
function getCurrentWeather(location) {
  if (location.toLowerCase().includes("tokyo")) {
    return JSON.stringify({ location: "Tokyo", temperature: "10", unit: "celsius" });
  } else if (location.toLowerCase().includes("san francisco")) {
    return JSON.stringify({ location: "San Francisco", temperature: "72", unit: "fahrenheit" });
  } else if (location.toLowerCase().includes("paris")) {
    return JSON.stringify({ location: "Paris", temperature: "22", unit: "fahrenheit" });
  } else {
    return JSON.stringify({ location, temperature: "unknown" });
  }
}

function get_table_reservations(bookingTime, numGuests) {
  if (!bookingTime) {
    // check for a missing time before calling string methods on it
    return JSON.stringify({ availability: "Please include a booking time" });
  } else if (bookingTime.toLowerCase().includes("4:30")) {
    return JSON.stringify({ availability: "Not available" });
  } else {
    return JSON.stringify({ availability: "Available", forGuests: numGuests });
  }
}


async function runConversation() {
  // Step 1: send the conversation and available functions to the model
  const messages = [
    { role: "user", content: "I want a table reservation for 3 people." },
  ];
  const tools = [
    {
      type: "function",
      function: {
        name: "get_current_weather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA",
            },
            unit: { type: "string", enum: ["celsius", "fahrenheit"] },
          },
          required: ["location"],
        },
      },
    },
    {
      type: "function",
      function: {
        name: "get_table_reservations",
        description: "Tell the user if a table is available for the number of guests and time they request",
        parameters: {
          type: "object",
          properties: {
            numGuests: {
              type: "integer",
              description: "The number of guests",
            },
            bookingTime: { type: "string", description: "The time requested for a reservation, eg. 8:30 PM" },
          },
          required: ["numGuests", "bookingTime"],
        },
      },
    },
  ];


  const response = await openai.chat.completions.create({
    model: "gpt-3.5-turbo-1106",
    messages: messages,
    tools: tools,
    tool_choice: "auto", // auto is default, but we'll be explicit
  });
  const responseMessage = response.choices[0].message;

  // Step 2: check if the model wanted to call a function
  const toolCalls = responseMessage.tool_calls;
  if (toolCalls) {
    // Step 3: call the functions the model asked for
    // Note: the JSON arguments may not always be valid; be sure to handle errors
    const availableFunctions = {
      get_current_weather: getCurrentWeather,
      get_table_reservations: get_table_reservations,
    }; // map tool names to the local functions that implement them
    messages.push(responseMessage); // extend conversation with assistant's reply
    for (const toolCall of toolCalls) {
      const functionName = toolCall.function.name;
      const functionToCall = availableFunctions[functionName];
      const functionArgs = JSON.parse(toolCall.function.arguments);
      console.log('Arguments:', toolCall.function.arguments, 'name:', functionName); // debug output
      // NB: this argument order only fits get_table_reservations;
      // get_current_weather would need (location, unit) instead
      const functionResponse = functionToCall(
        functionArgs.bookingTime,
        functionArgs.numGuests
      );
      messages.push({
        tool_call_id: toolCall.id,
        role: "tool",
        name: functionName,
        content: functionResponse,
      }); // extend conversation with function response
    }
    const secondResponse = await openai.chat.completions.create({
      model: "gpt-3.5-turbo-1106",
      messages: messages,
    }); // get a new response from the model where it can see the function response
    return secondResponse.choices;
  }
}


runConversation().then(console.log).catch(console.error);

Alternatively, maybe there's a much easier way to do this through the platform.openai website itself, on the Assistants page. Maybe I need to change or add something in the area shown in the screenshot below, perhaps a function. As a separate question, one of the examples of adding a function in the Assistants API was "get_weather", but I'm not sure how this works, where the get_weather function needs to be defined, or how it will even run (also shown in the second screenshot below).

Further, it would be a big help if someone could advise me on how to start using an email API to send emails once I have this step figured out (this part of my question is less important, though).
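
In case it helps anyone answer: what I have in mind for the email step is something like the sketch below. Nodemailer is just one option I've seen mentioned, and the sender address and SMTP details are placeholders; so far only the message-building part is real code.

```javascript
// Build the confirmation message once the assistant has all the details.
// `reservation` is a hypothetical shape: { name, email, time }.
function buildConfirmationEmail(reservation) {
  return {
    from: "bookings@example-restaurant.com", // placeholder sender address
    to: reservation.email,
    subject: "Reservation request received",
    text: `Hi ${reservation.name}, we received your request for ${reservation.time}. We will contact you shortly.`,
  };
}

// With Nodemailer (npm install nodemailer) sending it would look roughly like:
//
//   const nodemailer = require("nodemailer");
//   const transporter = nodemailer.createTransport({
//     host: "smtp.example.com", // your email provider's SMTP host
//     auth: { user: process.env.SMTP_USER, pass: process.env.SMTP_PASS },
//   });
//   await transporter.sendMail(buildConfirmationEmail(reservation));
```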

Lastly (I know I'm asking a lot of questions), out of curiosity, does anyone know if I can implement the same thing with Gemini instead of GPT assistants?

[Screenshot: Assistants API interface]

[Screenshot: example of a function on the Assistants API interface providing all of these details]

Upvotes: 1

Views: 5187

Answers (2)

Sergio B.

Reputation: 990

First of all, you must create an assistant. This Python example is taken from the OpenAI website:

from openai import OpenAI
client = OpenAI()

my_assistant = client.beta.assistants.create(
    instructions="You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
    name="Math Tutor",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4",
)
print(my_assistant)

This is the case if you want to use the Code Interpreter. If you want to use the Function Calling tool instead, change the tools array as in the example below, which uses a function (get_weather) that retrieves the weather forecast and takes two parameters. You must add a description for the function and for each parameter so ChatGPT knows what the function does. From the descriptions it will work out if and when to call your function and which parameters to use, and it will build its answer from the result you return.

"tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Determine weather in my location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": [
                "c",
                "f"
              ]
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ],

(The function declaration is actually a JSON Schema.) You can add more tools to the array. Retrieval is usually used to process files passed to ChatGPT.

WARNING! ChatGPT doesn't know how to call your function. It will return JSON with the status "requires_action" and the name and parameters of the function to call:

....
"step_details": {
        "type": "tool_calls",
        "tool_calls": [
          {
            "id": "call_I11C5EH0M2ZFmHuaqgEu12wQ",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\":\"Rome\"}",
              "output": null
            }
          }
        ]
      },
.....

This tells you which function to call and which parameters to use for the call. You then submit the result of the function call with JSON like this:

{
  "threadId": "thread_9AD4dCVxVO4bZaaZgbcSYo7W",
  "runId": "run_xO02eoY0ijpLQNslbtZLf9UJ",
  "functionResponses": {
    "tool_outputs": [
      {
        "tool_call_id": "call_I11C5EH0M2ZFmHuaqgEu12wQ",
        "output": "\"fine and sunny\""
      }
    ]
  }
}

You create a Thread, which represents a chat, with the following code snippet:

from openai import OpenAI
client = OpenAI()

empty_thread = client.beta.threads.create()
print(empty_thread)

ChatGPT will return JSON like this:

{
  "id": "thread_abc123",
  "object": "thread",
  "created_at": 1699012949,
  "metadata": {}
}

Now you have an assistant bound to a Thread (chat). You can optionally add messages when you create the Thread, or you can add them later using the Add Message API:

from openai import OpenAI
client = OpenAI()

thread_message = client.beta.threads.messages.create(
  "thread_abc123",
  role="user",
  content="What is the weather in Rome ?",
)
print(thread_message)

Now that you have an assistant and a thread, you can process the messages you send to ChatGPT. To do that, create a Run object as follows:

from openai import OpenAI
client = OpenAI()

run = client.beta.threads.runs.create(
  thread_id="thread_abc123",
  assistant_id="asst_abc123"
)
print(run)

You specify the Thread ID and Assistant ID that you received from the previous calls. At this point you retrieve the answer, first by calling:

from openai import OpenAI
client = OpenAI()

run_steps = client.beta.threads.runs.steps.list(
    thread_id="thread_abc123",
    run_id="run_abc123"
)
print(run_steps)

to get the run steps (that is, what ChatGPT is doing). The object returned looks like the following:

{
  "id": "step_abc123",
  "object": "thread.run.step",
  "created_at": 1699063291,
  "run_id": "run_abc123",
  "assistant_id": "asst_abc123",
  "thread_id": "thread_abc123",
  "type": "message_creation",
  "status": "completed",
  "cancelled_at": null,
  "completed_at": 1699063291,
  "expired_at": null,
  "failed_at": null,
  "last_error": null,
  "step_details": {
    "type": "message_creation",
    "message_creation": {
      "message_id": "msg_abc123"
    }
  },
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 456,
    "total_tokens": 579
  }
}

Check the status to see if it equals "completed". When it is completed, get the messages using a code snippet similar to the following:

from openai import OpenAI
client = OpenAI()

message = client.beta.threads.messages.retrieve(
  message_id="msg_abc123",
  thread_id="thread_abc123",
)
print(message)

The call result will be the following JSON:

{
  "id": "msg_abc123",
  "object": "thread.message",
  "created_at": 1699017614,
  "thread_id": "thread_abc123",
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": {
        "value": "This is the message from ChatGPT",
        "annotations": []
      }
    }
  ],
  "file_ids": [],
  "assistant_id": null,
  "run_id": null,
  "metadata": {}
}

The "content" fragment contains ChatGPT's answer:

"content": [
        {
          "type": "text",
          "text": {
            "value": "This is the message from ChatGPT",
            "annotations": []
          }
        }
      ],

It could actually return some Annotations too, but that goes beyond this example; you can learn about it in the OpenAI documentation at the following link: Assistants API

The Assistants API is very different from the old Chat Completions API because it keeps the context of the chat while the thread is active. You don't need to resend the chat history on every call as you did with Chat Completions; the Assistant does it for you. In fact, if you ask it to recall a previous question, it will answer correctly. Keep in mind that threads have an expiration time (given in the response JSON); when a thread expires, its context and all other data are lost for good.

Also remember that when you call the Get Step API, the assistant may return partial answers explaining what it is currently doing; at the end it returns the final, complete answer. When you send files, keep an eye on the Annotations array returned in the content object: it can contain valuable information about the file's content.

Any questions?

Upvotes: 3

Sergio B.

Reputation: 990

Use the Assistants API. Declare a Function Calling tool and describe the function and each of its parameters. If you describe the function's purpose accurately, ChatGPT will return the information necessary to call your function, which can then send the email. For Node or Python you can find plenty of email libraries on GitHub and elsewhere. Then start learning how to use the Assistants API here: Assistants API
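
For example, the tool declaration for the email step might look like this in Node. The function name, parameters, and descriptions below are only illustrative; remember that your backend still has to do the actual sending when the run reports "requires_action":

```javascript
// Illustrative Function Calling declaration for the email step.
// The assistant only decides WHEN to call it; your code sends the mail.
const sendEmailTool = {
  type: "function",
  function: {
    name: "send_confirmation_email",
    description:
      "Send a confirmation email once the customer's name, email address and reservation time have all been collected",
    parameters: {
      type: "object",
      properties: {
        name: { type: "string", description: "The customer's name" },
        email: { type: "string", description: "The customer's email address" },
        reservationTime: { type: "string", description: "The requested time, e.g. 8:30 PM" },
      },
      required: ["name", "email", "reservationTime"],
    },
  },
};
```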

Upvotes: 0
