Reputation: 11
Why is it not recommended to use Azure Functions to host or deploy LLM-related projects such as chatbots or chat assistants?
Is it just because of the cold start latency? When the function is idle for a while, the next invocation might take longer to start, which can affect the responsiveness of the chat.
Is there any issue with scalability? Azure Functions can auto-scale, so I believe there is no issue with scalability!
Upvotes: -1
Views: 34
Reputation: 29840
I think there are some things to consider when using Azure Functions for chat completions and the like, but let's address the issues mentioned in the post first.
Azure Functions have a cold start latency
True, but this only happens after a longer period of inactivity, and it can be mitigated by using a Premium plan, for example.
Scalability
However, for chatbots or chat assistance applications, the number of requests can vary greatly depending on the time of day or the number of users online. Scaling up and down can be a complex process, and it can lead to performance issues if not done correctly.
Using Azure Functions, scaling out is done for you, so there is no complexity to manage. Also, a varying number of requests is exactly what Azure Functions is designed for and excels at.
Long-running executions
Long-running executions: Azure Functions have a maximum execution time of 10 minutes.
Well, a single chat completion won't take more than 10 minutes, I do hope :-)
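For completeness: the timeout is configurable in host.json. The value below is illustrative; on the Consumption plan the default is 5 minutes with a 10-minute maximum, while Premium and Dedicated plans allow longer (or unbounded) timeouts.

```json
{
  "version": "2.0",
  "functionTimeout": "00:10:00"
}
```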
Costs
It totally depends, but I think Azure Functions could be very cost-efficient when used for chat completions.
I do think there are some things to consider. For one, Azure Functions can definitely be used; there is even a new set of bindings being developed that allows you to create chat completion output bindings, for example.
The challenge is how to deal with chat history. If you want to allow the user to have a conversation, the history of that conversation has to be passed to the LLM in order to continue the flow. If you want to manage that yourself, you could use Durable Functions, which can store state, or add a storage account / use storage account bindings.
The linked extension seems to use the Azure OpenAI Assistants API for conversations, which is an option as well, since it maintains the state inside the assistant.
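To make the "manage it yourself" option concrete, here is a minimal Python sketch of self-managed chat history for a stateless function app. All names are hypothetical; in a real deployment the in-memory dict would be replaced by a storage account (e.g. Table/Blob bindings) or Durable Functions entity state, since function instances don't share memory.

```python
# Sketch of self-managed chat history; the in-memory dict stands in for
# durable storage (Table/Blob bindings or Durable Functions state).
from typing import Dict, List

MAX_TURNS = 10  # keep only recent exchanges to fit the model's context window

# Hypothetical session store; replace with durable storage in production.
_history: Dict[str, List[dict]] = {}


def build_messages(session_id: str, user_input: str) -> List[dict]:
    """Append the new user message and return the full prompt for the LLM."""
    history = _history.setdefault(session_id, [])
    history.append({"role": "user", "content": user_input})
    # Trim to the last MAX_TURNS user/assistant pairs.
    del history[:-MAX_TURNS * 2]
    return [{"role": "system", "content": "You are a helpful assistant."}] + history


def record_reply(session_id: str, reply: str) -> None:
    """Store the assistant's reply so the next invocation sees it."""
    _history.setdefault(session_id, []).append(
        {"role": "assistant", "content": reply}
    )
```

Each invocation would call `build_messages` before the LLM request and `record_reply` after it; the trimming step is the part you don't need if you let the Assistants API keep the state for you.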
Upvotes: 0