Reputation: 3451
Being rather new to the Azure Durable Functions landscape, I am struggling to find the best way to handle downstream calls to an API that has rate limits implemented.
The way my flow is set up is like below:
1. A starter function with HttpTrigger and DurableClient bindings in the signature, which calls the next function, the orchestration.
2. An orchestrator function with the OrchestrationTrigger binding in the signature. This function makes a call to the API (awaited) and gets a collection back. For every item in that collection, it starts a new Activity via context.CallActivityAsync(), combining them in a List<Task> and performing a Task.WhenAll().
3. An activity function with the ActivityTrigger binding in the signature. This function has to call the rate-limited API endpoint, and it's these activities I want to throttle (across multiple orchestrations).
So, what I am looking for is an implementation of a throttling pattern: check whether the limit is reached and, if so, sleep before making the call. (My current fan-out is sketched below.)
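A minimal sketch of that flow, for reference (function and activity names are illustrative, .NET in-process model):

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class FanOutOrchestration
{
    [FunctionName("RunOrchestrator")]
    public static async Task RunOrchestrator(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Get the collection back from the API (via an activity,
        // since orchestrators must not perform I/O themselves).
        var items = await context.CallActivityAsync<List<string>>("GetItems", null);

        // Fan out: one activity per item, awaited together.
        var tasks = items
            .Select(item => context.CallActivityAsync("CallRateLimitedApi", item))
            .ToList();
        await Task.WhenAll(tasks);
    }
}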
I don't see a standard implementation for this, and I want to move as much of that check/sleep logic as possible out of the main orchestration.
Would the best approach be to have a sub-orchestration for every API call that has to be throttled, where the check happens before the Activity is called?
Looking forward to any insights.
Upvotes: 5
Views: 3421
Reputation: 312
Or you could use ThrottlingTroll's egress rate limiting capabilities in your activity function.
Configure a ThrottlingTroll-equipped HttpClient instance like this:
private static HttpClient ThrottledHttpClient = new HttpClient
(
    new ThrottlingTrollHandler
    (
        async (limitExceededResult, httpRequestProxy, httpResponseProxy, cancellationToken) =>
        {
            var egressResponse = (IEgressHttpResponseProxy)httpResponseProxy;

            egressResponse.ShouldRetry = true;
        },

        counterStore: new AzureTableCounterStore(),

        new ThrottlingTrollEgressConfig
        {
            Rules = new[]
            {
                new ThrottlingTrollRule
                {
                    // One request per each 5 seconds
                    LimitMethod = new FixedWindowRateLimitMethod
                    {
                        PermitLimit = 1,
                        IntervalInSeconds = 5
                    }
                }
            }
        }
    )
);
And then use it to make API calls. That HttpClient instance will limit itself, i.e. when the rate limit (in this example, 1 request per 5 seconds) is exceeded, it will automatically wait for the next chance to make a call (without making any actual calls in the meantime).
AzureTableCounterStore is used here for simplicity (it doesn't require any external storage), but for production workloads I definitely recommend RedisCounterStore instead.
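A hedged usage sketch from inside the activity (assuming the ThrottledHttpClient field above lives in the same class; the endpoint is illustrative):

using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static partial class ThrottledActivity
{
    [FunctionName("CallRateLimitedApi")]
    public static async Task<string> Run([ActivityTrigger] string item)
    {
        // The ThrottlingTrollHandler inside the client delays/retries
        // internally until the configured rate allows the call.
        var response = await ThrottledHttpClient.GetAsync($"https://api.example.com/items/{item}");
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}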
Here is the full example of a .NET 6 InProc Function project.
Upvotes: 0
Reputation: 197
I have a similar scenario to Sam's, but with one main difference, which complicates the problem even more.
In my case there are multiple orchestrator functions calling each other in a nested way.
Each one of them basically does the same thing.
The idea is to copy an entire REST API to Azure Data Lake as part of an ETL process (the extract layer).
For example: Customers => Orders => Invoices
Now, I can't use the above solutions (working with a queue to control the rate) because I want each orchestrator to wait for the activity result and only then do the nested calls based on that result.
I'm thinking of another solution, one that works with a synchronized orchestration tree.
Create a wrapping orchestration function around the activity function we want to throttle (we can't wait from within an activity). This wrapper waits for an external "permission" event and only then calls the activity (see the sketch below).
Create a separate client function that tracks the waiting wrapper instances and raises those permission events at the allowed rate.
That way the current setup should continue to work in the same way, and while there are no external events pending, the orchestrators will be unloaded from the worker by Azure until a new event comes in.
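A minimal sketch of that wrapper, assuming a hypothetical "Permission" event name (the dispatcher client function would track the waiting instance IDs, e.g. via a queue, and call IDurableOrchestrationClient.RaiseEventAsync at the allowed rate):

using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class ThrottledActivityWrapper
{
    [FunctionName("ThrottledActivityWrapper")]
    public static async Task<string> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        var item = context.GetInput<string>();

        // Park this orchestration until the dispatcher grants a slot.
        // While waiting, the orchestrator is unloaded from the worker.
        await context.WaitForExternalEvent("Permission");

        return await context.CallActivityAsync<string>("CallRateLimitedApi", item);
    }
}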
Upvotes: 0
Reputation: 11
There is a better way to do it. Instead of trying to limit it on your side based on concurrent activity functions or active HTTP requests, why don't you rely on the API itself? It knows when it's time to return 429.
I would add a queue: grab a task, call the API, and if you get a 429, put the task message back into the queue with an exponential delay policy (sketched below).
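A sketch of that idea, assuming an Azure Storage queue named "api-tasks" and a retry count carried in the message (queue name, message shape, and endpoint are illustrative):

using System;
using System.Net;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Storage.Queues;
using Microsoft.Azure.WebJobs;

public class ApiTask
{
    public string ItemId { get; set; }
    public int Attempt { get; set; }
}

public static class QueueThrottledCaller
{
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("CallApiFromQueue")]
    public static async Task Run([QueueTrigger("api-tasks")] string message)
    {
        var task = JsonSerializer.Deserialize<ApiTask>(message);
        var response = await Http.GetAsync($"https://api.example.com/items/{task.ItemId}");

        if (response.StatusCode == HttpStatusCode.TooManyRequests)
        {
            // Re-enqueue with an exponentially growing invisibility window,
            // so the task comes back later instead of hammering the API.
            task.Attempt++;
            var delay = TimeSpan.FromSeconds(Math.Pow(2, Math.Min(task.Attempt, 6)));
            var queue = new QueueClient(
                Environment.GetEnvironmentVariable("AzureWebJobsStorage"), "api-tasks");
            await queue.SendMessageAsync(JsonSerializer.Serialize(task), visibilityTimeout: delay);
            return;
        }

        response.EnsureSuccessStatusCode();
        // ... process the successful response ...
    }
}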
Upvotes: 1
Reputation: 60751
Sam, I see several options. I've also created a video as a response to this. Let's step back and see how we would do this using regular functions (not durable).
The first approach would be to turn the function into a queue-triggered function, and use the queue mechanism to control scale-out, by using batchSize and newBatchThreshold:
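Those two settings live in host.json; a typical shape, with illustrative values, looks like this (each instance then processes at most batchSize + newBatchThreshold messages concurrently):

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 4,
      "newBatchThreshold": 2
    }
  }
}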
The other way would be to have an http-triggered function and use this in the host.json file:
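Presumably the setting meant here is the HTTP concurrency limit, e.g. (value illustrative):

{
  "version": "2.0",
  "extensions": {
    "http": {
      "maxConcurrentRequests": 10
    }
  }
}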
With durable functions we can do this:
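The durable-task equivalents are the per-instance concurrency throttles in host.json (values illustrative):

{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 4,
      "maxConcurrentOrchestratorFunctions": 4
    }
  }
}

Note that these limits are per worker instance, so the effective total still grows as the app scales out.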
You specifically asked about controlling scale-out per timespan, and this is how I would do it: process the items in small batches inside the orchestrator and put a durable timer between the batches.
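A hedged sketch of that idea (batch size and delay are illustrative; Chunk requires .NET 6):

using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class TimespanThrottledOrchestration
{
    [FunctionName("ThrottledOrchestrator")]
    public static async Task Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        var items = context.GetInput<List<string>>();

        foreach (var batch in items.Chunk(5))
        {
            // Fan out a small batch...
            await Task.WhenAll(
                batch.Select(i => context.CallActivityAsync("CallRateLimitedApi", i)));

            // ...then wait out the window with a durable timer
            // (the safe way to "sleep" in an orchestrator, unlike Thread.Sleep).
            await context.CreateTimer(
                context.CurrentUtcDateTime.AddSeconds(5), CancellationToken.None);
        }
    }
}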
Upvotes: 2