Thiện Sinh

Reputation: 559

Delay time between Activity increasing when using Azure function chaining pattern

I have 3000 activities running sequentially, as in the code below.

The problem is that the first hundred activities run fast.

For the next hundred activities, a delay appears before each new activity starts (about 1 second between two activities).

For the last hundred activities, the delay is almost 15 seconds.

It seems like Azure Durable Functions doesn't handle the chaining pattern well when the chain contains a large number of activities, and that we should move to a fan-out pattern instead. But that doesn't fit my needs.

        [FunctionName("Trigger")]
        public static async Task<HttpResponseMessage> Run(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)] HttpRequestMessage req,
            [DurableClient] IDurableOrchestrationClient starter,
            ILogger log)
        {
            log.LogInformation("C# HTTP trigger function processed a request.");
            string instanceId = await starter.StartNewAsync("Orchestrator", null);
            log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
            return starter.CreateCheckStatusResponse(req, instanceId);
        }

        [FunctionName("Orchestrator")]
        public static async Task<List<string>> RunOrchestrator(
            [OrchestrationTrigger] IDurableOrchestrationContext context,
            ILogger log)
        {
            log.LogInformation($"XXX start Orc");
            var outputs = new List<string>();
            //var tasks = new List<Task<string>>();

            // Run activity in a line
            for (int i = 0; i < 3000; i++)
                outputs.Add(await context.CallActivityAsync<string>("Activity", $"Sinh{i + 1}"));

            //outputs.AddRange(await Task.WhenAll(tasks));
            log.LogInformation($"XXX stop Orc");
            return outputs;
        }

        [FunctionName("Activity")]
        public static string SayHello([ActivityTrigger] string name, ILogger log)
        {
            log.LogInformation($"XXX Saying hello to {name}.");
            return $"Hello {name}!";
        }

Any suggestions are highly appreciated.

Upvotes: 1

Views: 1628

Answers (2)

Chris Gillum

Reputation: 15052

I expect that you can dramatically increase the speed of your orchestration by setting extendedSessionsEnabled to true in host.json. Some docs here: https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-perf-and-scale#extended-sessions

Extended sessions is a setting that keeps orchestrations and entities in memory even after they finish processing messages. The typical effect of enabling extended sessions is reduced I/O against the underlying durable store and overall improved throughput.

A bit more background: orchestrations are unloaded from memory every time you await a particular task for the first time. That means your orchestration is getting unloaded and reloaded 3000 times. Each time it loads back into memory, it needs to re-read its execution history from Azure Storage and then replay the orchestrator code to get back to its previous position. Each replay is going to be more costly because it has to iterate through more code and load more history rows into memory.

Extended sessions eliminates all the above replay behavior by preventing the orchestration from unloading its state. This means it never needs to replay nor does it need to reload the entire orchestration history at each new await. I definitely recommend it for both large fan-in/fan-outs and large sequences like in your example.
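For reference, a minimal host.json sketch that enables this setting for the Durable Functions v2.x extension (the idle timeout value is illustrative, not from the answer):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "extendedSessionsEnabled": true,
      "extendedSessionIdleTimeoutInSeconds": 30
    }
  }
}
```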

Upvotes: 2

Delliganesh Sevanesan

Reputation: 4786

Use multiple worker processes:

By default, any host instance for Functions uses a single worker process. To improve performance, use the FUNCTIONS_WORKER_PROCESS_COUNT app setting to increase the number of worker processes per host (up to 10).


Refer to the documentation for more details.
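As a sketch, the app setting can be applied with the Azure CLI; `<app-name>` and `<resource-group>` are placeholders for your own resources:

```shell
# Raise the number of worker processes per host instance (maximum 10).
az functionapp config appsettings set \
  --name <app-name> \
  --resource-group <resource-group> \
  --settings FUNCTIONS_WORKER_PROCESS_COUNT=10
```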

Orchestration delay :

Orchestration instances are started by putting an ExecutionStarted message in one of the task hub's control queues. Under certain conditions, you may observe multi-second delays between when an orchestration is scheduled to run and when it starts running. During this interval, the orchestration instance remains in the Pending state. There are two potential causes of this delay:

Backlogged control queues: If the instance's control queue contains a large number of messages, it may take time before the ExecutionStarted message is received and processed by the runtime. Message backlogs can happen when orchestrations process lots of events concurrently. Events that go into the control queue include orchestration start events, activity completions, durable timers, termination, and external events. If this delay happens under normal circumstances, consider creating a new task hub with a larger number of partitions. Configuring more partitions causes the runtime to create more control queues for load distribution. Partitions correspond 1:1 with control queues, with a maximum of 16 partitions.

By default, the number of partitions is four. If more partitions are needed, update the task hub configuration in host.json with a new partition count. The host will detect this change after it has been restarted.
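A sketch of the corresponding host.json change for the Durable Functions v2.x extension (the hub name and partition count are illustrative; since the partition count of an existing task hub should not be changed, a new task hub name is typically needed):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyNewTaskHub",
      "storageProvider": {
        "partitionCount": 8
      }
    }
  }
}
```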

Back-off polling delays: Another common cause of orchestration delays is the back-off polling behavior for control queues. However, this delay is only expected when an app is scaled out to two or more instances. If there is only one app instance, or if the app instance that starts the orchestration is the same instance that is polling the target control queue, there will be no queue polling delay. Back-off polling delays can be reduced by updating the host.json settings.
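As an illustrative host.json sketch for the v2.x extension, the maximum polling interval of the Azure Storage provider can be lowered from its 30-second default (the 5-second value here is an assumption, not from the answer):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "storageProvider": {
        "maxQueuePollingInterval": "00:00:05"
      }
    }
  }
}
```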

Refer to the documentation on orchestration delays.

Upvotes: 1
