Reputation: 491
I am new to Azure Durable Functions and want to verify my understanding of what it means to be deterministic.
The flow is something like this:
My current approach is along the lines of the code below. However, the more I think about it, the less deterministic it seems. Entries flagged in table A by the first activity might no longer be flagged if the activity runs again some time later (e.g. an entry no longer meets the criteria to be flagged). The list returned by the second activity could then also differ, because the data in table A has changed.
Would it be sufficient to change my first activity to return the IDs of the entries in table A that are flagged, and use that list as input for the second activity? To me it then looks similar to this example:
What I don't really understand in all of this: if you rerun the example in the Microsoft docs, E2_GetFileList could potentially return different files, because new files might have been added or existing ones removed. So how is that deterministic?
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask;
using Microsoft.DurableTask.Client;

public class DeterministicOrchestrator
{
    // HTTP entry point: schedules a new orchestration instance.
    [Function("DeterministicOrchestratorApi")]
    public async Task<HttpResponseData> RunApi(
        [HttpTrigger] HttpRequestData request,
        [DurableClient] DurableTaskClient durableTaskClient)
    {
        var referenceDate = new DateOnly(2023, 4, 3);
        var orchestrationInstanceId = await durableTaskClient
            .ScheduleNewOrchestrationInstanceAsync("DeterministicOrchestrator", referenceDate)
            .ConfigureAwait(false);
        return durableTaskClient.CreateCheckStatusResponse(request, orchestrationInstanceId);
    }

    [Function("DeterministicOrchestrator")]
    public async Task Run(
        [OrchestrationTrigger] TaskOrchestrationContext taskOrchestrationContext,
        DateOnly referenceDate)
    {
        var wrappedDateOnly = new WrappedDateOnly { DateOnly = referenceDate };

        await taskOrchestrationContext
            .CallActivityAsync("FlagDatabaseEntries", wrappedDateOnly)
            .ConfigureAwait(true);

        var events = await taskOrchestrationContext
            .CallActivityAsync<string[]>("CreateEvents", wrappedDateOnly)
            .ConfigureAwait(true);

        // Fan-out/fan-in
        var eventTasks = events
            .Select(x => taskOrchestrationContext.CallActivityAsync("ProcessEvent", input: x))
            .ToList();
        await Task.WhenAll(eventTasks).ConfigureAwait(true);

        await taskOrchestrationContext
            .CallActivityAsync("Export", wrappedDateOnly)
            .ConfigureAwait(true);
    }

    [Function("FlagDatabaseEntries")]
    public Task FlagDatabaseEntries([ActivityTrigger] WrappedDateOnly referenceDate)
    {
        // Flags entries in database table A to be processed using given referenceDate.
        return Task.CompletedTask;
    }

    // Parameter type matches the WrappedDateOnly the orchestrator passes in.
    [Function("CreateEvents")]
    public Task<string[]> CreateEvents([ActivityTrigger] WrappedDateOnly referenceDate)
    {
        // Creates events based on the entries flagged in the database by previous activity.
        return Task.FromResult(Array.Empty<string>());
    }

    [Function("ProcessEvent")]
    public Task ProcessEvent([ActivityTrigger] string eventToProcess)
    {
        // Process event and some of the events result in data being added to database table B.
        return Task.CompletedTask;
    }

    // Parameter type matches the WrappedDateOnly the orchestrator passes in.
    [Function("Export")]
    public Task Export([ActivityTrigger] WrappedDateOnly referenceDate)
    {
        // Export data from the database populated by processing the events.
        return Task.CompletedTask;
    }
}

public class WrappedDateOnly
{
    public DateOnly DateOnly { get; set; }
}
Upvotes: 2
Views: 957
Reputation: 58743
The "deterministic" requirement for Durable Functions orchestrators exists only because the Durable Task Framework executes the orchestrator code several times (it replays it) as results come in from activities. Given the same orchestrator input and the same recorded activity outputs (plus events etc.), the orchestrator code should always go through the same steps. Your orchestrator looks deterministic in this sense.
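To make the distinction concrete, here is a minimal sketch of what would and would not be replay-safe inside an orchestrator. The orchestrator name and the "SendReminder" activity are made up for illustration and are not part of your code; `CurrentUtcDateTime` and `NewGuid()` are the replay-safe members on `TaskOrchestrationContext`.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.DurableTask;

public static class ReplaySafeOrchestrator
{
    // Hypothetical orchestrator, only here to contrast the two styles.
    [Function("ReplaySafeOrchestrator")]
    public static async Task Run([OrchestrationTrigger] TaskOrchestrationContext context)
    {
        // Not replay-safe: each replay would produce different values, so the
        // code could take a different path than the one recorded in the history.
        // var now = DateTime.UtcNow;
        // var id = Guid.NewGuid();

        // Replay-safe: these yield the same values on every replay.
        DateTime now = context.CurrentUtcDateTime;
        Guid id = context.NewGuid();

        // The branch depends only on replay-safe values, so every replay
        // schedules (or skips) the activity in exactly the same way.
        if (now.DayOfWeek == DayOfWeek.Monday)
        {
            // "SendReminder" is a hypothetical activity name.
            await context.CallActivityAsync("SendReminder", id);
        }
    }
}
```

Anything that talks to the outside world (database, file system, clock, random numbers) belongs in an activity, where it runs once and its output is recorded.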
The issue you are referring to is something to consider, though. I think what you suggest makes sense. Returning the IDs to process is a common pattern that I've used.
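A rough sketch of that change, assuming the flagged rows have integer keys (the `int[]` type and the class name are guesses on my part, so adjust them to your schema): the first activity returns the IDs it flagged, and the second activity works from that list instead of re-querying table A.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.DurableTask;

public class IdPassingOrchestrator
{
    [Function("IdPassingOrchestrator")]
    public async Task Run(
        [OrchestrationTrigger] TaskOrchestrationContext context,
        DateOnly referenceDate)
    {
        var wrapped = new WrappedDateOnly { DateOnly = referenceDate };

        // The IDs become part of the orchestration history, so later steps
        // see the same list even if table A changes in the meantime.
        int[] flaggedIds = await context.CallActivityAsync<int[]>("FlagDatabaseEntries", wrapped);

        string[] events = await context.CallActivityAsync<string[]>("CreateEvents", flaggedIds);

        // Fan-out/fan-in and export would continue as in the question.
    }

    [Function("FlagDatabaseEntries")]
    public Task<int[]> FlagDatabaseEntries([ActivityTrigger] WrappedDateOnly referenceDate)
    {
        // Flag entries in table A and return the IDs of the flagged rows (placeholder).
        return Task.FromResult(Array.Empty<int>());
    }

    [Function("CreateEvents")]
    public Task<string[]> CreateEvents([ActivityTrigger] int[] flaggedIds)
    {
        // Create events only for the IDs handed in by the orchestrator (placeholder).
        return Task.FromResult(Array.Empty<string>());
    }
}

public class WrappedDateOnly
{
    public DateOnly DateOnly { get; set; }
}
```

This does not make the orchestrator "more deterministic" in the replay sense; it makes the data flow between activities explicit, so the second activity no longer depends on table A still being in the state the first activity left it in.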
One thing that might need clarification: activities don't run again during replay. Once an activity has returned its result, that result is read from the orchestration history (Table Storage, with the default Azure Storage provider) instead of the activity being called again.
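This is also why E2_GetFileList in the docs sample is not a problem. As an illustration only, here is a toy model of replay in plain C#. It is not how the Durable Task Framework is implemented, and `ToyContext` and `ToyReplayDemo` are names I made up; it only shows the idea that a recorded result is returned instead of the activity being run a second time.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for an orchestration context (hypothetical, not a real API).
public sealed class ToyContext
{
    private readonly List<string> _history; // recorded activity results, in call order
    private int _position;

    public ToyContext(List<string> history) => _history = history;

    public string CallActivity(Func<string> activity)
    {
        if (_position < _history.Count)
        {
            // Replay: hand back the recorded result, do not run the activity.
            return _history[_position++];
        }

        // First execution: run the activity and record its result.
        var result = activity();
        _history.Add(result);
        _position++;
        return result;
    }
}

public static class ToyReplayDemo
{
    public static (string First, string Second, int Executions) Run()
    {
        var history = new List<string>();
        var executions = 0;

        // An "activity" whose result would differ if it were run again.
        Func<string> getFileList = () => $"listing #{++executions}";

        // First pass: the activity really runs.
        var first = new ToyContext(history).CallActivity(getFileList);

        // Replay with the same history: the recorded result is reused.
        var second = new ToyContext(history).CallActivity(getFileList);

        return (first, second, executions);
    }
}
```

`ToyReplayDemo.Run()` returns the same string for both passes and an execution count of 1, even though the activity itself would have produced something different on a second run.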
Small thing to note: you don't need ConfigureAwait() on any of the calls.
Upvotes: 2