Reputation: 9368
Scenario:
An Azure Function is hosted on an App Service plan and scaled out to 5 instances. The function is triggered by a blob.
Question:
Is there any documentation that explains the mechanism that prevents a scaled-out Azure Function from processing the same blob multiple times? I am asking because more than one instance of the function is running.
Upvotes: 3
Views: 899
Reputation: 17800
I agree with @Peter. Here is my understanding for reference; correct me if anything doesn't make sense.
Information about the blob trigger mechanism is stored in the Azure storage account of the Function app (defined by the app setting AzureWebJobsStorage). Locks are stored in a blob container named azure-webjobs-hosts, and there is a queue named azure-webjobs-blobtrigger-<FunctionAppName> for internal use.
Another part of the same comment says:
Normally only 1 of N host instances is scanning for new blobs (based on a singleton host id lock). When it finds a new blob it adds a queue message for it and one of the N hosts processes it.
So the first step, scanning for new blobs, does not involve scale-out. The singleton host ID lock is implemented with a blob lease, as @Peter mentioned (check the blob locks/<FunctionAppName>/host in azure-webjobs-hosts).
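To illustrate the lease idea, here is a minimal in-memory sketch. It is not the Functions host code and does not call any Azure API; LeaseBlob, try_acquire and elect_scanner are hypothetical names I made up, and the 15-second duration is an arbitrary choice. It only shows why, out of N instances competing for the same lock blob, just one ends up as the scanner until the lease expires.

```python
class LeaseBlob:
    """Toy stand-in for a lock blob with an exclusive, expiring lease (hypothetical)."""

    def __init__(self):
        self.owner = None        # instance currently holding the lease
        self.expires_at = 0.0    # time at which the lease lapses

    def try_acquire(self, instance_id, duration, now):
        # The lease can be taken if nobody holds it, it has expired,
        # or the caller already owns it (which acts as a renewal).
        if self.owner is None or now >= self.expires_at or self.owner == instance_id:
            self.owner = instance_id
            self.expires_at = now + duration
            return True
        return False


def elect_scanner(lock, instance_ids, now, duration=15.0):
    """Let every instance try the same lock; return the ones that got the lease."""
    return [i for i in instance_ids if lock.try_acquire(i, duration, now)]
```

With five instances calling elect_scanner against one shared LeaseBlob, only the first caller gets the lease; the others can only take over after it expires without being renewed.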
Once the internal queue starts receiving messages for new blobs, scale-out comes into play: host instances fetch and process messages in parallel. While a message is being processed, it is invisible to other instances, and it is deleted afterwards.
In addition, blob receipts are a further mechanism that ensures a blob that has already been processed does not trigger the function again later (e.g. in the next scan).
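The whole flow described above (one scanner, an internal queue, and receipts) can be sketched as a toy model. This is only my simplified reading of the mechanism, not the actual host implementation: BlobTriggerModel and its methods are hypothetical, everything lives in memory, and recording the receipt at enqueue time is a simplifying assumption.

```python
from collections import deque


class BlobTriggerModel:
    """Toy model of scan -> queue -> process with receipts (hypothetical)."""

    def __init__(self):
        self.receipts = set()    # blobs that must not trigger again
        self.queue = deque()     # stands in for the internal blob trigger queue
        self.processed = []      # (instance_id, blob_name) pairs, for inspection

    def scan(self, blob_names):
        """Run only by the instance holding the scanner lock."""
        for name in blob_names:
            if name not in self.receipts:
                # Assumption: the receipt is written when the blob is enqueued.
                self.receipts.add(name)
                self.queue.append(name)

    def process_next(self, instance_id):
        """Run by any instance; a dequeued message is not visible to the others."""
        if not self.queue:
            return None
        name = self.queue.popleft()
        self.processed.append((instance_id, name))
        return name
```

Scanning the same container twice only enqueues blobs that have no receipt yet, and each queued message is taken by exactly one instance, so in this model every blob is processed once no matter how many instances are pulling from the queue.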
Upvotes: 3
Reputation: 29840
As far as I can tell, blob leases are used.
This is backed by this comment from a Microsoft engineer working on the Azure Functions team:
The singleton mechanism used under the covers to ensure only one host processes a blob is based on the HostId. In regular scale out scenarios, the HostId is the same for all instances, so they collaborate via blob leases behind the scenes using the same lock blob scoped to the host id.
Upvotes: 3