Dale Myers

Reputation: 2821

Azure Function calls fail with "Function host is not running"

I've got a function which I've been playing with for a couple of months (even having some issues, as can be seen here), and I've started running into problems as I scale up the usage from tests to "production".

The function takes in two values, looks for a matching value in an Azure Table, removes it if it finds it, then adds the new values in together. This works fine in testing. As soon as I scale up from a few calls every second to 20-30 calls a second, it fails with the response mentioned above.

The actual issue, when I dive in using Insights, is that a System.InvalidOperationException exception is thrown. Here's the call stack:

System.InvalidOperationException:
   at Microsoft.Azure.WebJobs.Script.WebHost.SecretManager+<PersistSecretsAsync>d__27`1.MoveNext (Microsoft.Azure.WebJobs.Script.WebHost, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script.WebHost\Security\SecretManager.cs: 440)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Microsoft.Azure.WebJobs.Script.WebHost.SecretManager+<GetHostSecretsAsync>d__12.MoveNext (Microsoft.Azure.WebJobs.Script.WebHost, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script.WebHost\Security\SecretManager.cs: 104)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Microsoft.Azure.WebJobs.Script.WebHost.WebJobsSdkExtensionHookProvider+<GetOrCreateExtensionKey>d__6.MoveNext (Microsoft.Azure.WebJobs.Script.WebHost, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script.WebHost\WebHooks\WebJobsSdkExtensionHookProvider.cs: 71)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Microsoft.Azure.WebJobs.Script.WebHost.WebJobsSdkExtensionHookProvider.GetExtensionWebHookRoute (Microsoft.Azure.WebJobs.Script.WebHost, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script.WebHost\WebHooks\WebJobsSdkExtensionHookProvider.cs: 64)
   at Microsoft.Azure.WebJobs.Script.WebHost.WebJobsSdkExtensionHookProvider.GetUrl (Microsoft.Azure.WebJobs.Script.WebHost, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script.WebHost\WebHooks\WebJobsSdkExtensionHookProvider.cs: 49)
   at Microsoft.Azure.WebJobs.Host.Config.ExtensionConfigContext.GetWebhookHandler (Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
   at Microsoft.Azure.WebJobs.Extensions.EventGrid.EventGridExtensionConfig.Initialize (Microsoft.Azure.WebJobs.Extensions.EventGrid, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null)
   at Microsoft.Azure.WebJobs.Host.Executors.JobHostConfigurationExtensions.InvokeExtensionConfigProviders (Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
   at Microsoft.Azure.WebJobs.Host.Executors.JobHostConfigurationExtensions.CreateStaticServices (Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
   at Microsoft.Azure.WebJobs.JobHost.InitializeServices (Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35)
   at Microsoft.Azure.WebJobs.Script.Utility.CreateMetadataProvider (Microsoft.Azure.WebJobs.Script, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Utility.cs: 362)
   at Microsoft.Azure.WebJobs.Script.ScriptHost.LoadBindingExtensions (Microsoft.Azure.WebJobs.Script, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Host\ScriptHost.cs: 966)
   at Microsoft.Azure.WebJobs.Script.ScriptHost.Initialize (Microsoft.Azure.WebJobs.Script, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Host\ScriptHost.cs: 299)
   at Microsoft.Azure.WebJobs.Script.ScriptHostManager.RunAndBlock (Microsoft.Azure.WebJobs.Script, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Host\ScriptHostManager.cs: 178)

The message with it is:

Repository has more than 10 non-decryptable secrets backups (host). 

Unfortunately, I have no idea what that means. Searching for this turns up only a few threads that talk about regenerating the keys, but again, I don't really know what that means. Some threads mention moving back to V1 of the functions, but I'm already on V1, so that's not an option.

What is going on with this function and how do I fix it?

For any Azure employees looking at this, my function ID is:

2019-01-18T15:52:18.658 [Info] Function started (Id=fc6850e8-7554-46d8-81ec-4d1697c7b572)

Upvotes: 5

Views: 8797

Answers (6)

sschmeck

Reputation: 7685

As mentioned by Divya, you should delete the snapshot files in the associated Storage Account; see AZFD0007: Repository has more than 10 nondecryptable secrets backups.

Whenever the Functions host is unable to decrypt this repository file, it regenerates the repository file and creates a backup of the unreadable file with a name like host.snapshot..json. [..] When 10 unreadable secrets repository backup files exist, the secrets repository can't be regenerated and your function app might not start or run correctly. [..] To resolve this error, delete one or more of the repository backups (host.snapshot..json) from the azure-webjobs-secrets<FUNCTION_APP_NAME> container in the storage account used by your function app.

Therefore, you need to check the azure-webjobs-secrets container and delete the snapshot files.

# Inspect the secret files
az storage blob list \
  --account-name <function-app-storage-account> \
  --container-name azure-webjobs-secrets \
  --query '[].name' \
  --output tsv

# Delete the snapshot files
az storage blob delete-batch \
  --account-name <function-app-storage-account> \
  --source azure-webjobs-secrets \
  --pattern '*.snapshot.*.json'
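
Before deleting anything, it may be worth previewing which blobs the pattern would match. The following is a minimal sketch that wraps the same command with the `--dryrun` flag of `az storage blob delete-batch`. The function name `preview_snapshot_delete` and the `AZ` override variable are hypothetical, introduced here only so the command line can be inspected without touching a real storage account.

```shell
# Preview which snapshot blobs would be deleted, without deleting anything.
# $1: name of the storage account used by the function app.
# AZ is a hypothetical override; it defaults to the real az CLI.
preview_snapshot_delete() {
  "${AZ:-az}" storage blob delete-batch \
    --account-name "$1" \
    --source azure-webjobs-secrets \
    --pattern '*.snapshot.*.json' \
    --dryrun
}
```

Calling `preview_snapshot_delete <function-app-storage-account>` should only report the matching blobs; once the list looks right, the delete command above can be run as written.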

Upvotes: 1

Rakhshanda Khan

Reputation: 21

This exception has one or both of the following causes:

  • If you delete the app and recreate it with the same name, it will use the same file share as the old app. Because the function app needs the keys, it will create a backup inside this folder as a snapshot.
  • If you have multiple function apps that use the same storage account, this can also trigger it.

The platform keeps at most 10 snapshots/backups. After that, it starts throwing this error.

To solve this issue, take a backup of this folder and then delete the “.snapshot..json” files, or delete the secrets folder itself so that the secret files are regenerated.

You can also delete the backup files from the storage account: navigate to the container and delete the existing snapshot .json files.

If the runtime can't decrypt the secrets, they are regenerated and the non-decryptable secrets are stored in “.snapshot..json” files. This exception starts firing once the number of snapshots exceeds 10.
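
To see how close each app is to that limit, especially when several function apps share one storage account, the snapshot blobs can be counted per app. This is a rough sketch, assuming blob names of the form `<app>/host.snapshot.<something>.json` inside the secrets container; the function name `snapshots_per_app` is hypothetical.

```shell
# Read blob names (one per line) on stdin and print "<app> <count>" for
# every app folder that contains snapshot backups.
# Assumption: names look like "<app>/host.snapshot.<something>.json".
snapshots_per_app() {
  awk -F/ '/\.snapshot\..*\.json$/ { n[$1]++ }
           END { for (a in n) print a, n[a] }' | sort
}
```

It can be fed from any listing of blob names, for example the tab-separated output of `az storage blob list --query '[].name' --output tsv` for the `azure-webjobs-secrets` container. Any app whose count has reached 10 is a candidate for cleanup.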

You may have to kill the process and restart the Azure function after deleting the secret backup.
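
If the restart is done from the command line rather than the portal, `az functionapp restart` is one option. A minimal sketch follows; the function name `restart_function_app` and the `AZ` override variable are hypothetical, and the app and resource group names are placeholders you supply.

```shell
# Restart a function app after the secret backups have been removed.
# $1: function app name, $2: resource group name.
# AZ is a hypothetical override; it defaults to the real az CLI.
restart_function_app() {
  "${AZ:-az}" functionapp restart --name "$1" --resource-group "$2"
}
```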

Upvotes: 0

Divya

Reputation: 436

A small addition to the solutions mentioned above. I was facing the same issue, and even after deleting the secret files from the D:\home\data\Functions\secrets folder on the Kudu site, I was still getting the same error.

I was able to fix this by deleting the host.*.snapshot.*.json files from the azure-webjobs-secrets folder in my blob storage on Azure. There were 10 such snapshot files.

Upvotes: 3

klenium

Reputation: 2607

If you have a custom Startup, there might be an error that prevents the host from starting.

In the portal, go to the function's page (the one where you can see the content of function.json); there will likely be an error message. From this, you can get an idea of where to look for the error. In my case, an exception was thrown when I tried to connect to Azure Key Vault from the startup class, because the List permission had not been granted to the function.

Upvotes: 1

Farrukh Normuradov

Reputation: 1184

Deleting D:\home\data\Functions\secrets solved the issue.

In general, I learned that whatever strange behavior you get from an Azure Function, Kudu is always your best tool for investigation.

Upvotes: 3

Thuc Nguyen

Reputation: 1661

The message indicates that this has something to do with the host-level keys (secrets) in your Function.

I don't have a clear fix for this (I have never experienced this issue), but I would suggest that you check the host.json in the D:\home\data\Functions\secrets folder and see if anything looks unusual there, e.g. more than 10 keys, as the error message indicates.

Upvotes: 1
