Reputation: 9368
** Problem Background **
The Azure WebJobs SDK has no way of defining a retention policy for logs. That means the execution and dashboard blob storage can grow without limit and cause problems, including slowing down or crashing the Kudu dashboard, which could compromise the stability of the other apps in the App Service plan.
The problem is described in these issues:
https://github.com/Azure/azure-webjobs-sdk/issues/560
https://github.com/Azure/azure-webjobs-sdk/issues/1050
https://github.com/Azure/azure-webjobs-sdk/issues/107
My WebJob functions log extensively and run more than 100,000 times a day, so a huge number of log files has piled up in my storage.
** The workaround I am planning **
I am planning to add a timer-triggered function to my WebJob code that purges log entries older than 30 days.
The WebJobs SDK creates or uses the following blob containers:
1. Storage connection: AzureWebJobsDashboard
1.1. azure-webjobs-dashboard
1.2. azure-jobs-host-archive
1.3. Duplicates with AzureWebJobsStorage
1.3.1. azure-jobs-host-output
1.3.2. azure-webjobs-host
2. Storage connection: AzureWebJobsStorage
2.1. azure-jobs-host-output
2.2. azure-webjobs-host
2.2.1. Heartbeats
2.2.2. Ids
2.2.3. Output-logs
I am thinking of creating a process that deletes every blob older than 30 days from the above containers, but I am concerned that some of the blobs might be required by the running WebJobs.
** Question **
Which of the above blob containers do I need to purge to prevent blobs from piling up, without interfering with running WebJobs?
Upvotes: 3
Views: 1712
Reputation: 11
We've just implemented Storage Lifecycle Management and are testing this policy:
{
  "version": "0.5",
  "rules": [
    {
      "name": "DeleteOldLogs",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "delete": {
              "daysAfterModificationGreaterThan": 30
            }
          }
        },
        "filters": {
          "blobTypes": [
            "blockBlob"
          ],
          "prefixMatch": [
            "azure-webjobs-host/output-logs",
            "azure-webjobs-dashboard/functions/recent",
            "azure-webjobs-dashboard/functions/instances",
            "azure-jobs-host-output",
            "azure-jobs-host-archive"
          ]
        }
      }
    }
  ]
}
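To get a rough idea of which blobs a rule like this would touch before enabling it, the filter can be approximated locally. This is only a sketch of how I understand the rule (each prefix is matched against `container/blob-name`, and age is measured from the last-modified time); `would_delete` is a hypothetical helper, not part of any Azure SDK, and the real service may evaluate age slightly differently.

```python
from datetime import datetime, timedelta, timezone

# Prefixes copied from the policy above; each one starts with the container name.
PREFIXES = [
    "azure-webjobs-host/output-logs",
    "azure-webjobs-dashboard/functions/recent",
    "azure-webjobs-dashboard/functions/instances",
    "azure-jobs-host-output",
    "azure-jobs-host-archive",
]
MAX_AGE_DAYS = 30  # mirrors daysAfterModificationGreaterThan


def would_delete(container, blob_name, last_modified, now):
    """Approximate the rule: prefix match on 'container/blob' plus an age check."""
    full_path = f"{container}/{blob_name}"
    if not any(full_path.startswith(prefix) for prefix in PREFIXES):
        return False
    return now - last_modified > timedelta(days=MAX_AGE_DAYS)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    old = now - timedelta(days=45)
    # An old output log matches a prefix and is older than 30 days.
    print(would_delete("azure-webjobs-host", "output-logs/run1.txt", old, now))
    # The ids directory is not listed in prefixMatch, so it is left alone.
    print(would_delete("azure-webjobs-host", "ids/host-id", old, now))
```

Note that `azure-webjobs-host/ids` and `azure-webjobs-host/heartbeats` are not in the prefix list, so this rule never touches them.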
Upvotes: 1
Reputation: 28387
As far as I know, the AzureWebJobsDashboard connection string points to the storage account used to store logs for the WebJobs dashboard. This connection string is optional.
It creates two containers, 'azure-webjobs-dashboard' and 'azure-jobs-host-archive':
azure-webjobs-dashboard: used by the WebJobs dashboard to store host and execution endpoint (function) details.
azure-jobs-host-archive: used as an archive for execution logs.
So the contents of both containers can be deleted without interfering with running WebJobs.
azure-jobs-host-output is key for troubleshooting WebJobs. This container holds logs created by the WebJob runtime during initialization and termination of every execution. If you don't need these logs, you can delete them.
The azure-webjobs-host container in turn holds three directories:
Heartbeats – Contains 0-byte blobs for every heartbeat check performed on the service. If you don't need them, you can delete the old files.
Ids – Contains a directory with a single blob holding a unique identifier for this service. I don't suggest deleting the files in this directory.
Output-logs – Holds the output of the explicit logs for each run, that is, the logs written by WebJob developers within the execution code. You can delete the old logs.
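If you go with a custom purge job rather than a storage-side policy, the selection step for the azure-webjobs-host container could look roughly like the sketch below. It only decides which blob names to delete. Listing and deleting the blobs with a storage client is left out, and `select_blobs_to_purge` and the directory constants are hypothetical names. I compare names case-insensitively here as an assumption, because the directory names are written with different casing in different places.

```python
from datetime import datetime, timedelta, timezone

# Directories inside azure-webjobs-host that are treated as safe to prune.
PRUNABLE_DIRS = ("heartbeats/", "output-logs/")
# Directory holding the host identifier; never pruned in this sketch.
PROTECTED_DIRS = ("ids/",)


def select_blobs_to_purge(blobs, now, max_age_days=30):
    """Return names of blobs that are old enough and live in a prunable directory.

    `blobs` is an iterable of (name, last_modified) pairs, for example built
    from a container listing.
    """
    cutoff = now - timedelta(days=max_age_days)
    selected = []
    for name, last_modified in blobs:
        lowered = name.lower()
        if lowered.startswith(PROTECTED_DIRS):
            continue  # keep the host id blob regardless of age
        if lowered.startswith(PRUNABLE_DIRS) and last_modified < cutoff:
            selected.append(name)
    return selected


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    listing = [
        ("heartbeats/host-a", now - timedelta(days=40)),
        ("ids/host-id", now - timedelta(days=400)),
        ("output-logs/run1.txt", now - timedelta(days=31)),
        ("output-logs/run2.txt", now - timedelta(days=2)),
    ]
    for blob_name in select_blobs_to_purge(listing, now):
        print("would delete:", blob_name)
```

Anything outside the two prunable directories is skipped by default, so a directory that appears in the container later is not deleted by accident.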
Upvotes: 3