mr.mindspace

Reputation: 109

ListBlobs does not list Deleted blobs

I am trying to list all deleted blobs from an Azure Storage Account. Here is my code:

using System.Linq; // needed for ToList()
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;

var blobClient = new CloudStorageAccount(new StorageCredentials("accountname", "accountkey"), true).CreateCloudBlobClient();
var container = blobClient.GetContainerReference("container");
var blobs = container.ListBlobs(useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Deleted).ToList();

However, ListBlobs returns only the non-deleted blobs in the container. In the Azure Portal I can clearly see that the container also holds many deleted blobs, but none of them are retrieved.

How can I list only the blobs in deleted state in my container?

Edit:

I created a new container with two blobs: test_deleted (which I deleted in the Azure Portal) and test_not_deleted. Using the newer Azure.Storage.Blobs package, I now have the following code:

var client = new BlobServiceClient(new Uri($"https://{StorageAccountName}.blob.core.windows.net"), new StorageSharedKeyCredential(StorageAccountName, StorageAccountKey));
var container = client.GetBlobContainerClient("test");
var resultSegment = container.GetBlobsAsync(states: BlobStates.Deleted, traits: BlobTraits.All).AsPages(default, 5000);

var results = new List<BlobItem>();

await foreach (Azure.Page<BlobItem> blobPage in resultSegment)
{
    foreach (BlobItem blobItem in blobPage.Values)
    {
        results.Add(blobItem);
    }
}

The result contains only the non-deleted blob.

[Image: Azure Portal screenshot]

[Image: Results view]

Upvotes: 1

Views: 1580

Answers (3)

Radek Myška

Reputation: 1

I tried to use BlobStates to list uncommitted blobs.

But GetBlobsByHierarchyAsync with BlobStates.Uncommitted returns the same result as with BlobStates.None.

My code:

var blobHierarchyItems = blobContainerClient
    .GetBlobsByHierarchyAsync(BlobTraits.None, blobStates);
Console.WriteLine("BlobStates " + blobStates);
// GetBlobsByHierarchyAsync returns an AsyncPageable, so enumerate it with await foreach.
await foreach (var blobHierarchyItem in blobHierarchyItems)
{
    Console.WriteLine("Blob.Name " + blobHierarchyItem.Blob.Name);
    Console.WriteLine("Blob.Properties.LastModified " + blobHierarchyItem.Blob.Properties.LastModified);
}

Result:

BlobStates Uncommitted

Blob.Name testing_6.bin Blob.Properties.LastModified 25.03.2022 6:30:04 +00:00

Blob.Name testing_7.bin Blob.Properties.LastModified 25.03.2022 6:30:25 +00:00

Blob.Name testing_8.bin Blob.Properties.LastModified 25.03.2022 13:47:46 +00:00

BlobStates None

Blob.Name testing_6.bin Blob.Properties.LastModified 25.03.2022 6:30:04 +00:00

Blob.Name testing_7.bin Blob.Properties.LastModified 25.03.2022 6:30:25 +00:00

Blob.Name testing_8.bin Blob.Properties.LastModified 25.03.2022 13:47:46 +00:00

Am I doing something wrong? Thanks

Upvotes: 0

mr.mindspace

Reputation: 109

After a lot of headaches and some help from this answer, I have figured out how to retrieve the deleted blobs. For some reason, if both versioning and soft-delete are enabled, the deleted blobs you retrieve do not have the Deleted property set to true. Instead, their VersionId property is null.

It seems that when BlobStates.DeletedWithVersions is used, all blobs are retrieved, but for deleted blobs, VersionId will be null. Here is the seemingly working code which retrieves all the blobs marked deleted:

var client = new BlobServiceClient(new Uri($"https://{StorageAccountName}.blob.core.windows.net"), new StorageSharedKeyCredential(StorageAccountName, StorageAccountKey));
var container = client.GetBlobContainerClient("test");
var resultSegment = container.GetBlobsAsync(states: BlobStates.DeletedWithVersions, traits: BlobTraits.All).AsPages(default, 5000);

var deletedBlobs = new List<BlobItem>();

await foreach (Azure.Page<BlobItem> blobPage in resultSegment)
{
    foreach (BlobItem blobItem in blobPage.Values)
    {
        if (blobItem.VersionId == null)
        {
            deletedBlobs.Add(blobItem);
        }
    }
}

In my case, I needed to know if a blob was deleted on a certain day. When a blob is deleted, a new version is created. So to find the deletion day, you need to retrieve everything with the blob's name, using BlobStates.Version, and check the VersionId property of the blobs, which is a date string. This will contain the date the version was created (i.e. the blob was deleted).

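// deletedOn is the DateTime to look for; it is assumed to be defined by the caller.
foreach (var deletedBlob in deletedBlobs)
{
    // The prefix filter also matches other blobs whose names start with deletedBlob.Name.
    var versions = container.GetBlobs(BlobTraits.None, BlobStates.Version, prefix: deletedBlob.Name);

    foreach (var v in versions)
    {
        // This is an exact timestamp comparison; compare .Date on both sides if only the day matters.
        if (deletedOn == DateTime.Parse(v.VersionId))
        {
            Console.WriteLine($"Blob {deletedBlob.Name} deleted on {deletedOn}");
        }
    }
}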
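
As a rough sketch of the date check described above, the helper below parses a version ID and compares only the calendar day. It assumes, as this answer describes, that VersionId is a UTC timestamp string. The class name VersionIdDates and the method IsOnDay are made up for illustration and are not part of the Azure SDK.

```csharp
using System;
using System.Globalization;

public static class VersionIdDates
{
    // Hypothetical helper: returns true when the version ID (assumed to be a
    // UTC timestamp string) falls on the same UTC calendar day as dayUtc.
    public static bool IsOnDay(string versionId, DateTime dayUtc)
    {
        // TryParse avoids an exception for null or non-timestamp values.
        if (!DateTimeOffset.TryParse(versionId, CultureInfo.InvariantCulture,
                DateTimeStyles.AssumeUniversal, out DateTimeOffset created))
        {
            return false;
        }

        // Compare the date parts only, both expressed in UTC.
        return created.UtcDateTime.Date == dayUtc.Date;
    }
}
```

Inside the loop over versions, a call such as VersionIdDates.IsOnDay(v.VersionId, deletedOn) could then replace the exact timestamp comparison when only the day matters.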

Upvotes: 5

Peter Bons

Reputation: 29780

You are using a very old NuGet package; you should upgrade to Azure.Storage.Blobs.

Then use paging to get all the blobs, as shown in the docs:

private static async Task ListBlobsFlatListing(BlobContainerClient blobContainerClient, 
                                               int? segmentSize)
{
    try
    {
        // Call the listing operation and return pages of the specified size.
        var resultSegment = blobContainerClient.GetBlobsAsync()
            .AsPages(default, segmentSize);

        // Enumerate the blobs returned for each page.
        await foreach (Azure.Page<BlobItem> blobPage in resultSegment)
        {
            foreach (BlobItem blobItem in blobPage.Values)
            {
                Console.WriteLine("Blob name: {0}", blobItem.Name);
            }

            Console.WriteLine();
        }
    }
    catch (RequestFailedException e)
    {
        Console.WriteLine(e.Message);
        Console.ReadLine();
        throw;
    }
}

Use the optional states parameter (of type BlobStates) to specify that you want to list deleted blobs.
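
A minimal sketch of what that could look like, assuming the Azure.Storage.Blobs package and an already constructed BlobContainerClient. The class name DeletedBlobLister and the method ListDeletedAsync are hypothetical. Note that, according to another answer in this thread, the Deleted flag may not be set when versioning is enabled alongside soft-delete, in which case this filter may return nothing.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class DeletedBlobLister
{
    // Hypothetical helper: asks the service to include soft-deleted blobs in
    // the listing and keeps only the items flagged as deleted.
    public static async Task<List<BlobItem>> ListDeletedAsync(BlobContainerClient container)
    {
        var deleted = new List<BlobItem>();

        // states: BlobStates.Deleted adds soft-deleted blobs to the normal listing.
        await foreach (BlobItem blob in container.GetBlobsAsync(
            traits: BlobTraits.None, states: BlobStates.Deleted))
        {
            if (blob.Deleted)
            {
                deleted.Add(blob);
            }
        }

        return deleted;
    }
}
```

The listing still contains the live blobs, which is why the sketch filters on the Deleted property instead of returning everything.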

Upvotes: 1
