Johannes Egger

Reputation: 4041

Check if URL points to Azure Blob

Given an arbitrary URL, how can I check whether it points to a blob in Azure? If it does point to an Azure blob, I can assume that the blob exists and that I can access it.
I tried the following:

  1. I ran some tests with Exists, but the results aren't very satisfying:

    new CloudBlockBlob(new Uri("http://example.com"))
    ArgumentException: Invalid blob address 'http://example.com/', missing container information
    
    new CloudBlockBlob(new Uri("https://example.com/fdh/3746C9A2-533E-4544-A10B-321A8BC40AEA/sample-file.txt")).Exists()
    false
    
    new CloudBlockBlob(new Uri("http://stackoverflow.com/questions/43109843/ways-to-migrate-documents-pdf-forms-from-ms-sharepoint-to-aem")).Exists()
    StorageException: Blob type of the blob reference doesn't match blob type of the blob.
    

    So basically, constructing the CloudBlockBlob can throw, and Exists can either throw or return false.

    But it just feels wrong to do it like this.

  2. I could check that the URL matches https://<storage-account>.blob.core.windows.net/<container>/*, but I'm not sure whether this is always correct.

If I could guarantee that option 2 works, I would use it, because it needs no HTTP request.
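For illustration, the pattern check from option 2 could be sketched roughly as follows. This is only a heuristic under stated assumptions: it matches the default public-cloud endpoint suffix, so blobs served from custom domains, sovereign clouds, or the local storage emulator would not be recognized, and a match says nothing about whether the blob exists. The names BlobUrlHeuristic and LooksLikeBlobUrl are hypothetical, not part of the Azure Storage SDK.

```csharp
using System;

public static class BlobUrlHeuristic
{
    // Default endpoint suffix of the public Azure cloud (assumption: custom
    // domains, sovereign clouds and the emulator are out of scope).
    private const string BlobHostSuffix = ".blob.core.windows.net";

    // Returns true when the URL has the shape
    // http(s)://<storage-account>.blob.core.windows.net/<container>/<blob>.
    // No HTTP request is made, so existence and access are not verified.
    public static bool LooksLikeBlobUrl(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri))
            return false;

        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
            return false;

        if (!uri.Host.EndsWith(BlobHostSuffix, StringComparison.OrdinalIgnoreCase))
            return false;

        // The account name is whatever precedes the suffix; it must be a single DNS label.
        string account = uri.Host.Substring(0, uri.Host.Length - BlobHostSuffix.Length);
        if (account.Length == 0 || account.Contains("."))
            return false;

        // Require at least a container segment and a blob segment in the path.
        string[] parts = uri.AbsolutePath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        return parts.Length >= 2;
    }
}
```

With this sketch, LooksLikeBlobUrl("http://example.com") fails the host check, while a URL such as https://myaccount.blob.core.windows.net/images/logo.png (a made-up account and path) passes all checks.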

Are there other (better) ways to check if a random URL points to a blob in Azure (maybe something out of the (Azure Storage SDK) box)?

Upvotes: 4

Views: 4072

Answers (1)

Gaurav Mantri

Reputation: 136196

Not really an elegant solution (read, it's a hack basically :D)

One thing you could do is make a HEAD request to the URL. What you would need to do is parse the response headers. There are 3 scenarios that you would need to consider:

  1. URL points to a blob in blob storage and is publicly accessible.
  2. URL points to a blob in blob storage but is not publicly accessible (container is private).
  3. URL does not point to blob storage.

Now, every response from Azure Blob Storage includes a header called x-ms-request-id, which contains a GUID-like value. You can use this to differentiate between a URL that points to blob storage and one that doesn't. Of course, this check will give a wrong answer if some random site decides to include this exact header in its response, and that's why the solution is more of a "hack".

In case #1, you will get status code 200 back. Because the blob is publicly accessible, you will also get additional headers back. One of them is x-ms-blob-type, which can have one of 3 values: BlockBlob, AppendBlob or PageBlob.

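In case #2, you will get status code 404 back, but you will still receive the x-ms-request-id response header.

In case #3, the x-ms-request-id header will be missing from the response (unless the site happens to send it, as noted above).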
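The three scenarios above could be sketched roughly like this with HttpClient. Treat it as a minimal illustration of the header check, not a definitive implementation: the names BlobProbe, BlobProbeResult, Classify and ProbeAsync are hypothetical, and the sketch assumes that any response carrying x-ms-request-id without a 200 status falls under scenario 2, so it cannot tell a private blob from a blob that simply does not exist.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public enum BlobProbeResult
{
    PublicBlob,               // scenario 1: 200 plus x-ms-blob-type
    BlobStorageNotReadable,   // scenario 2: x-ms-request-id present, blob not readable
    NotBlobStorage            // scenario 3: no x-ms-request-id header
}

public static class BlobProbe
{
    // Pure classification step, kept apart from the HTTP call.
    public static BlobProbeResult Classify(HttpStatusCode status, bool hasRequestId, string blobType)
    {
        if (!hasRequestId)
            return BlobProbeResult.NotBlobStorage;
        if (status == HttpStatusCode.OK && !string.IsNullOrEmpty(blobType))
            return BlobProbeResult.PublicBlob;
        return BlobProbeResult.BlobStorageNotReadable;
    }

    // Sends a HEAD request and classifies the response headers.
    // Network failures surface as exceptions; the caller decides how to treat them.
    public static async Task<BlobProbeResult> ProbeAsync(HttpClient client, Uri url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            bool hasRequestId = response.Headers.Contains("x-ms-request-id");
            string blobType = response.Headers.TryGetValues("x-ms-blob-type", out var values)
                ? values.FirstOrDefault()
                : null;
            return Classify(response.StatusCode, hasRequestId, blobType);
        }
    }
}
```

Keeping Classify separate from ProbeAsync means the decision logic can be checked without making any request. As with the approach itself, a site that happens to send x-ms-request-id will be misclassified.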

Upvotes: 2
