DevonS

Reputation: 63

Verify an image exists at a URL when HEAD is not allowed

Using HttpClient in C#, I'm trying to verify that an image exists at a given URL without downloading the actual image. These images can come from any publicly accessible address. I've been using the HEAD HTTP verb, which works for most of them, but Google Drive images are proving difficult.

Take a public share link such as this one: https://drive.google.com/file/d/1oCmOEJp0vk73uYhzDTr2QJeKZOkyIm6v/view?usp=sharing I can use HEAD on it and get a 200 OK, so it appears to be fine. But that's not the image. It's a page from which one can download the image.

With a bit of mucking around, you can change the URL to this to actually get at the image, which is what I really want to check: https://drive.google.com/uc?export=download&id=1oCmOEJp0vk73uYhzDTr2QJeKZOkyIm6v

But hitting that URL with HEAD results in a 405 MethodNotAllowed. Luckily, if the URL truly doesn't exist, you get back a 404 NotFound.

So I'm stuck at the 405. What is my next step (NOT using Google APIs) when HEAD is not allowed? I can't assume it's a valid image simply because it doesn't return a 404. I also check the Content-Type to verify it's an image, which has issues outside the scope of this question.

Upvotes: 2

Views: 714

Answers (2)

Peter Csala

Reputation: 22829

HttpClient allows you to issue an HTTP request and specify that you are interested only in the headers.

The trick is to pass an HttpCompletionOption enum value to the SendAsync or any other {HttpVerb}Async method:

| Enum name | Value | Description |
|---|---|---|
| ResponseContentRead | 0 | The operation should complete after reading the entire response including the content. |
| ResponseHeadersRead | 1 | The operation should complete as soon as a response is available and headers are read. The content is not read yet. |

await client.GetAsync(targetUrlWhichDoesNotSupportHead, HttpCompletionOption.ResponseHeadersRead);
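
As a rough illustration of the call above, here is a minimal sketch that wraps it in a helper returning only the status code and the Content-Type media type. The names `HeaderProbe` and `ProbeAsync` are made up for this example and are not part of HttpClient. It assumes the caller owns a long-lived `HttpClient` instance.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class HeaderProbe
{
    // Hypothetical helper: issues a GET but returns as soon as the response
    // headers are available, so the body is never buffered into memory.
    public static async Task<(HttpStatusCode Status, string MediaType)> ProbeAsync(
        HttpClient client, Uri uri)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, uri))
        using (var response = await client.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead))
        {
            // ContentType is null when the server sends no Content-Type header.
            var contentType = response.Content.Headers.ContentType;
            string mediaType = contentType == null ? null : contentType.MediaType;

            // Leaving the using block disposes the response without
            // reading the content stream.
            return (response.StatusCode, mediaType);
        }
    }
}
```

Disposing the response matters here: with ResponseHeadersRead the connection stays tied to the unread body until the response is disposed or the content is consumed.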

Here is an in-depth article that details how this enum changes the behavior and performance of HttpClient.

The related source code fragments:

Upvotes: 3

DevonS

Reputation: 63

Brilliant, Peter! Thank you.

Here's my full method for anyone who may find it useful:

public async Task<bool> ImageExists(string urlOrPath)
{
    try
    {
        var uri = new Uri(urlOrPath);
        if (uri.IsFile)
        {
            // LocalPath also works when the input is a file:// URI rather than a plain path.
            if (File.Exists(uri.LocalPath)) return true;
            _logger.LogError($"Cannot find image: [{urlOrPath}]");
            return false;
        }

        using (var result = await Get(uri))
        {
            if (result.StatusCode == HttpStatusCode.NotFound)
            {
                _logger.LogError($"Cannot find image: [{urlOrPath}]");
                return false;
            }
            if ((int)result.StatusCode >= 400)
            {
                _logger.LogError($"Error: {result.ReasonPhrase}. Image: [{urlOrPath}]");
                return false;
            }
            if (result.Content.Headers.ContentType == null)
            {
                _logger.LogError($"No 'ContentType' header returned.  Cannot validate image:[{urlOrPath}]");
                return false;
            }
            if(new[] { "image", "binary"}.All(v => !result.Content.Headers.ContentType.MediaType.SafeTrim().Contains(v)))
            {
                _logger.LogError($"'ContentType' {result.Content.Headers.ContentType.MediaType} is not an image. The Url may point to an HTML download page instead of an actual image:[{urlOrPath}]");
                return false;
            }
            var validTypes = new[] { "jpg", "jpeg", "gif", "png", "bmp", "binary" }; 
            if(validTypes.All(v => !result.Content.Headers.ContentType.MediaType.SafeTrim().Contains(v)))
            {
                _logger.LogError($"'ContentType' {result.Content.Headers.ContentType.MediaType} is not a valid image. Only [{string.Join(", ", validTypes)}] accepted. Image:[{urlOrPath}]");
                return false;
            }

            return true;
        }
    }
    catch (Exception e)
    {
        _logger.LogError($"There was a problem checking the image: [{urlOrPath}] is not valid. Error: {e.Message}");
        return false;
    }
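}

private async Task<HttpResponseMessage> Get(Uri uri)
{
    // Try HEAD first; most servers answer it without sending a body.
    var response = await _httpCli.SendAsync(new HttpRequestMessage(HttpMethod.Head, uri));
    if (response.StatusCode != HttpStatusCode.MethodNotAllowed) return response;

    // HEAD was rejected (405): release that response, then fall back to a GET
    // that completes as soon as the headers have been read.
    response.Dispose();
    return await _httpCli.SendAsync(new HttpRequestMessage(HttpMethod.Get, uri), HttpCompletionOption.ResponseHeadersRead);
}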

Edit: I added a Get() method that still tries HEAD first and only falls back to ResponseHeadersRead when it encounters MethodNotAllowed. In a live scenario I found this much quicker, though I'm not sure why. YMMV.
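
Note that SafeTrim(), used in ImageExists above, is not a built-in string method and its definition isn't shown in this post. To compile the code as-is, something like the following would be needed. This is only a guess at its behavior (a null-safe Trim), and the class name `StringExtensions` is an assumption for the example.

```csharp
public static class StringExtensions
{
    // Hypothetical stand-in for SafeTrim(): trims surrounding whitespace and
    // maps null to an empty string so a following Contains() cannot throw.
    public static string SafeTrim(this string value)
    {
        return value == null ? string.Empty : value.Trim();
    }
}
```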

Upvotes: 1
