아아니으

Reputation: 57

HttpClient with ResponseHeadersRead fails (times out) on the 2nd GetAsync call unless Fiddler (HTTP/HTTPS debugger) is running

I'm trying to get the status code of some pages.

The problem is that the default GetAsync method returns the whole page with its content, while I only need the headers to check the page's status (404, 403, etc.). Since I have to check tons of URIs, this ends up hogging memory.

I added the ResponseHeadersRead option to solve the memory issue, but then the code started throwing an "A task was canceled" exception, which means a timeout.

Things that I know:

  1. The ResponseHeadersRead code ONLY works when I run Fiddler (HTTP/HTTPS debugger) on my local PC.

  2. The ResponseHeadersRead code works in online coding environments such as dotnetfiddle, but it doesn't work on Windows.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Net;
using System.Security.Cryptography;


public class Program
{
    public static string[] Tags = { "first", "second" };
    public static string prefix = null;
    static HttpClient Client = new HttpClient();
    public static void Main()
    {
        System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
        Client.DefaultRequestHeaders.ConnectionClose = true;

        // limit parallel thread
        Parallel.ForEach(Tags,
        new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) },
        tag =>
        {
            for (int i = 1; i < 4; i++)
            {
                switch (i)
                {
                    case 1:
                        prefix = "1";
                        break;
                    case 2:
                        prefix = "2";
                        break;
                    case 3:
                        prefix = "3";
                        break;
                }
                Console.WriteLine(tag.ToString() + " and " + i);
                HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix).Result; // this works
//                HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix,HttpCompletionOption.ResponseHeadersRead).Result; // this fails from 2nd try with one url.
                Console.WriteLine(i + " and " + (int)response.StatusCode);
                if (response.StatusCode != HttpStatusCode.NotFound)
                {

                }

            }
        });

    }
}

With ResponseHeadersRead the request times out; without it, it doesn't.


Upvotes: 1

Views: 7934

Answers (2)

JohanP

Reputation: 5472

Don't use Parallel for async code; it is intended for CPU-bound work. You can run all the requests concurrently without wasting threads that block on them. Increasing DefaultConnectionLimit would make the problem go away in this case, but it is not the right way to solve it. The correct way to deal with ResponseHeadersRead is to either dispose of the response, i.e.

using(HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead).Result) {}

or to read the Content of the response.

var data = response.Content.ReadAsStringAsync().Result;

With ResponseHeadersRead, you need to do one of these so that the connection is released. I would encourage you to rewrite this code to get rid of Parallel and to avoid calling .Result on your async calls.

You can do something like this:

private static async Task Go()
{
    System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
    Client.DefaultRequestHeaders.ConnectionClose = true;

    var tasks = Tags.Select(tag =>
    {
        var requests = new List<Task>();
        for (int i = 1; i < 4; i++)
        {
            switch (i)
            {
                case 1:
                    prefix = "1";
                    break;
                case 2:
                    prefix = "2";
                    break;
                case 3:
                    prefix = "3";
                    break;
            }

            requests.Add(MakeRequest(Client, prefix, tag));
        }
        return requests;
    }).SelectMany(t => t);

    await Task.WhenAll(tasks);
}

private async static Task MakeRequest(HttpClient client, string prefix, string tag)
{

    using (var response = await client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead))
    {
        Console.WriteLine(tag + " and " + prefix);
        Console.WriteLine(prefix + " and " + (int)response.StatusCode);
    }
}
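
The question capped the degree of parallelism, and the code above starts every request at once. If a cap is still wanted in the async version, one possible sketch uses a SemaphoreSlim as a gate. The StatusChecker class, the CheckAllAsync name, and the maxConcurrency parameter are made up for illustration; they are not part of the original code, and error handling (GetAsync can throw on network failures or timeouts) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StatusChecker
{
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: fetches only the headers of each URL and returns
    // (url, status code) pairs, with at most maxConcurrency requests in flight.
    public static async Task<KeyValuePair<string, int>[]> CheckAllAsync(
        IEnumerable<string> urls, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = urls.Select(async url =>
            {
                // Wait asynchronously for a free slot instead of blocking a thread.
                await gate.WaitAsync();
                try
                {
                    // Disposing the response releases the connection even though
                    // the body was never read.
                    using (var response = await Client.GetAsync(
                        url, HttpCompletionOption.ResponseHeadersRead))
                    {
                        return new KeyValuePair<string, int>(url, (int)response.StatusCode);
                    }
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // All tasks finish before the semaphore is disposed.
            return await Task.WhenAll(tasks);
        }
    }
}
```

It could be called as `var results = await StatusChecker.CheckAllAsync(urls, 4);`, where 4 is an arbitrary limit chosen for the example.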

Upvotes: 3

David Browne - Microsoft

Reputation: 88996

When you set ResponseHeadersRead, you are instructing the HTTP client to read only the HTTP headers from each response, so the TCP/IP connection on which the request was made stays in the middle of the response until you read the response body or dispose of the response.

There is also a limit to how many connections HttpClient will open to any particular web site. On .NET Framework this defaults to 2. So you open two connections and then try to open a third, which blocks waiting for an available connection.

You can simply increase the connection limit for your application.

For example:

ServicePointManager.DefaultConnectionLimit = 10;
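
Note that DefaultConnectionLimit only affects ServicePoint objects created after it is changed, so it should be set before the first request goes out. As an alternative sketch, the limit can also be set per client through HttpClientHandler.MaxConnectionsPerServer, assuming a runtime that has that property (.NET Core, or .NET Framework 4.7.1 and later). The ClientFactory class and its method names are made up for illustration, and the value 10 is as arbitrary as above.

```csharp
using System;
using System.Net.Http;

public static class ClientFactory
{
    // Hypothetical helper: builds a handler that allows up to maxConnections
    // simultaneous connections to a single server.
    public static HttpClientHandler CreateHandler(int maxConnections)
    {
        return new HttpClientHandler { MaxConnectionsPerServer = maxConnections };
    }

    // Hypothetical helper: the returned client owns the handler and disposes
    // it when the client itself is disposed.
    public static HttpClient CreateClient(int maxConnections)
    {
        return new HttpClient(CreateHandler(maxConnections));
    }
}
```

Raising the limit only delays the stall if responses are never read or disposed, so it works best together with disposing each response.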

Upvotes: 2
