Reputation: 57
I'm trying to get the status code of a number of pages.
The problem is that the default GetAsync method returns the whole page including its content, while I only need the headers to check the page's status (404, 403, etc.). That ends up hogging memory, since I have to check a large number of URIs.
I added the ResponseHeadersRead option to solve the memory issue, but then the code started to throw an "A task was canceled" exception, which means a timeout.
Things that I know:
The ResponseHeadersRead code ONLY works when I run Fiddler (an HTTP/HTTPS debugger) on my local PC.
The ResponseHeadersRead code works in online coding environments such as dotnetfiddle, but doesn't work in a Windows environment.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Net;
using System.Security.Cryptography;

public class Program
{
    public static string[] Tags = { "first", "second" };
    public static string prefix = null;
    static HttpClient Client = new HttpClient();

    public static void Main()
    {
        System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
        Client.DefaultRequestHeaders.ConnectionClose = true;
        // limit parallel thread
        Parallel.ForEach(Tags,
            new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) },
            tag =>
            {
                for (int i = 1; i < 4; i++)
                {
                    switch (i)
                    {
                        case 1:
                            prefix = "1";
                            break;
                        case 2:
                            prefix = "2";
                            break;
                        case 3:
                            prefix = "3";
                            break;
                    }
                    Console.WriteLine(tag.ToString() + " and " + i);
                    HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix).Result; // this works
                    // HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix,HttpCompletionOption.ResponseHeadersRead).Result; // this fails from 2nd try with one url.
                    Console.WriteLine(i + " and " + (int)response.StatusCode);
                    if (response.StatusCode != HttpStatusCode.NotFound)
                    {
                    }
                }
            });
    }
}
With ResponseHeadersRead the requests time out; without it they don't.
Upvotes: 1
Views: 7934
Reputation: 5472
Don't use Parallel for async code; it is intended for CPU-bound work. You can run all the requests concurrently without wasting threads by blocking on them. Increasing DefaultConnectionLimit is not the right way to solve this, although it will make the problem go away in this case. The correct way to deal with ResponseHeadersRead is to either Dispose the response, i.e.
using(HttpResponseMessage response = Client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead).Result) {}
or to read the Content of the response:
var data = response.Content.ReadAsStringAsync().Result;
With ResponseHeadersRead, you need to do one of these so that the connection is freed up. I would encourage you to rewrite this code to get rid of Parallel and to avoid calling .Result on your async calls.
You can do something like this:
private static async Task Go()
{
    System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12 | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls;
    Client.DefaultRequestHeaders.ConnectionClose = true;
    var tasks = Tags.Select(tag =>
    {
        var requests = new List<Task>();
        for (int i = 1; i < 4; i++)
        {
            switch (i)
            {
                case 1:
                    prefix = "1";
                    break;
                case 2:
                    prefix = "2";
                    break;
                case 3:
                    prefix = "3";
                    break;
            }
            requests.Add(MakeRequest(Client, prefix, tag));
        }
        return requests;
    }).SelectMany(t => t);
    await Task.WhenAll(tasks);
}

private static async Task MakeRequest(HttpClient client, string prefix, string tag)
{
    // Disposing the response frees the connection even though the body was never read.
    using (var response = await client.GetAsync("https://example.com/" + prefix, HttpCompletionOption.ResponseHeadersRead))
    {
        Console.WriteLine(tag + " and " + prefix);
        Console.WriteLine(prefix + " and " + (int)response.StatusCode);
    }
}
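The original code capped parallelism with MaxDegreeOfParallelism; if you still want a cap once Parallel is gone, one possible approach (not something the code above requires) is to gate the requests with a SemaphoreSlim. This is only a minimal sketch: the StatusChecker class and the ForEachThrottledAsync and GetStatusAsync names are hypothetical helpers made up for illustration, and the right concurrency limit depends on your environment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StatusChecker
{
    // Hypothetical helper: runs `work` for every item, with at most
    // `maxConcurrency` invocations in flight at any moment.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();          // wait for a free slot without blocking a thread
                try { await work(item); }
                finally { gate.Release(); }      // always give the slot back
            }).ToList();                         // ToList starts each task exactly once

            await Task.WhenAll(tasks);
        }
    }

    // Hypothetical helper: reads only the headers and disposes the response,
    // so the connection is not left waiting on an unread body.
    public static async Task<int> GetStatusAsync(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        {
            return (int)response.StatusCode;
        }
    }
}
```

It could then be called along the lines of await StatusChecker.ForEachThrottledAsync(urls, 4, async url => Console.WriteLine(url + " and " + await StatusChecker.GetStatusAsync(Client, url))); where urls is whatever list of addresses you need to check.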
Upvotes: 3
Reputation: 88996
When you set ResponseHeadersRead, you are instructing the HTTP client to read only the HTTP headers of each response, so the TCP/IP connection the request was made on stays in the middle of the response until you read the response body (or dispose the response).
There is also a limit on how many connections HttpClient will open to any particular host. On .NET Framework this defaults to 2. So you open two connections and then try to open a third, which blocks waiting for a connection to become available.
You can simply increase the connection limit for your application.
For example:
ServicePointManager.DefaultConnectionLimit = 10;
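Where that line goes matters: as far as I know, the default is only picked up by connections to hosts that have not been contacted yet, so it should run before the first request. A minimal sketch, assuming a .NET Framework console app; the ClientFactory class and Create method are hypothetical names, and the MaxConnectionsPerServer line assumes a runtime where HttpClientHandler exposes that property (.NET Core, or .NET Framework 4.7.1 and later).

```csharp
using System.Net;
using System.Net.Http;

public static class ClientFactory
{
    // Hypothetical factory: raises the per-host connection limit before
    // any request is sent, then builds the shared HttpClient.
    public static HttpClient Create(int maxConnectionsPerHost)
    {
        // Process-wide default used by .NET Framework for newly contacted hosts.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;

        // Per-handler limit; assumed to be available on the target runtime.
        var handler = new HttpClientHandler
        {
            MaxConnectionsPerServer = maxConnectionsPerHost
        };

        return new HttpClient(handler);
    }
}
```

Raising the limit only hides the problem if responses are never disposed, so it is best combined with disposing each response.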
Upvotes: 2