spender

Reputation: 120480

Detect real start of HttpWebRequest

I'm writing a web crawler using the TAP (Task-based Asynchronous Pattern) API of HttpWebRequest.

I want to download some stuff from http://somedomain.tld but I might end up sending quite a number of requests. I don't know if somedomain.tld will respond in a timely fashion and I'd like to give it no more than 10s per request to complete sending back a response. I also want to take advantage of the connection limits enforced by the ServicePoint for that domain.

So I need to be able to time out on a request. Normally, I'd get a cancellation token from a CancellationTokenSource:

var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10d));

and supply that to my async operations, and perhaps also register a cancellation callback that calls myWebRequest.Abort(), so I end up with a (simplified) method that looks something like this:

public async Task<byte[]> GetResponseData(Uri uri, CancellationToken ct)
{
    var wr = (HttpWebRequest)WebRequest.Create(uri);
    // Abort the request if the token fires; dispose the registration once we're done.
    using (ct.Register(wr.Abort))
    using (var response = await wr.GetResponseAsync())
    using (var ms = new MemoryStream())
    using (var responseStream = response.GetResponseStream())
    {
        await responseStream.CopyToAsync(ms, 4096, ct);
        return ms.ToArray();
    }
}

So far, so good.

Let me constrain things a little:

var uri = new Uri("http://somedomain.tld");
var sp = ServicePointManager.FindServicePoint(uri);
sp.ConnectionLimit = 1;

Now, the ServicePoint instance associated with somedomain.tld will only allow a single request at a time.

Now I fire off two requests simultaneously, safe in the knowledge that the ServicePoint will insulate the target domain from my abuses:

var dataTasks = Enumerable.Range(0,2).Select(async _=>{
    using(var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10d)))
    {
        return await GetResponseData(uri,cts.Token);
    }
});

var datas = await Task.WhenAll(dataTasks);

Now, let's assume that the first request takes over 10s to complete. Because I have constrained the ServicePoint to a single request at a time, the second request sits in the queue while its 10s timer is already running, so by the time the ServicePoint gets round to sending it, it has already been cancelled and aborted.
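To make the problem concrete without hitting a real server, here is a rough simulation. It swaps the ServicePoint for a SemaphoreSlim, which is only my stand-in and not a claim about how ServicePoint queues requests internally. The names QueueTimeoutDemo, Gate and FakeRequest, and the scaled-down timings (400ms timeout, 300ms per request), are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class QueueTimeoutDemo
{
    // Hypothetical stand-in for a ServicePoint with ConnectionLimit = 1.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);

    // Simulates a request that queues behind the gate, then "runs" for workTime.
    public static async Task<string> FakeRequest(TimeSpan workTime, CancellationToken ct)
    {
        try
        {
            // Time spent waiting here already counts against the caller's timeout.
            await Gate.WaitAsync(ct);
        }
        catch (OperationCanceledException)
        {
            return "cancelled while queued";
        }

        try
        {
            await Task.Delay(workTime, ct);
            return "completed";
        }
        catch (OperationCanceledException)
        {
            return "cancelled while running";
        }
        finally
        {
            Gate.Release();
        }
    }

    public static async Task<string[]> Run()
    {
        // Same shape as the two-request snippet: each request creates its timeout up front.
        var tasks = Enumerable.Range(0, 2).Select(async _ =>
        {
            using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(400)))
            {
                return await FakeRequest(TimeSpan.FromMilliseconds(300), cts.Token);
            }
        }).ToArray();

        return await Task.WhenAll(tasks);
    }
}
```

Each simulated request fits comfortably inside its own timeout, yet the second one should still be cancelled, because most of its budget is spent waiting for the first to release the gate.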

So how do I know when the request is actually being sent? How do I set a timeout that is "aware" of the actions of the ServicePoint with regard to a specific request?
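For reference, the only workaround I can think of is to do the throttling myself and start the timer only once a slot is free. The sketch below assumes a hypothetical helper, GatedTimeout.RunAsync, built on a SemaphoreSlim. It sidesteps the ServicePoint rather than being aware of it, which is exactly what I'd like to avoid.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class GatedTimeout
{
    // Hypothetical helper: waits for a slot first, and only then starts the timeout clock.
    public static async Task<T> RunAsync<T>(
        SemaphoreSlim gate,
        TimeSpan timeout,
        Func<CancellationToken, Task<T>> operation)
    {
        // Queueing time is not counted: no timeout exists yet.
        await gate.WaitAsync();
        try
        {
            // The timeout starts only now that this call holds the slot.
            using (var cts = new CancellationTokenSource(timeout))
            {
                return await operation(cts.Token);
            }
        }
        finally
        {
            gate.Release();
        }
    }
}
```

With a gate created as new SemaphoreSlim(1, 1), I'd call it as GatedTimeout.RunAsync(gate, TimeSpan.FromSeconds(10), ct => GetResponseData(uri, ct)). The obvious downside is that I'm duplicating the ServicePoint's connection limit by hand and have to keep the two in sync.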

Upvotes: 1

Views: 67
