Reputation: 308
Is there a way to get the progress of the ReadAsStringAsync() method?
I am just getting the HTML content of a website and parsing it.
public static async Task<returnType> GetStartup(string url = "http://")
{
    using (HttpClient client = new HttpClient())
    {
        client.DefaultRequestHeaders.Add("User-Agent",
            "Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko");
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            using (HttpContent content = response.Content)
            {
                string result = await content.ReadAsStringAsync();
            }
        }
    }
}
Upvotes: 4
Views: 2661
Reputation: 155055
Is there a way to get the progress of the ReadAsStringAsync() method? I am just getting the html content of a website and parsing.
Yes and no.
HttpClient does not expose timing and progress information from the underlying network stack, but you can get some information out by using HttpCompletionOption.ResponseHeadersRead, the Content-Length header, and reading the response yourself with your own StreamReader (asynchronously, of course).
Do note that the Content-Length in the response headers refers to the length of the compressed content, not the original content length. This complicates things because most web servers today probably serve HTML (and static content) with gzip compression (as either Content-Encoding or Transfer-Encoding), so the Content-Length header will not tell you the length of the decompressed content. Unfortunately, while HttpClient can do automatic gzip decompression for you, it won't tell you what the decompressed content length is.
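To make the Content-Length caveat concrete, here is a minimal sketch of how one might check whether the header can serve as a progress total. The class and helper names (ContentLengthCheck, CreateClientWithoutDecompression, GetExpectedByteCount) are hypothetical and only for illustration, and the response is built by hand so that no network access is needed.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class ContentLengthCheck
{
    // Hypothetical helper: creates a client with automatic decompression switched off,
    // so Content-Length counts the same bytes that the response stream will deliver.
    public static HttpClient CreateClientWithoutDecompression()
    {
        var handler = new HttpClientHandler { AutomaticDecompression = DecompressionMethods.None };
        return new HttpClient( handler );
    }

    // Hypothetical helper: returns the Content-Length in bytes, or null when the header
    // is missing. `isCompressed` is true when a Content-Encoding (e.g. gzip) is present,
    // meaning the length describes compressed bytes rather than the decoded text.
    public static Int64? GetExpectedByteCount( HttpResponseMessage response, out Boolean isCompressed )
    {
        isCompressed = response.Content.Headers.ContentEncoding.Count > 0;
        return response.Content.Headers.ContentLength;
    }

    public static void Main()
    {
        // A hand-built response standing in for a real gzip-encoded reply.
        using( var response = new HttpResponseMessage( HttpStatusCode.OK ) )
        {
            response.Content = new ByteArrayContent( new Byte[] { 0x1F, 0x8B, 0x08, 0x00 } );
            response.Content.Headers.ContentEncoding.Add( "gzip" );

            Int64? length = GetExpectedByteCount( response, out Boolean isCompressed );
            Console.WriteLine( "Content-Length: {0}, compressed: {1}", length, isCompressed );
        }
    }
}
```

When isCompressed is true, the byte count is still usable as a progress total as long as you count the raw bytes read from the stream rather than the decoded characters.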
But you can still report some kinds of progress back to your method's consumer; see below for an example. You should do this using the idiomatic .NET IProgress<T> interface rather than rolling your own, like so:
using System;
using System.Globalization;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

private static readonly HttpClient _hc = new HttpClient()
{
    DefaultRequestHeaders =
    {
        { "User-Agent", "Mozilla/5.0 (compatible, MSIE 11, Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko" }
    }
    // NOTE: Automatic decompression is not enabled in this HttpClient so that Content-Length can be safely used. But this will drastically slow down content downloads.
};

// Returns the raw response body as a string; parse it into your own type afterwards.
public static async Task<String> GetStartupAsync( IProgress<String> progress, string url = "http://" )
{
    progress.Report( "Now making HTTP request..." );

    // ResponseHeadersRead makes GetAsync return as soon as the headers arrive, before the body is downloaded.
    using( HttpResponseMessage response = await _hc.GetAsync( url, HttpCompletionOption.ResponseHeadersRead ).ConfigureAwait(false) )
    {
        progress.Report( "Received HTTP response. Now reading response content..." );

        String result;
        Int64? responseLength = response.Content.Headers.ContentLength;
        if( responseLength.HasValue )
        {
            using( Stream responseStream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false) )
            using( StreamReader rdr = new StreamReader( responseStream ) )
            {
                // Note that `capacity` is an Int32 count of 16-bit UTF-16 chars, but responseLength is in bytes, though assuming UTF-8 it evens out.
                StringBuilder sb = new StringBuilder( capacity: (Int32)Math.Min( responseLength.Value, Int32.MaxValue ) );
                Char[] charBuffer = new Char[4096];
                while( true )
                {
                    Int32 read = await rdr.ReadAsync( charBuffer, 0, charBuffer.Length ).ConfigureAwait(false);
                    if( read == 0 ) break; // Reached end.

                    sb.Append( charBuffer, 0, read );
                    progress.Report( String.Format( CultureInfo.CurrentCulture, "Read {0:N0} / {1:N0} chars (or bytes).", sb.Length, responseLength.Value ) );
                }
                result = sb.ToString();
            }
        }
        else
        {
            progress.Report( "No Content-Length header in response. Will read response until EOF." );
            result = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        }

        progress.Report( "Finished reading response content." );
        return result;
    }
}
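As a rough illustration of the consumer side of IProgress<T>, the sketch below runs the same read loop against an in-memory stream so that it works without a network connection. ProgressDemo, ListProgress and ReadWithProgressAsync are hypothetical names invented for this example; in a real UI application you would more likely pass a Progress<String>, which delivers each report on the context it was created on.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class ProgressDemo
{
    // Hypothetical IProgress<String> implementation that records reports synchronously.
    public sealed class ListProgress : IProgress<String>
    {
        public List<String> Messages { get; } = new List<String>();
        public void Report( String value ) { Messages.Add( value ); }
    }

    // Hypothetical helper: reads `stream` to the end as UTF-8 and reports a running char count.
    public static async Task<String> ReadWithProgressAsync( Stream stream, Int64? expectedBytes, IProgress<String> progress )
    {
        using( var reader = new StreamReader( stream, Encoding.UTF8 ) )
        {
            var sb = new StringBuilder();
            var buffer = new Char[8]; // Tiny on purpose so the demo reports more than once.
            while( true )
            {
                Int32 read = await reader.ReadAsync( buffer, 0, buffer.Length ).ConfigureAwait(false);
                if( read == 0 ) break; // Reached end of stream.

                sb.Append( buffer, 0, read );
                progress?.Report( expectedBytes.HasValue
                    ? String.Format( "Read {0:N0} / {1:N0}", sb.Length, expectedBytes.Value )
                    : String.Format( "Read {0:N0}", sb.Length ) );
            }
            return sb.ToString();
        }
    }

    public static async Task Main()
    {
        // Stand-in for a response body; a real caller would pass the HTTP response stream.
        Byte[] body = Encoding.UTF8.GetBytes( "<html><body>Hello</body></html>" );
        var progress = new ListProgress();
        using( var stream = new MemoryStream( body ) )
        {
            String html = await ReadWithProgressAsync( stream, body.Length, progress );
            foreach( String message in progress.Messages ) Console.WriteLine( message );
            Console.WriteLine( html );
        }
    }
}
```

Accepting the interface rather than a concrete class keeps the download method independent of how progress is displayed: a console app can log each message, while a UI can bind it to a status label.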
Notes:
- An async method, or any method returning a Task/Task<T>, should be named with an Async suffix, so your method should be named GetStartupAsync, not GetStartup.
- Unless you have IHttpClientFactory available, you should not wrap an HttpClient in a using block because this can cause system resource exhaustion, especially in server applications.
- (Not every HttpClient implementation has this problem, but I won't go into details here.)
- So you do not need to dispose of your HttpClient. This is one of the few exceptions to the rule about always disposing of any IDisposable objects that you create or own.
- Because HttpClient is thread-safe and this is a static method, consider using a cached static instance instead.
- You don't need to wrap HttpResponseMessage.Content in a using block either, as the Content object is owned by the HttpResponseMessage.
Upvotes: 6