Reputation: 6085
My program uses HttpClient to send a GET request to a Web API, which returns a file.
I currently use this (simplified) code to store the file to disc:
public async Task<bool> DownloadFile()
{
    var client = new HttpClient();
    var uri = new Uri("http://somedomain.com/path");
    var response = await client.GetAsync(uri);
    if (response.IsSuccessStatusCode)
    {
        var fileName = response.Content.Headers.ContentDisposition.FileName;
        using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            await response.Content.CopyToAsync(fs);
            return true;
        }
    }
    return false;
}
Now, when this code runs, the process loads the entire file into memory. I would have expected the content to be streamed from HttpResponseMessage.Content to the FileStream, so that only a small portion of it is held in memory at any time.
We are planning to use this on large files (> 1GB), so is there a way to achieve that without holding the whole file in memory?
Ideally without manually looping, reading a portion into a byte[] and writing that portion to the file stream until all of the content is written.
Upvotes: 22
Views: 41496
Reputation: 884
Another simple and quick way to do it is:
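public async Task<bool> DownloadFile(string url)
{
    using (var client = new HttpClient())
    using (var ms = new MemoryStream())
    {
        // Note: this copies the whole response body into memory.
        using (var stream = await client.GetStreamAsync(url))
        {
            await stream.CopyToAsync(ms);
        }
        ... // use ms in what you want
        return true;
    }
}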
Now you have the downloaded file as a stream in ms.
Upvotes: -3
Reputation: 446
Instead of GetAsync(Uri), use the GetAsync(Uri, HttpCompletionOption) overload with the HttpCompletionOption.ResponseHeadersRead value.
The same applies to SendAsync and other methods of HttpClient.
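A minimal sketch of how the question's method might look with this overload. The class name, the shared HttpClient field, the targetDirectory parameter, the PickFileName helper and the "noname.dat" fallback are assumptions made for illustration, not part of the original code:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingDownload
{
    // Assumption: one shared HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: strips surrounding quotes and any directory part
    // from the name sent by the server, with a fallback if nothing is left.
    public static string PickFileName(string rawName)
    {
        var name = Path.GetFileName((rawName ?? string.Empty).Trim('"'));
        return string.IsNullOrEmpty(name) ? "noname.dat" : name;
    }

    public static async Task<bool> DownloadFileAsync(Uri uri, string targetDirectory)
    {
        // ResponseHeadersRead: the task completes once the headers have arrived;
        // the body is not buffered and is read from the network on demand.
        using (var response = await Client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead))
        {
            if (!response.IsSuccessStatusCode)
            {
                return false;
            }

            var fileName = PickFileName(response.Content.Headers.ContentDisposition?.FileName);
            var path = Path.Combine(targetDirectory, fileName);

            using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                // Copies the body to disk in chunks as it is received.
                await response.Content.CopyToAsync(fs);
            }
            return true;
        }
    }
}
```

Disposing the response matters more with this option, because the connection stays in use until the body has been read or the response is disposed.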
Sources:
"The returned Task object will complete based on the completionOption parameter after the part or all of the response (including content) is read."
The .NET Core implementation of GetStreamAsync, which uses HttpCompletionOption.ResponseHeadersRead:
https://github.com/dotnet/corefx/blob/release/1.1.0/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L163-L168
ResponseHeadersRead is what does the trick.
Upvotes: 20
Reputation: 23731
It looks like this is by design. If you check the documentation for HttpClient.GetAsync(), you'll see it says:
The returned task object will complete after the whole response (including content) is read
You can instead use HttpClient.GetStreamAsync(), which specifically states:
This method does not buffer the stream.
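For the case where the target file name is already known, a rough sketch of the GetStreamAsync route could look like this. The class name, the SaveAsync helper and the targetPath parameter are hypothetical and only serve the illustration:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class UnbufferedDownload
{
    // Assumption: one shared HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: copies any source stream to a file in chunks.
    public static async Task SaveAsync(Stream source, string targetPath)
    {
        using (var fs = new FileStream(targetPath, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            await source.CopyToAsync(fs);
        }
    }

    // targetPath must be supplied by the caller, because this route
    // exposes only the body stream, not the response object.
    public static async Task DownloadToAsync(Uri uri, string targetPath)
    {
        using (var source = await Client.GetStreamAsync(uri))
        {
            await SaveAsync(source, targetPath);
        }
    }
}
```

Stream.CopyToAsync moves the data through a fixed-size internal buffer, so no manual read/write loop is needed.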
However, as far as I can see, you then don't get access to the headers in the response. Since that's presumably a requirement (you're getting the file name from the headers), you may want to use HttpWebRequest instead, which allows you to get the response details (headers etc.) without reading the whole response into memory. Something like:
public async Task<bool> DownloadFile()
{
    var uri = new Uri("http://somedomain.com/path");
    var request = WebRequest.CreateHttp(uri);
    // Dispose the response so the underlying connection is released.
    using (var response = await request.GetResponseAsync())
    {
        ContentDispositionHeaderValue contentDisposition;
        var fileName = ContentDispositionHeaderValue.TryParse(response.Headers["Content-Disposition"], out contentDisposition)
            ? contentDisposition.FileName
            : "noname.dat";
        using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            // Streams the body to disk without loading it all into memory.
            await response.GetResponseStream().CopyToAsync(fs);
        }
    }
    return true;
}
Note that if the request returns an unsuccessful response code, an exception will be thrown, so you may wish to wrap this in a try..catch and return false in that case, as in your original example.
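A sketch of that try..catch variant, assuming WebException is the exception you want to translate into false. The class name, the uri and targetDirectory parameters and the file name clean-up are illustrative additions. On newer .NET versions WebRequest is marked obsolete, so treat this as a sketch for frameworks where it is still the intended API:

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class WebRequestDownload
{
    public static async Task<bool> DownloadFileAsync(Uri uri, string targetDirectory)
    {
        try
        {
            var request = WebRequest.CreateHttp(uri);
            using (var response = await request.GetResponseAsync())
            {
                ContentDispositionHeaderValue contentDisposition;
                var hasName = ContentDispositionHeaderValue.TryParse(
                    response.Headers["Content-Disposition"], out contentDisposition)
                    && !string.IsNullOrEmpty(contentDisposition.FileName);

                // Assumption: strip quotes and directory parts from the server-supplied name.
                var fileName = hasName
                    ? Path.GetFileName(contentDisposition.FileName.Trim('"'))
                    : "noname.dat";

                var path = Path.Combine(targetDirectory, fileName);
                using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
                {
                    await response.GetResponseStream().CopyToAsync(fs);
                }
                return true;
            }
        }
        catch (WebException)
        {
            // Thrown for unsuccessful status codes and for connection failures.
            return false;
        }
    }
}
```

File system errors (for example a missing target directory) are deliberately not caught here, so they still surface to the caller.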
Upvotes: 21