Reputation: 159
In the following request flow, a customer can request a CSV file download from the server. The blob is so large that the customer waits a long time before the download actually starts (the customer assumes something is wrong and closes the browser). How can I make the download more efficient using streams?
Current sequence is as below:
Request Sequence:
The issue is that the DownloadTo(Stream) method of BlobBaseClient downloads the entire file into memory before I can do anything with it.
How can I download the blob in chunks, process each chunk, and start sending the result to the customer right away?
Part of Download Controller:
    var contentDisposition = new ContentDispositionHeaderValue("attachment")
    {
        FileName = "customer-file.csv",
        CreationDate = DateTimeOffset.UtcNow
    };
    Response.Headers.Add("Content-Disposition", contentDisposition.ToString());
    var result = blobService.DownloadAndProcessContent();
    foreach (var line in result)
    {
        yield return line;
    }
    Response.BodyWriter.FlushAsync();
Part of DownloadAndProcessContent Function:
    var stream = new MemoryStream();
    var blob = container.GetAppendBlobClient(blobName);
    // Buffers the whole blob in memory before any line is processed.
    blob.DownloadTo(stream);
    // Rewind: DownloadTo leaves the position at the end of the stream.
    stream.Position = 0;
    // Processing is done on the blob data.
    var streamReader = new StreamReader(stream);
    while (!streamReader.EndOfStream)
    {
        string currentLine = streamReader.ReadLine();
        // Process the line.
        string processDataLine = ProcessData(currentLine);
        yield return processDataLine;
    }
Upvotes: 1
Views: 3474
Reputation: 591
Did you consider using the built-in OpenRead method, so that you can apply the StreamReader directly to the blob stream without a MemoryStream in the middle? This should let you process the data line by line, as you already do in the loop.
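A minimal sketch of that idea, under some assumptions: the line-processing loop takes any Stream, so in the real service the stream would come from blob.OpenRead() (available on BlobBaseClient in Azure.Storage.Blobs 12.x), while here a MemoryStream stands in so the snippet runs without a storage account. The ProcessData body and the class name are hypothetical placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BlobLineProcessor
{
    // Hypothetical stand-in for the question's ProcessData.
    static string ProcessData(string line) => line.ToUpperInvariant();

    // Reads and processes one line at a time. Only the StreamReader's
    // internal buffer is held in memory, never the whole blob.
    public static IEnumerable<string> ProcessLines(Stream source)
    {
        using var reader = new StreamReader(source);
        while (reader.ReadLine() is string line)
        {
            yield return ProcessData(line);
        }
    }

    public static void Main()
    {
        // Assumption: with the Azure SDK the stream would instead be
        //     using Stream source = blob.OpenRead();
        // A MemoryStream is used here only to keep the sketch runnable.
        using Stream source = new MemoryStream(Encoding.UTF8.GetBytes("a,b\nc,d\n"));
        foreach (string line in ProcessLines(source))
        {
            Console.WriteLine(line); // prints "A,B" then "C,D"
        }
    }
}
```

Because ProcessLines is a lazy iterator, the first processed line is available to the caller as soon as the first chunk of the blob has been read.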
Also note that it's recommended to use async/await all the way through. Once the controller code is async, it no longer blocks threads on I/O, so the .NET thread pool does not become a bottleneck when your API handles concurrent requests, and the API scales much better.
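As a rough illustration of the async approach, the same loop can be written as an async iterator. This is a sketch under assumptions: in the real service the stream would come from await blob.OpenReadAsync() (BlobBaseClient, Azure.Storage.Blobs 12.x), a MemoryStream stands in for it here, and ProcessData and the class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncBlobLineProcessor
{
    // Hypothetical stand-in for the question's ProcessData.
    static string ProcessData(string line) => line.ToUpperInvariant();

    // Async iterator: each read awaits I/O instead of blocking a thread.
    public static async IAsyncEnumerable<string> ProcessLinesAsync(
        Stream source,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        using var reader = new StreamReader(source);
        while (await reader.ReadLineAsync() is string line)
        {
            // Stop early if the caller cancelled (for example, the client disconnected).
            cancellationToken.ThrowIfCancellationRequested();
            yield return ProcessData(line);
        }
    }

    public static async Task Main()
    {
        // Assumption: with the Azure SDK the stream would instead be
        //     await using Stream source = await blob.OpenReadAsync();
        await using Stream source = new MemoryStream(Encoding.UTF8.GetBytes("a,b\nc,d\n"));
        await foreach (string line in ProcessLinesAsync(source))
        {
            Console.WriteLine(line); // prints "A,B" then "C,D"
        }
    }
}
```

The consumer pulls lines with await foreach, so no thread is held while waiting for the next chunk of the blob to arrive.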
This answer doesn't address how to stream the HTTP response itself; that is a separate concern from streaming the downloaded blob.
Upvotes: 1