C# efficient reading of stream content with a limit on amount read

Question

I have a case where a web API call returns a very large string response. I make the call as follows:

var multipartContent = new MultipartFormDataContent();
multipartContent.Add(new ByteArrayContent(blobStream.CopyToBytes()), 
                         "upload", Path.GetFileName(fileName));

var response = await _httpClient.PostAsync("api/v1/textResponse", multipartContent);
int responeLength = response.Content.Headers.ContentLength.HasValue ? 
                    (int)response.Content.Headers.ContentLength.Value : -1;

response.EnsureSuccessStatusCode();

I only need to process the first 1 MB of data from the response: if the response is smaller than 1 MB I will read all of it, but if it is larger I will hard stop my read at 1 MB.

I am looking for the most efficient way to do this read. I've tried this code:

// section above...

response.EnsureSuccessStatusCode();

string contentText = null;

if (responeLength < maxAllowedLimit) // 1Mb
{
     // less than the limit - read everything as a string.
     contentText = await response.Content.ReadAsStringAsync();
} 
else {
     var contentStream = await response.Content.ReadAsStreamAsync();
     using (var stream = new MemoryStream())
     {
         byte[] buffer = new byte[5120]; // read in chunks of 5KB
         int bytesRead;
         while((bytesRead = contentStream.Read(buffer, 0, buffer.Length)) > 0)
         {
             stream.Write(buffer, 0, bytesRead);
         }
         contentText = stream.ConvertToString();
     }
}

Is this the most efficient way, and how can I limit the amount read in the else branch? I've tried this code and it always returns an empty string. There are also these methods:

ReadAsStringAsync()
ReadAsByteArrayAsync()
ReadAsStreamAsync()
LoadIntoBufferAsync(int size)

Are any of these methods more efficient?

Thanks in advance for any pointers!

canton7 · Accepted Answer

I suspect the most efficient (but still correct) way of doing this is probably something like this. This is made more complex by the fact that you have a limit on the number of bytes that are read, not the number of characters, and so we can't use a StreamReader. Note that we have to be careful not to stop reading in the middle of a codepoint - there are many cases where a single character is represented using multiple bytes, and stopping midway through would be an error.

const int bufferSize = 1024;
var bytes = new byte[bufferSize];
var chars = new char[Encoding.UTF8.GetMaxCharCount(bufferSize)];
var decoder = Encoding.UTF8.GetDecoder();
// We don't know how long the result will be in chars, but one byte per char is a
// reasonable first approximation. This will expand as necessary.
var result = new StringBuilder(maxAllowedLimit);
int totalReadBytes = 0;
using (var stream = await response.Content.ReadAsStreamAsync())
{
    while (totalReadBytes < maxAllowedLimit)
    {
        int readBytes = await stream.ReadAsync(
            bytes,
            0,
            Math.Min(maxAllowedLimit - totalReadBytes, bytes.Length));

        // We reached the end of the stream
        if (readBytes == 0)
            break;

        totalReadBytes += readBytes;

        int readChars = decoder.GetChars(bytes, 0, readBytes, chars, 0);
        result.Append(chars, 0, readChars);
    }
}
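
As a small illustration of why the code above keeps a single Decoder alive across reads, here is a sketch (the class and method names are made up for this example) that decodes a byte array in chunks small enough to split a multi-byte character. The decoder holds on to the incomplete trailing bytes and completes the character on the next call, which is what you would lose by calling Encoding.UTF8.GetString on each chunk separately.

```csharp
using System;
using System.Text;

public static class ChunkedDecoding
{
    // Hypothetical helper: decodes UTF-8 bytes chunk by chunk,
    // mirroring what the read loop above does with each network read.
    public static string DecodeInChunks(byte[] data, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var decoder = Encoding.UTF8.GetDecoder();
        var chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];
        var result = new StringBuilder();

        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);

            // This overload does not flush, so a partial multi-byte
            // sequence at the end of the chunk is kept for the next call.
            int charCount = decoder.GetChars(data, offset, count, chars, 0);
            result.Append(chars, 0, charCount);
        }

        return result.ToString();
    }
}
```

For instance, ChunkedDecoding.DecodeInChunks(Encoding.UTF8.GetBytes("héllo"), 2) gives back "héllo" even though the two bytes of 'é' land in different chunks. If the byte limit is hit partway through a character, those leftover bytes simply stay inside the decoder and never reach the result.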

Note that you'll probably want to use HttpCompletionOption.ResponseHeadersRead, otherwise HttpClient will go and download the whole body anyway.
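
A sketch of how that might look for the POST in the question. PostAsync has no overload that accepts an HttpCompletionOption, so the request is built by hand and sent with SendAsync. The helper name PostHeadersFirstAsync is invented for this example.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class ResponseHeadersFirst
{
    // Hypothetical helper: sends a POST and returns as soon as the response
    // headers have arrived, leaving the body unread on the response stream.
    public static async Task<HttpResponseMessage> PostHeadersFirstAsync(
        HttpClient client, string requestUri, HttpContent content)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, requestUri))
        {
            request.Content = content;

            // The default, ResponseContentRead, would buffer the entire body
            // before the returned task completes.
            return await client.SendAsync(
                request, HttpCompletionOption.ResponseHeadersRead);
        }
    }
}
```

It would be called in place of PostAsync, for example: var response = await ResponseHeadersFirst.PostHeadersFirstAsync(_httpClient, "api/v1/textResponse", multipartContent); The response should be disposed once you have read what you need, so that the underlying connection is released.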

If you're happy limiting by the number of characters, then life is easier:

string result;
using (var reader = new StreamReader(await response.Content.ReadAsStreamAsync()))
{
    char[] chars = new char[maxAllowedLimit];
    // ReadBlockAsync keeps reading until the buffer is full or the stream ends
    int read = await reader.ReadBlockAsync(chars, 0, chars.Length);
    result = new string(chars, 0, read);
}
