Reputation: 36048
I have to parse a large file so instead of doing:
string unparsedFile = myStreamReader.ReadToEnd(); // takes 4 seconds
parse(unparsedFile); // takes another 4 seconds
I want to take advantage of the first 4 seconds and try to do both things at the same time by doing something like:
while (true)
{
    char[] buffer = new char[1024];
    var charsRead = sr.Read(buffer, 0, buffer.Length);
    if (charsRead < 1)
        break;
    if (charsRead != 1024)
    {
        Console.Write("Here"); // debugger stops here several times - why?
    }
    addChunkToQueue(buffer);
}
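(Editor's note: a short read is not an error, and the loop above also enqueues the whole 1024-char array even when fewer chars were read. A minimal sketch of a loop that honors charsRead, where processChunk is a hypothetical stand-in for the question's addChunkToQueue:)

```csharp
using System;
using System.IO;

// Drain any TextReader in chunks, honoring the actual count returned
// by Read: a short read just means less data was available right now.
void ReadInChunks(TextReader reader, Action<char[], int> processChunk)
{
    var buffer = new char[1024];
    int charsRead;
    while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Copy only the valid portion; the tail of the buffer past
        // charsRead still holds stale data from an earlier iteration.
        var chunk = new char[charsRead];
        Array.Copy(buffer, chunk, charsRead);
        processChunk(chunk, charsRead);
    }
}
```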
Here is the image of the debugger (I added an int counter
to show on which iteration we read fewer than 1024 chars):
Note that 643 chars were read, not 1024. On the next iteration I get:
I think I should read 1024 chars every time until the last iteration, where the remaining chars are fewer than 1024.
So my question is: why do I read a "random" number of chars as I iterate through the while loop?
I don't know what kind of stream I am dealing with. I execute a process like:
ProcessStartInfo psi = new ProcessStartInfo("someExe.exe")
{
    RedirectStandardError = true,
    RedirectStandardOutput = true,
    UseShellExecute = false,
    CreateNoWindow = true,
};

// execute command and return output of command
using (var proc = new Process())
{
    proc.StartInfo = psi;
    proc.Start();
    var output = proc.StandardOutput; // <------------- this is where I get the stream
    //if (string.IsNullOrEmpty(output))
    //    output = proc.StandardError.ReadToEnd();
    return output;
}
}
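(Editor's note: two things are worth flagging here. Returning proc.StandardOutput from inside the using block disposes the Process before the caller can read from the stream, and process output naturally arrives in arbitrary-sized chunks. A sketch of one alternative, collecting stdout line by line via OutputDataReceived so parsing can overlap with the process producing output; RunAndCollect is a hypothetical helper:)

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Collect a process's stdout asynchronously, line by line. Parsing
// could happen inside the event handler instead of collecting.
List<string> RunAndCollect(ProcessStartInfo psi)
{
    psi.RedirectStandardOutput = true;
    psi.UseShellExecute = false;

    var lines = new List<string>();
    using (var proc = new Process { StartInfo = psi })
    {
        proc.OutputDataReceived += (sender, e) =>
        {
            if (e.Data != null)       // null signals end of output
                lines.Add(e.Data);    // could parse each line here
        };
        proc.Start();
        proc.BeginOutputReadLine();
        proc.WaitForExit();           // also waits for the async reads
    }
    return lines;
}
```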
Upvotes: 3
Views: 2180
Reputation: 8482
From the docs: http://msdn.microsoft.com/en-us/library/9kstw824
When using the Read method, it is more efficient to use a buffer that is the same size as the internal buffer of the stream, where the internal buffer is set to your desired block size, and to always read less than the block size. If the size of the internal buffer was unspecified when the stream was constructed, its default size is 4 kilobytes (4096 bytes). If you manipulate the position of the underlying stream after reading data into the buffer, the position of the underlying stream might not match the position of the internal buffer. To reset the internal buffer, call the DiscardBufferedData method; however, this method slows performance and should be called only when absolutely necessary.
So for the return value, the docs say:
The number of characters that have been read, or 0 if at the end of the stream and no data was read. The number will be less than or equal to the count parameter, depending on whether the data is available within the stream.
Or, to summarize: your buffer and the underlying buffer are not the same size, so you get a partial fill of your buffer when the underlying one has not been filled up yet.
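(Editor's note: if full chunks are genuinely needed, TextReader.ReadBlock keeps calling Read internally until the requested count is satisfied or the stream ends, so only the final read comes up short. A small sketch with an in-memory reader:)

```csharp
using System;
using System.IO;

// ReadBlock loops internally, so each call fills the buffer completely
// until the stream runs out of data.
var reader = new StringReader(new string('a', 3000));
var buffer = new char[1024];
int n1 = reader.ReadBlock(buffer, 0, buffer.Length); // 1024
int n2 = reader.ReadBlock(buffer, 0, buffer.Length); // 1024
int n3 = reader.ReadBlock(buffer, 0, buffer.Length); // 952 (last chunk)
Console.WriteLine($"{n1} {n2} {n3}");
```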
Upvotes: 2
Reputation: 48230
It depends on the actual stream you are reading. If it is a file stream, I guess it is rather unlikely to get "partial" data. However, if you read from a network stream, you have to expect the data to come in chunks of different lengths.
Upvotes: 3
Reputation: 1500525
For one thing, you're reading characters, not bytes. There's a huge difference.
As for why it doesn't necessarily read everything all at once: maybe there isn't that much data available, and StreamReader has decided to give you what it's got rather than blocking for an indeterminate amount of time to fill your buffer. It's entirely within its rights to do so.
Is this coming from a local file, or over the network? Normally local file operations are much more likely to fill the buffer than network downloads, but either way you simply shouldn't rely on the buffer being filled. If it's a "file" (i.e. read using FileStream) but it happens to be sitting on a network share... well, that's a grey area in my knowledge :) It's a stream - treat it that way.
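(Editor's note: the chars-vs-bytes point can be seen directly: in a multi-byte encoding such as UTF-8, one char may occupy several bytes, so char counts and byte counts diverge. A quick sketch:)

```csharp
using System;
using System.Text;

// A single char can occupy several bytes in the stream's encoding:
// 'h' is 1 byte in UTF-8, 'é' is 2, '€' is 3.
string s = "héllo€";
byte[] utf8 = Encoding.UTF8.GetBytes(s);
Console.WriteLine($"{s.Length} chars, {utf8.Length} bytes");
```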
Upvotes: 4