Reputation: 1246
I'm having an interesting problem with reading a large file (~400 MB) that's on a network drive. Originally, I fed the full network address into a FileInfo and used the CopyTo function to transfer it to a local temp drive and then read it. This seems to work okay; it's not slow, but it's not fast either, just meh. The CopyTo function would consistently get the network utilization of the computer running the program up above 50%, which is pretty good.
In order to speed up the process I tried to read the network file directly into a MemoryStream to cut out the middle man, so to speak. When I tried this (using the asynchronous copy pattern described here), it was hilariously slow. My network utilization never even topped 2%; it's almost like something is throttling me. FYI, I watched my network utilization when copying the same file directly via Windows Explorer and it hit 80-90%... not sure what's happening here. Below is the asynchronous copy code I used:
string line;
List<string> results = new List<string>();
Parser parser = new Parser(QuerySettings.SelectedFilters, QuerySettings.SearchTerms,
    QuerySettings.ExcludedTerms, QuerySettings.HighlightedTerms);
byte[] ActiveBuffer = new byte[60 * 1024];
byte[] BackBuffer = new byte[60 * 1024];
byte[] WriteBuffer = new byte[60 * 1024];
MemoryStream memStream = new MemoryStream();
FileStream fileStream = new FileStream(fullPath, FileMode.Open, FileSystemRights.Read,
    FileShare.None, 60 * 1024, FileOptions.SequentialScan);
int Readed = 0;
IAsyncResult ReadResult;
IAsyncResult WriteResult;

ReadResult = fileStream.BeginRead(ActiveBuffer, 0, ActiveBuffer.Length, null, null);
do
{
    Readed = fileStream.EndRead(ReadResult);

    WriteResult = memStream.BeginWrite(ActiveBuffer, 0, Readed, null, null);
    WriteBuffer = ActiveBuffer;

    if (Readed > 0)
    {
        ReadResult = fileStream.BeginRead(BackBuffer, 0, BackBuffer.Length, null, null);
        BackBuffer = Interlocked.Exchange(ref ActiveBuffer, BackBuffer);
    }

    memStream.EndWrite(WriteResult);
}
while (Readed > 0);

memStream.Seek(0, SeekOrigin.Begin);
StreamReader streamReader = new StreamReader(memStream);
while ((line = streamReader.ReadLine()) != null)
{
    if (parser.ParseResults(line))
        results.Add(line);
}

fileStream.Close();
memStream.Close();
return results;
UPDATE: As per the comments, I just tried the following. It only had my network utilization at about 10-15%... why so low?
MemoryStream memStream = new MemoryStream();
FileStream fileStream = File.OpenRead(fullPath);
fileStream.CopyTo(memStream);
memStream.Seek(0, SeekOrigin.Begin);

StreamReader streamReader = new StreamReader(memStream);
Parser parser = new Parser(QuerySettings.SelectedFilters, QuerySettings.SearchTerms,
    QuerySettings.ExcludedTerms, QuerySettings.HighlightedTerms);
while ((line = streamReader.ReadLine()) != null)
{
    if (parser.ParseResults(line))
        results.Add(line);
}
Upvotes: 3
Views: 5856
Reputation: 2942
I'm late to the party, but having had the same problem of low network utilization recently, after trying a lot of different implementations I found at last that a StreamReader with a large buffer (1 MB in my case) increased the network utilization to 99%. None of the other options made a significant difference.
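A minimal sketch of that approach, assuming a line-oriented text file; the 1 MB buffer size matches what worked for me, and the method name is just illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class BufferedLineReader
{
    // Read all lines through a StreamReader with a large internal buffer.
    // The default StreamReader buffer is small, which can translate into
    // many small reads against a network share.
    public static List<string> ReadAllLinesBuffered(string fullPath)
    {
        const int bufferSize = 1024 * 1024; // 1 MB
        var results = new List<string>();
        using (var reader = new StreamReader(fullPath, Encoding.UTF8,
            detectEncodingFromByteOrderMarks: true, bufferSize: bufferSize))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                results.Add(line);
        }
        return results;
    }
}
```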
Upvotes: 5
Reputation: 776
Using Reflector, I see that your call to:
FileStream fileStream = File.OpenRead(fullPath);
ends up using a buffer of size 4096 bytes (0x1000):
public FileStream(string path, FileMode mode, FileAccess access, FileShare share) : this(path, mode, access, share, 0x1000, FileOptions.None, Path.GetFileName(path), false)
{
}
You could try calling one of the FileStream constructors explicitly and specify a much larger buffer size along with FileOptions.SequentialScan.
Not sure this will help, but it's easy to try.
Upvotes: 1
Reputation: 1694
There is no point copying the whole file over and then parsing it. Simply open the file from the network drive and let the .NET Framework do its best to deliver the data for you. You might be more clever than the MS developers and manage to write a faster copy method, but it's a real challenge.
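Something along these lines, in other words; the `keepLine` predicate stands in for the question's `parser.ParseResults` call, since the Parser type isn't shown here:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class DirectParse
{
    // Stream the file straight from the network path, filtering as we go.
    // No MemoryStream copy: each line is read, tested, and either kept
    // or discarded immediately.
    public static List<string> FilterLines(string fullPath, Func<string, bool> keepLine)
    {
        var results = new List<string>();
        using (var reader = new StreamReader(fullPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (keepLine(line))
                    results.Add(line);
            }
        }
        return results;
    }
}
```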
Upvotes: 1