Reputation: 53
I am currently working on a program that writes a lot of data (~100 MB) to .txt files. I googled a bit on how to optimize the different IO streams, but didn't really understand the following:
Here is the current state of my output:
for (int Row = 0; Row < RowCount; Row++)
{
    // Opens data1.txt in append mode, writes one line, then closes it again.
    using (System.IO.StreamWriter file =
        new System.IO.StreamWriter(@"C:\Calculations\data1.txt", true))
    {
        file.WriteLine(Time + "\t" + Row + "\t" + Value1[Row]);
    }
    // Same for data2.txt.
    using (System.IO.StreamWriter file =
        new System.IO.StreamWriter(@"C:\Calculations\data2.txt", true))
    {
        file.WriteLine(Time + "\t" + Row + "\t" + Value2[Row]);
    }
}
What I think this does:
If I'm correct with that, that would be a LOT of opening, closing and flushing for 100 MB of data. I'd love to just open multiple streams at the start, write all the data while it's being calculated, and then flush and close the streams.
Upvotes: 2
Views: 626
Reputation: 53
After some more research I found a solution for my first question. You can open multiple IO-Streams like this:
StreamWriter file1 = File.CreateText(@"C:\Outputtests\OutputTest1.txt");
StreamWriter file2 = File.CreateText(@"C:\Outputtests\OutputTest2.txt");
Like this you need to flush and close the streams yourself.
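As a hedged alternative, here is a minimal sketch that opens both writers once in `using` blocks, so they are flushed and closed automatically even if the calculation throws. The names `TwoFileOutput` and `WriteAll` and the parameters are hypothetical, made up for illustration; it also assumes both value arrays have the same length.

```csharp
using System.IO;

public static class TwoFileOutput
{
    // Hypothetical helper: writes one tab-separated line per row to each file.
    // Both files are opened once, before the loop, and closed once, after it.
    // Assumes value1 and value2 have the same length.
    public static void WriteAll(string path1, string path2, double time,
                                double[] value1, double[] value2)
    {
        // append: false overwrites existing files; pass true to append instead.
        using (var file1 = new StreamWriter(path1, false))
        using (var file2 = new StreamWriter(path2, false))
        {
            for (int row = 0; row < value1.Length; row++)
            {
                file1.WriteLine(time + "\t" + row + "\t" + value1[row]);
                file2.WriteLine(time + "\t" + row + "\t" + value2[row]);
            }
        } // Dispose flushes and closes both writers here, even on an exception.
    }
}
```

Called as `TwoFileOutput.WriteAll(path1, path2, Time, Value1, Value2)`, this replaces the per-row open/close with a single open/close per file.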
I compared the running times; on my HDD it saves a factor of several thousand (around 1 minute versus around 10 ms):
using System;
using System.Diagnostics;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw1 = new Stopwatch();
        Stopwatch sw2 = new Stopwatch();
        Stopwatch sw3 = new Stopwatch();

        // First version: opens and closes the file each time -> takes around 1 minute total
        sw1.Start();
        for (int i = 0; i < 10000; i++)
        {
            using (System.IO.StreamWriter file1 =
                new System.IO.StreamWriter(@"C:\Outputtests\OutputTest1.txt", true))
            {
                file1.WriteLine(i);
            }
        }
        sw1.Stop();

        // Second version: flushes each time -> takes around 50ms
        sw2.Start();
        StreamWriter file2 = File.CreateText(@"C:\Outputtests\OutputTest2.txt");
        for (int i = 0; i < 10000; i++)
        {
            file2.WriteLine(i);
            file2.Flush();
        }
        file2.Close();
        sw2.Stop();

        // Third version: flushes at the end -> takes around 10ms
        sw3.Start();
        StreamWriter file3 = File.CreateText(@"C:\Outputtests\OutputTest3.txt");
        for (int i = 0; i < 10000; i++)
        {
            file3.WriteLine(i);
        }
        file3.Flush();
        file3.Close();
        sw3.Stop();

        Console.WriteLine("Output 1:\t" + sw1.ElapsedMilliseconds.ToString() + Environment.NewLine
            + "Output 2:\t" + sw2.ElapsedMilliseconds.ToString() + Environment.NewLine
            + "Output 3:\t" + sw3.ElapsedMilliseconds.ToString());
        Console.ReadKey();
    }
}
I'm not sure, though, how much data you can leave in the stream's buffer before you have to flush it.
Upvotes: 1
Reputation: 81593
There is a lot in your question.
Some random points, in no particular order:
Furthermore:
Since buffering is taken care of by the stream, you don't need to worry about it; it will flush when it needs to (according to the default buffer size or the one you set). You don't need to close the file for it to flush while you are writing, though whatever is still buffered at the end is only written out when you flush, close or dispose the writer. Once again, let the stream take care of it, or if you like you can flush it yourself using 'Flush'.
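To illustrate "how you set the buffer", here is a minimal sketch that passes an explicit buffer size to the `StreamWriter` constructor. The 64 KB default and the names `BufferedOutput` and `WriteNumbers` are assumptions for illustration, not a tuned recommendation.

```csharp
using System.IO;
using System.Text;

public static class BufferedOutput
{
    // Hypothetical helper: writes the integers 0..count-1, one per line.
    public static void WriteNumbers(string path, int count, int bufferSize = 64 * 1024)
    {
        // StreamWriter(path, append, encoding, bufferSize): the writer collects
        // output in its internal buffer and passes it on to the underlying
        // file stream whenever that buffer fills up.
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(false), bufferSize))
        {
            for (int i = 0; i < count; i++)
            {
                writer.WriteLine(i); // no manual Flush needed inside the loop
            }
        } // Dispose writes out whatever is still buffered and closes the file.
    }
}
```

Whether a larger buffer helps at all depends on the drive and the OS cache, so it is worth measuring before settling on a value.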
100 MB is not a lot for a modern SSD; it can be done in milliseconds. And once again, there is no need to open and close the file if you don't need to. That said, there is overhead in opening and closing a file, but it's minimal, so it's sometimes beneficial/easy to wrap the access in a using statement and read and write on a piecemeal basis.
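One way to read "piecemeal" is to collect lines and append them in batches, so the file is opened and closed once per batch rather than once per line. This is only a sketch under that assumption; `BatchedAppend`, `AppendInBatches` and the batch size of 1000 are hypothetical names and values.

```csharp
using System.Collections.Generic;
using System.IO;

public static class BatchedAppend
{
    // Hypothetical helper: appends lines in batches, opening and closing the
    // file once per batch instead of once per line.
    public static void AppendInBatches(string path, IEnumerable<string> lines, int batchSize = 1000)
    {
        var batch = new List<string>(batchSize);
        foreach (string line in lines)
        {
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                File.AppendAllLines(path, batch); // open, append, close
                batch.Clear();
            }
        }
        if (batch.Count > 0)
        {
            File.AppendAllLines(path, batch); // remaining partial batch
        }
    }
}
```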
And lastly, yes, you can open multiple streams, but they are not thread-safe. Since each one has only one internal buffer, you will have to either use a lock or open and close the file to ensure its integrity. That's not to say multiple threads can't write to a file, just that it's not as trivial as it ought to be.
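A minimal sketch of the lock approach, assuming several threads share one open writer. `SharedWriterDemo` and `WriteInParallel` are hypothetical names, and the lines end up in whatever order the threads happen to run.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class SharedWriterDemo
{
    // Hypothetical helper: several threads append lines to one open writer.
    public static void WriteInParallel(string path, int lineCount)
    {
        object gate = new object(); // guards every access to the writer

        using (var writer = new StreamWriter(path, false))
        {
            Parallel.For(0, lineCount, i =>
            {
                // Only one thread at a time may touch the writer's buffer.
                lock (gate)
                {
                    writer.WriteLine(i);
                }
            });
        } // Parallel.For has returned, so all writes are done before Dispose.
    }
}
```

`TextWriter.Synchronized(writer)` is a built-in alternative that returns a thread-safe wrapper around an existing writer.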
If you want to keep the files open to read and write synchronously, just don't forget to close/dispose them when you have finished.
Good luck
Upvotes: 3