Reputation:
I've got a little program that reads and writes files on disk. Broken down to its simplest level, it reads bytes from one file stream and writes them to another. It performs its duties fine, but it isn't the fastest thing.
I've seen other applications that can tear through a gigabyte or more of reads/writes at amazing speeds. Obviously they're operating closer to the metal than a little .NET app.
What are the most efficient .NET APIs for streaming to/from the disk? What win32 APIs are available (and worth p/invoking for) for speedy disk access?
Upvotes: 6
Views: 8337
Reputation: 9575
.NET file support is fast enough (comparable to the native Win32 functions). There are several options that can help you improve your performance:
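For example, a larger stream buffer and a sequential-access hint often help (a rough sketch; the 1 MB buffer, the FileOptions.SequentialScan flag, and the file names are illustrative choices, not prescriptions):

```csharp
using System.IO;

// Sketch: copy via FileStream with a larger buffer and a sequential-access hint.
// The 1 MB buffer size and file names are placeholders to tune for your workload.
const int BufferSize = 1 << 20;

using var source = new FileStream("input.dat", FileMode.Open, FileAccess.Read,
                                  FileShare.Read, BufferSize, FileOptions.SequentialScan);
using var destination = new FileStream("output.dat", FileMode.Create, FileAccess.Write,
                                       FileShare.None, BufferSize, FileOptions.None);

source.CopyTo(destination, BufferSize);
```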
Upvotes: 7
Reputation:
BinaryReader and BinaryWriter with a suitable buffer size are pretty fast. If you are reading into structures, the unsafe approach described in this article will get you reading fast, and writing is similar. I also agree with the suggestion to double-check that I/O is really the bottleneck. I first came across that article due to such a mistake.
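For instance, something like this (the 64 KB buffer size and file names are placeholders; the right size depends on your data):

```csharp
using System.IO;

// Sketch: BinaryReader/BinaryWriter over buffered FileStreams, copying in chunks.
// The 64 KB buffer is an illustrative choice.
const int BufferSize = 64 * 1024;

using var reader = new BinaryReader(
    new FileStream("input.dat", FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize));
using var writer = new BinaryWriter(
    new FileStream("output.dat", FileMode.Create, FileAccess.Write, FileShare.None, BufferSize));

byte[] chunk;
while ((chunk = reader.ReadBytes(BufferSize)).Length > 0)
{
    writer.Write(chunk);
}
```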
Upvotes: 0
Reputation:
Have you profiled your application to determine whether disk I/O is really the bottleneck?
What type of hardware are you running this on? What is the hardware configuration?
In .NET you may try the System.IO.File class and the rest of the System.IO namespace.
For Win32 functions you may try the CreateFile, WriteFile, ReadFile series.
An example:
http://msdn.microsoft.com/en-us/library/bb540534(VS.85).aspx
This is definitely not cut and dried. It's all about testing and measuring.
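If you want to try the Win32 route from managed code, here is a rough sketch of P/Invoking CreateFile and handing the handle to a FileStream (the constants are the usual Win32 values; treat the whole thing as a starting point for measurement, not a tuned implementation):

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

static class NativeFile
{
    // Standard Win32 constants used by CreateFile.
    const uint GENERIC_READ              = 0x80000000;
    const uint FILE_SHARE_READ           = 0x00000001;
    const uint OPEN_EXISTING             = 3;
    const uint FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000;

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern SafeFileHandle CreateFile(
        string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    // Opens the file with a sequential-scan hint, then wraps the raw handle in a
    // FileStream so the rest of the code can keep using the managed stream APIs.
    public static FileStream OpenSequentialRead(string path)
    {
        SafeFileHandle handle = CreateFile(path, GENERIC_READ, FILE_SHARE_READ, IntPtr.Zero,
                                           OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, IntPtr.Zero);
        if (handle.IsInvalid)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return new FileStream(handle, FileAccess.Read);
    }
}
```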
Upvotes: 0
Reputation: 131716
Fast file I/O is less about the specific API calls you make and more about how you architect your application to work with I/O.
If you are performing all of your I/O operations on a single thread in a sequential manner (read a block, transform it, write it, repeat), you are bottlenecking the system's I/O bandwidth in the processing loop of that single thread. An alternative, but more complicated, design is to multithread your application to maximize throughput and avoid wait time. This allows the system to take advantage of both CPU and I/O controller bandwidth simultaneously. A typical design for this is a pipeline: a reader thread that fills an input work queue, a pool of worker threads that transform the blocks, and a writer thread that drains an output queue to disk.
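A rough sketch of that pipeline (the BlockingCollection queues, the 1 MB block size, and the sequence-number reordering in the writer are illustrative choices, not the only way to do it):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

class PipelineCopy
{
    const int BlockSize  = 1 << 20; // 1 MB blocks (tune for your workload)
    const int QueueBound = 16;      // cap in-flight blocks to bound memory use

    record Block(long Sequence, byte[] Data, int Length);

    public static void Copy(string inputPath, string outputPath,
                            Func<byte[], int, byte[]> transform)
    {
        var inputQueue  = new BlockingCollection<Block>(QueueBound);
        var outputQueue = new BlockingCollection<Block>(QueueBound);

        // Reader: fills the input queue with sequenced blocks from the source file.
        var reader = Task.Run(() =>
        {
            using var src = new FileStream(inputPath, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, BlockSize, FileOptions.SequentialScan);
            var buffer = new byte[BlockSize];
            long seq = 0;
            int read;
            while ((read = src.Read(buffer, 0, buffer.Length)) > 0)
                inputQueue.Add(new Block(seq++, (byte[])buffer.Clone(), read));
            inputQueue.CompleteAdding();
        });

        // Workers: transform blocks in parallel; completion order is not guaranteed.
        var workers = Task.Run(() =>
        {
            Parallel.ForEach(inputQueue.GetConsumingEnumerable(), block =>
            {
                var result = transform(block.Data, block.Length);
                outputQueue.Add(block with { Data = result, Length = result.Length });
            });
            outputQueue.CompleteAdding();
        });

        // Writer: uses the sequence numbers to restore the original order before writing.
        var writer = Task.Run(() =>
        {
            using var dst = new FileStream(outputPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, BlockSize);
            var pending = new SortedDictionary<long, Block>();
            long next = 0;
            foreach (var block in outputQueue.GetConsumingEnumerable())
            {
                pending[block.Sequence] = block;
                while (pending.TryGetValue(next, out var ready))
                {
                    dst.Write(ready.Data, 0, ready.Length);
                    pending.Remove(next++);
                }
            }
        });

        Task.WaitAll(reader, workers, writer);
    }
}
```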
This is not an easy architecture to design right, and it requires quite a bit of thought to avoid creating in-memory lock contention or overwhelming the system with concurrent I/O requests. You also need to provide control metadata so that the state of output processing is not managed on the call stack of a thread but rather in the input/output work queues. And you have to make sure that you transform and write the output in the correct order, since with multi-threaded I/O you can't be sure work is processed in a guaranteed order. It's complicated - but it is possible, and it can make a dramatic difference in throughput over a serial approach.
If you really have the time and want to squeeze every ounce of performance from the system, you could also use I/O completion ports - a relatively low-level API - to maximize throughput.
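In managed code, a FileStream opened with FileOptions.Asynchronous already rides on overlapped I/O, which the .NET thread pool services through a completion port on Windows, so you can get some of that benefit without calling the raw API. A minimal sketch, with an illustrative 1 MB buffer:

```csharp
using System.IO;
using System.Threading.Tasks;

class AsyncCopy
{
    // FileOptions.Asynchronous requests overlapped I/O; on Windows the completions
    // are serviced via the thread pool's I/O completion port, with no P/Invoke needed.
    // The 1 MB buffer size is an illustrative choice.
    public static async Task CopyAsync(string inputPath, string outputPath)
    {
        const int BufferSize = 1 << 20;
        using var src = new FileStream(inputPath, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, BufferSize,
                                       FileOptions.Asynchronous | FileOptions.SequentialScan);
        using var dst = new FileStream(outputPath, FileMode.Create, FileAccess.Write,
                                       FileShare.None, BufferSize, FileOptions.Asynchronous);
        await src.CopyToAsync(dst, BufferSize);
    }
}
```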
Good luck.
Upvotes: 12