Reputation: 29
I have to read binary data into char arrays from large (2 GB) binary files in a C++ program. The first time I read the files from my SSD, it takes about 6.4 seconds per file. But when I run the same code again, or even after first running a different dummy program that does almost the same thing, subsequent reads take only about 1.4 seconds per file. The Windows Task Manager also shows much less disk activity on the second, third, fourth… run. So my guess is that the Windows file cache is sparing me from waiting for data from the SSD when filling the arrays again.
Is there a clean way to get the files into the file cache before the customer runs the software? Is there a better option than simply loading the files with fread in advance? And how can I make sure the data remains in the file cache until I need it?
Or am I totally wrong with my File Cache assumption? Is there another (better) explanation for these different loading times?
Upvotes: 0
Views: 529
Reputation: 211
Educated guess here: You most likely are right with your file cache assumption.
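If you want to check the assumption rather than guess, you could time the same read twice in one run. The following is only a minimal sketch in portable C++; the names `ReadResult` and `timed_read` and the 1 MiB chunk size are made up for illustration and are not from your code. It reads the file in chunks, throws the data away, and reports bytes read and wall-clock time.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper type: how much was read and how long it took.
struct ReadResult {
    std::size_t bytes = 0;  // total bytes read from the file
    double seconds = 0.0;   // wall-clock time for the whole read
};

// Reads the file front to back in fixed-size chunks and discards the data.
// If the file cannot be opened, the result reports 0 bytes.
ReadResult timed_read(const std::string& path, std::size_t chunk_size = 1 << 20) {
    std::vector<char> buffer(chunk_size);
    ReadResult result;

    const auto start = std::chrono::steady_clock::now();
    std::ifstream in(path, std::ios::binary);
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // gcount() is also valid for the final, partial chunk.
        result.bytes += static_cast<std::size_t>(in.gcount());
    }
    const auto stop = std::chrono::steady_clock::now();

    result.seconds = std::chrono::duration<double>(stop - start).count();
    return result;
}
```

Calling `timed_read` twice on the same 2 GB file and comparing the two `seconds` values should show whether the second pass is the fast one. If the operating system is serving the second pass from memory, that would be consistent with the cache explanation, although a single measurement does not prove it.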
Can you preload files before the user runs the software? Not directly. How is your program supposed to know that it is going to be run in the next few minutes?
So you probably need a helper mechanism or tricks. The options I see here are:
You can also try to combine the first two options. The key to faster data availability is to figure out what to read in which order, instead of trying to load everything at once en bloc. Divide and conquer.
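To illustrate the "what to read in which order" idea, here is a rough sketch that loads only selected byte ranges of a file, in the order the caller lists them. `Region` and `read_regions` are hypothetical names, and I am assuming you know the offsets of the parts you need first, which your question does not say.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one byte range inside the file.
struct Region {
    std::streamoff offset;  // start position in bytes
    std::size_t length;     // number of bytes wanted
};

// Reads the given regions in the order listed (most urgent first) and
// returns one buffer per region. A region that runs past the end of the
// file, or a file that cannot be opened, yields a shorter (or empty) buffer.
std::vector<std::vector<char>> read_regions(const std::string& path,
                                            const std::vector<Region>& regions) {
    std::vector<std::vector<char>> result;
    std::ifstream in(path, std::ios::binary);

    for (const Region& region : regions) {
        std::vector<char> data(region.length);
        in.clear();  // reset eof/fail state left over from a previous short read
        in.seekg(region.offset);
        in.read(data.data(), static_cast<std::streamsize>(data.size()));
        data.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read
        result.push_back(std::move(data));
    }
    return result;
}
```

The program could start working as soon as the first regions are in memory and fetch the rest later, for example on a background thread. Whether that helps depends on how your data is laid out.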
Without further details on the problem it is impossible to provide more specific solutions though.
Upvotes: 2