Reputation: 11
I am currently working on a problem that requires me to store a large amount of well-structured information in a file. It is more data than I can keep in memory, but I need to access different parts of it very often and would like to do so as quickly as possible (of course). Unfortunately, the file would be large enough that reading through it sequentially would take quite some time as well.
From what I have gathered so far, it seems to me that ACCESS="DIRECT" would be a good way of handling this problem. Do I understand correctly that with direct access, I am basically pointing at a specific chunk of the file and asking "What's in there?" And do I correctly infer from that that the reading time does not depend on the overall file size?
Thank you very much in advance!
Upvotes: 1
Views: 128
Reputation: 37208
You can think of an ACCESS='DIRECT' file as a file consisting of a number of fixed-size records. You can do operations like read or write record #N in O(1) time. That is, in order to access record #N you don't need to scan through all the preceding records #M (M<N) in the file.
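As a minimal sketch of what this looks like in practice, assuming each record is a small fixed-size array of reals (the file name, record layout, and record count here are made up for illustration):

```fortran
program direct_access_demo
  implicit none
  integer, parameter :: n_values  = 4     ! values per record (assumed layout)
  integer, parameter :: n_records = 1000  ! hypothetical number of records
  real    :: buffer(n_values)
  integer :: u, rec_len, i

  ! Ask the compiler for the record length in its own units, so the code
  ! does not depend on whether RECL counts bytes or words.
  inquire(iolength=rec_len) buffer

  open(newunit=u, file='records.dat', access='direct', form='unformatted', &
       recl=rec_len, status='replace', action='readwrite')

  ! Fill the file: record i holds n_values copies of the value i.
  do i = 1, n_records
     buffer = real(i)
     write(u, rec=i) buffer
  end do

  ! Jump straight to record 742 without reading records 1..741.
  read(u, rec=742) buffer
  if (any(buffer /= 742.0)) error stop 'unexpected record contents'
  print *, 'record 742 holds:', buffer

  close(u, status='delete')
end program direct_access_demo
```

Using INQUIRE(IOLENGTH=...) for RECL is a deliberate choice: the unit in which RECL is measured is processor-dependent, so hard-coding a byte count may not be portable.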
If this maps reasonably well to the problem you're trying to solve, then ACCESS='DIRECT' might be the correct solution in your case. If not, ACCESS='STREAM' offers a little bit more flexibility in that the size of each record does not need to be fixed, though you need to be able to compute the correct file offset yourself. If you need even more flexibility, there are options like NetCDF, or HDF5 like @HighPerformanceMark suggested, or even something like sqlite.
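A rough sketch of the stream-access variant, assuming variable-length records whose start positions and lengths are kept in a small in-memory index (the file name and the index arrays are hypothetical; in a real application the index would probably be stored in the file or alongside it):

```fortran
program stream_access_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n_records = 5     ! hypothetical number of records
  integer(int64) :: offsets(n_records)    ! start position of each record
  integer        :: lengths(n_records)    ! number of values in each record
  real, allocatable :: rec(:)
  integer :: u, i

  open(newunit=u, file='records.bin', access='stream', form='unformatted', &
       status='replace', action='readwrite')

  ! Write records of different lengths, remembering where each one starts.
  do i = 1, n_records
     lengths(i) = i                        ! record i holds i values
     inquire(unit=u, pos=offsets(i))       ! current position (1-based)
     allocate(rec(lengths(i)))
     rec = real(i)
     write(u) rec
     deallocate(rec)
  end do

  ! Read record 4 directly by seeking to its stored offset.
  allocate(rec(lengths(4)))
  read(u, pos=offsets(4)) rec
  if (any(rec /= 4.0)) error stop 'unexpected record contents'
  print *, 'record 4 holds:', rec

  close(u, status='delete')
end program stream_access_demo
```

The offsets are stored as 64-bit integers here because a default integer may overflow for files larger than about 2 GB.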
Upvotes: 2