Reputation: 1651
I was wondering which approach is better in this case.
I have to read in thousands of files. I was thinking of either opening each file, reading it, and closing it, or cat-ing all the files into one file and reading that.
Suggestions? This is all in Perl.
Upvotes: 0
Views: 362
Reputation: 186
Note that cat * can fail if the number of files is greater than your ulimit -n value, so a sequential read can actually be safer.
Also, consider using opendir and readdir instead of glob if all your files are located in the same directory.
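A minimal sketch of that approach, assuming the files live in a hypothetical ./data directory and you only want plain files:

    use strict;
    use warnings;

    my $dir = './data';    # hypothetical directory holding the input files

    # Build the file list with opendir/readdir instead of glob
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @files = grep { -f "$dir/$_" } readdir($dh);    # keep plain files, skip . and ..
    closedir($dh);

You can then loop over @files and open each one in turn.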
Upvotes: 1
Reputation: 15381
If the time spent cat-ing all the files into one bigger file doesn't matter, reading the single file will be faster (but only when reading it sequentially, which is the default).
Of course, if you count the concatenation step itself, the whole process will be much slower, because you have to read, write, and then read again.
In general, reading one file of 1000M should be faster than reading 100 files of 10M, because with 100 files you also have to look up the metadata for each one.
As tchrist says, the performance difference might not be important. I think it depends on the type of file (e.g. for a huge number of very small files the difference would be much larger) and on the overall performance of your system and its storage.
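If you want to measure it on your own system, Perl's core Benchmark module can compare the two approaches. A rough sketch, assuming a hypothetical set of small files in ./data/*.txt and an all.txt produced by cat-ing them together:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my @small_files = glob('./data/*.txt');   # hypothetical individual files
    my $big_file    = './all.txt';            # hypothetical concatenated file

    cmpthese(5, {
        'many small files' => sub {
            for my $f (@small_files) {
                open(my $fh, '<', $f) or die "Cannot open $f: $!";
                while (my $line = <$fh>) { }   # just read through the file
                close($fh);
            }
        },
        'one big file' => sub {
            open(my $fh, '<', $big_file) or die "Cannot open $big_file: $!";
            while (my $line = <$fh>) { }
            close($fh);
        },
    });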
Upvotes: 2
Reputation: 37506
Just read the files sequentially. Perl's file I/O functions are pretty thin wrappers around the native file I/O calls in the OS, so there isn't much point in fretting about the performance of simple file I/O.
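For example, a straightforward sequential loop, assuming the file names are passed on the command line:

    use strict;
    use warnings;

    my @files = @ARGV;    # file names passed on the command line

    for my $path (@files) {
        open(my $fh, '<', $path) or die "Cannot open $path: $!";
        while (my $line = <$fh>) {
            # process $line here
        }
        close($fh);    # only one handle is open at a time
    }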
Upvotes: 0
Reputation: 80384
It shouldn't make that much of a difference. This sounds like premature optimization to me.
Upvotes: 6