Reputation: 167
I ran my code through a profiler and saw that most of the time (60%) is spent reading files. It only takes a few milliseconds to run, but I was wondering if I can make it any faster. My code has a list of 20 files (from 1k to 1M in size). It opens one, reads the entire file into RAM, processes it (sequentially, reading everything once), then repeats open/read/process/close for the rest of the files, reusing the same buffer.
Is there a way to make anything faster? I tried using posix_fadvise with POSIX_FADV_SEQUENTIAL and POSIX_FADV_WILLNEED, passing the file offset and len as 0, 0 and as 0, st_size. It didn't seem to make a difference. I haven't yet written code to open all the files before reading. Would that make a difference? Should I be using posix_fadvise on all of them? Should I be using POSIX_FADV_SEQUENTIAL or POSIX_FADV_WILLNEED?
Upvotes: 0
Views: 1160
Reputation: 12708
In my opinion, this code, taken from the first edition of "The C Programming Language", is quite difficult to beat:
#include <stdio.h>

int main(void)
{
    int c;

    /* copy standard input to standard output, one character at a time */
    while ((c = getchar()) != EOF)
        putchar(c);
    return 0;
}
Upvotes: 0
Reputation: 155724
fadvise only really helps if:
If you're just slurping the whole file into RAM up front immediately after opening it, there's not much to be optimized; the file has to be read from beginning to end, and you haven't given the OS enough warning to cache it. Things to consider:
- fadvise-ing file n+1 just before you begin reading from file n (so the OS is caching the next file in while you're processing the current file)
- mmap + madvise(WILLNEED) to avoid the need to copy the file from kernel to user buffers all at once before you can begin processing; if the processing of the file is expensive enough, the subsequent pages may be read in by the time you've finished processing the early pages in the file.
Given these are small files, I'd just stick with WILLNEED; SEQUENTIAL enlarges the read-ahead buffer, but you're going to read the whole file anyway (possibly in bulk, where SEQUENTIAL won't help much) so you may as well cache the whole thing in as quickly as possible.
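A rough sketch of the "fadvise file n+1 before reading file n" idea, assuming Linux with glibc (posix_fadvise is not available everywhere). The names process_files, open_with_prefetch, read_all and count_newlines are made up for illustration, and count_newlines only stands in for whatever processing you actually do. Whether this helps at all depends on the files not already being in the page cache, so measure before keeping it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Callback type for the per-file work (hypothetical; stands in for the
 * real processing step). */
typedef void (*process_fn)(const char *data, size_t len, void *ctx);

/* Open a file read-only and hint that the whole file will be needed soon.
 * Returns the descriptor, or -1 if open() fails. */
static int open_with_prefetch(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        /* offset 0 with len 0 covers the file through its end. The call is
         * only advisory, so a failure here is deliberately ignored. */
        (void)posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    }
    return fd;
}

/* Read up to cap bytes from fd into buf. Returns the byte count, or -1. */
static ssize_t read_all(int fd, char *buf, size_t cap)
{
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0)
            return -1;
        if (n == 0)
            break; /* end of file */
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/* Process each file in order, reusing buf. File i+1 is opened and
 * fadvise'd before file i is read, so the kernel can fetch it while
 * file i is being processed. Files that cannot be opened or read are
 * skipped; files larger than cap are truncated to cap bytes in this sketch.
 * Returns the total number of bytes handed to process. */
static size_t process_files(const char *const *paths, size_t count,
                            char *buf, size_t cap,
                            process_fn process, void *ctx)
{
    assert(buf != NULL && cap > 0);
    size_t total = 0;
    int next_fd = (count > 0) ? open_with_prefetch(paths[0]) : -1;

    for (size_t i = 0; i < count; i++) {
        int fd = next_fd;
        /* Issue the hint for the next file before touching this one. */
        next_fd = (i + 1 < count) ? open_with_prefetch(paths[i + 1]) : -1;
        if (fd < 0)
            continue;
        ssize_t n = read_all(fd, buf, cap);
        close(fd);
        if (n >= 0) {
            process(buf, (size_t)n, ctx);
            total += (size_t)n;
        }
    }
    return total;
}

/* Stand-in processing step: counts newline characters into *ctx. */
static void count_newlines(const char *data, size_t len, void *ctx)
{
    size_t *count = ctx;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            (*count)++;
}
```

You'd call it with your list of 20 paths, your existing buffer (sized for the largest file, 1M here) and your own callback in place of count_newlines. Only one extra descriptor is open at any time, so there's no need to open all the files up front.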
Upvotes: 4