Reputation: 1414
Good afternoon. We are building a prototype of a Windows/Linux deduper using the memory-mapped file APIs of Windows and Linux. Our deduper starts out by doing a sequential scan of all the database records to be deduped. Therefore, we pass the flag FILE_FLAG_SEQUENTIAL_SCAN to the Windows API CreateFile during our initial sequential scan of the database records. Once we finish the first part of our deduping process, we use the Windows memory mapping API to randomly access the data. At this point, using the Windows C++ API, is it possible to dynamically change to the FILE_FLAG_RANDOM_ACCESS mode?
In Linux, we are able to do this with the following excerpt of code:
MapPtr = (char*)mmap((void*)BaseMapPtr, mappedlength, PROT_READ,
                     MAP_PRIVATE, hFile, baseoff);
if (MapPtr == MAP_FAILED) {
    perror("mmap");
    throw cException(ERR_MEMORYMAPPING, TempFileName);
}
// Tell the kernel to expect random access over the mapped range.
madvise(MapPtr, mappedlength, MADV_RANDOM);
Are we paying a penalty in Windows by using FILE_FLAG_SEQUENTIAL_SCAN during the random access phase of our deduping process? Thank you.
Upvotes: 3
Views: 1530
Reputation: 7164
Just to back up @Burkes answer: as you mentioned that you were "using the memory mapped file API of Windows", it should be noted that Raymond Chen warns that the cache hints have no effect on memory-mapped I/O:
Note: These cache hints apply only if you use ReadFile (or moral equivalents). Memory-mapped file access does not go through the cache manager, and consequently these cache hints have no effect.
So whatever already happens to be cached may help, but future memory-mapped accesses will not populate or depopulate the cache.
Upvotes: 0
Reputation: 3718
The caching hint flags passed to CreateFile() do not affect the manner in which the memory manager satisfies page faults generated by dereferencing an address within a mapped section. Such I/Os use the same cache pages as regular I/O.
That said, when a handle to the file is created with FILE_FLAG_SEQUENTIAL_SCAN, the cache manager may perform read-ahead operations (and may even read the entire file into memory, if system conditions allow for this). This means that you may encounter fewer hard page faults if you sequentially access the pages of the mapped file.
It seems to me that you'd be best served by simply reusing the handle you created during your de-dup processing. Closing and reopening may cause previously cached pages of the file to be discarded, depending on cache pressure.
Upvotes: 5
Reputation: 308452
A description of how FILE_FLAG_SEQUENTIAL_SCAN works can be found here: http://support.microsoft.com/kb/98756
As it is only used with the CreateFile function, there's no way to update it once the file is opened. You can always close the file and reopen it with a different flag.
Upvotes: 3