Reputation: 23
I am working on a project that requires opening large files (hundreds of GB, possibly TBs). I need to modify these files in place, so my plan is to memory-map each file rather than read the original, apply the changes, and write the result to a second file.
This is what I have for this idea:
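    /* Open the existing file for reading and writing, with no sharing. */
    hFile = CreateFile(filename, (GENERIC_READ | GENERIC_WRITE), 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return;
    }
    /* A maximum size of 0/0 makes the mapping object as large as the file. */
    hFileMap = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, 0, NULL);
    if (hFileMap == NULL) {
        CloseHandle(hFile);
        return;
    }
    /* Map `amount` bytes starting at file offset 0. */
    mapView = MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, 0, 0, amount);
    if (mapView == NULL) {
        /* Close the handles in the reverse order of creation. */
        CloseHandle(hFileMap);
        CloseHandle(hFile);
        return;
    }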
After reading more about MapViewOfFile, it seems that the view is mapped into the program's virtual address space. From what I have read, the maximum size on a 64-bit machine is 2^64 bytes (16 exabytes), and on 32-bit it is 2GB.
If the 64-bit number is correct, I wouldn't need to chunk the file or create multiple views. But on 32-bit, if I come across a large file (>2GB), would I need to chunk it?
Is the amount also limited by RAM or HDD space?
Upvotes: 1
Views: 836
Reputation: 101666
The effective limit on 32-bit is much smaller than 2GB. How small is impossible to say. The problem is that the address space is shared with other memory allocations, special system pages and other loaded .DLLs.
For example, if your application uses the open/save common dialogs, this will load COM and whatever 3rd-party shell extensions you might have installed (NSEs, icon overlays, context menus etc.). These .DLLs might stay loaded for a while (some never unload), fragmenting your address space and leaving no large contiguous range of free addresses.
A way to mitigate this is to call VirtualAlloc early in your program to reserve a large range, 500-1000MB perhaps. Release the reserved range just before you map the file.
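A minimal sketch of that reserve-then-release idea, not a definitive implementation. The helper names reserve_address_range and release_address_range are made up for illustration. On Windows they wrap VirtualAlloc with MEM_RESERVE and VirtualFree with MEM_RELEASE; the non-Windows branch (mmap with PROT_NONE) is only an assumption added so the sketch compiles elsewhere and is not part of the advice above.

```c
#ifdef _WIN32
#include <windows.h>
#else
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <sys/mman.h>
#endif
#include <stddef.h>

/* Reserve (but do not commit) `size` bytes of contiguous address space.
   Hypothetical helper. Returns NULL on failure. */
void *reserve_address_range(size_t size)
{
#ifdef _WIN32
    /* MEM_RESERVE claims addresses only; no physical storage is committed. */
    return VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
#else
    /* Stand-in for non-Windows builds: an inaccessible anonymous mapping. */
    void *p = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
#endif
}

/* Give the reserved range back just before mapping the file.
   Hypothetical helper. Returns 1 on success, 0 on failure. */
int release_address_range(void *base, size_t size)
{
    if (base == NULL) {
        return 0;
    }
#ifdef _WIN32
    (void)size;  /* MEM_RELEASE requires a size of 0 and frees the whole reservation */
    return VirtualFree(base, 0, MEM_RELEASE) ? 1 : 0;
#else
    return munmap(base, size) == 0 ? 1 : 0;
#endif
}
```

You would call reserve_address_range near the top of main, before any dialogs or plugins load, and release_address_range immediately before CreateFileMapping/MapViewOfFile. Note that nothing guarantees the view lands in the freed range: another thread can allocate from it in between, so the MapViewOfFile result still has to be checked.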
If your files are always > 1GB and you want to support 32-bit then you must implement an alternative routine that maps in smaller chunks. For 64-bit, the address space is so large that you should have no issues with lack of free addresses.
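When mapping in chunks, the file offset passed to MapViewOfFile must be a multiple of the system allocation granularity (dwAllocationGranularity from GetSystemInfo, commonly 64KB). The sketch below shows only that arithmetic, assuming the caller supplies the granularity. ViewWindow and compute_view_window are hypothetical names, and the code makes no Windows calls itself.

```c
#include <stddef.h>
#include <stdint.h>

/* One view of the file: where it starts, how many bytes to map, and where
   the byte the caller asked for sits inside the view. Hypothetical type. */
typedef struct {
    uint64_t view_offset;  /* file offset of the view, granularity-aligned */
    size_t   view_size;    /* number of bytes to map (0 means invalid)     */
    size_t   delta;        /* index of the requested byte within the view  */
} ViewWindow;

/* Compute the view needed to access `want_len` bytes at `want_offset`.
   The view is clipped at the end of the file. */
ViewWindow compute_view_window(uint64_t file_size, uint64_t want_offset,
                               size_t want_len, uint32_t granularity)
{
    ViewWindow w = {0, 0, 0};
    if (granularity == 0 || want_offset >= file_size) {
        return w;  /* nothing to map */
    }
    /* Round the offset down to the nearest multiple of the granularity. */
    w.view_offset = want_offset - (want_offset % granularity);
    w.delta = (size_t)(want_offset - w.view_offset);

    uint64_t remaining = file_size - w.view_offset;
    uint64_t need = (uint64_t)w.delta + want_len;
    w.view_size = (size_t)(need < remaining ? need : remaining);
    return w;
}

/* On Windows the result would be used roughly like this:
 *   DWORD hi = (DWORD)(w.view_offset >> 32);
 *   DWORD lo = (DWORD)(w.view_offset & 0xFFFFFFFFu);
 *   BYTE *base = MapViewOfFile(hFileMap, FILE_MAP_ALL_ACCESS, hi, lo, w.view_size);
 *   BYTE *data = base + w.delta;   // work on the chunk
 *   UnmapViewOfFile(base);         // then move on to the next chunk
 */
```

Keeping each view to a modest size (tens or a few hundred MB) makes it much more likely that a 32-bit process finds a free contiguous range for it.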
Mapping files into memory is not free: it requires some free RAM for the page tables. The exact cost is probably architecture-specific.
Upvotes: 3