Sebastian

Reputation: 972

MemoryMappedFile with very slow CreateViewStream

I'm using a memory-mapped file that holds approx. 100 GB of data. When I call CreateViewStream on that file, it takes 30 minutes to create the stream. It seems to be because of the size of the memory-mapped file, but why does it take so long? Does it copy the whole file into managed memory?

Strangely, it takes much longer when I write the file with a FileStream and then access it without a reboot in between.

Upvotes: 5

Views: 2590

Answers (2)

displayName

Reputation: 14379

This is a difficult one to answer without the code or knowledge of your main memory and architecture, so I can only guess at some important pointers:

  1. Do you have enough RAM? Straight off, if you refer to an address that has not yet been loaded into RAM, a page fault occurs behind the scenes and the data is read into RAM for you. Your program doesn’t notice this activity because your thread is suspended while the page fault is processed. Good article here.
  2. Another important point from the same article: you have no control over how much of the MMF is kept in memory or for how long. This means that using an MMF may push other things out of RAM, such as code or data pages that you will need back “soon”, which results in slower execution. I especially want to point anyone reading this answer to another answer here, so that we have a clear idea of how slow this slowness is in terms of processor cycles.
  3. Next, you are creating a stream. Streams are good for sequential access, while you might be trying to read from or write to the file randomly.

Regarding the end-to-end run time of your code in the FileStream vs. MMF approach, I think you should run the tests afresh, because running your first approach might leave a warmed-up cache for the second one. The results won't be correct then.

According to the MSDN documentation of MMF,

Memory-mapped files enable programmers to work with extremely large files because memory can be managed concurrently, and they allow complete, random access to a file without the need for seeking.

The way an MMF works is that the entire file (or a portion of it) is mapped as virtual memory, which is paged in and out of memory by the OS transparently as you access portions of the file. This is why MMFs are good for working with large files in the first place.

You can be smarter and map only a part of the file, then perform random access on that part, by making use of:

using (var accessor = mmf.CreateViewAccessor(offset, length))
{
    //Here you have access to a specific part of the file
}

so that you have access to a view, with the specified offset and size, of your mammoth file's memory mapping.
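
To make the windowed approach above a bit more concrete, here is a minimal sketch, assuming you only need a small byte range at a time. The names `WindowedReader`, `SumWindow`, and the temp file used in `Main` are hypothetical and exist only for illustration; the calls themselves (`MemoryMappedFile.CreateFromFile`, `CreateViewAccessor(offset, size)`, `ReadArray`) are the standard `System.IO.MemoryMappedFiles` API.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class WindowedReader
{
    // Hypothetical helper: maps only [offset, offset + length) of the file
    // and sums the bytes in that window.
    public static long SumWindow(string path, long offset, int length)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var accessor = mmf.CreateViewAccessor(offset, length))
        {
            var buffer = new byte[length];
            // Positions passed to the accessor are relative to the start of the view,
            // not to the start of the file.
            int read = accessor.ReadArray(0, buffer, 0, length);

            long sum = 0;
            for (int i = 0; i < read; i++)
            {
                sum += buffer[i];
            }
            return sum;
        }
    }

    static void Main()
    {
        // Small stand-in file for the demo; a real run would point at the large file.
        string path = Path.GetTempFileName();
        try
        {
            var data = new byte[1024];
            for (int i = 0; i < data.Length; i++)
            {
                data[i] = (byte)(i % 256);
            }
            File.WriteAllBytes(path, data);

            // Map and read only 4 bytes starting at file offset 256.
            Console.WriteLine(SumWindow(path, 256, 4));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Only the requested window is mapped here, so the cost of creating the view should not depend on the total file size. Whether that helps in your case depends on your access pattern.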

Upvotes: 3

willaien

Reputation: 2797

I'm unable to replicate these issues. Here's the code I used to test:

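    using System;
    using System.Diagnostics;
    using System.IO.MemoryMappedFiles;

    class Program
    {
        static void Main(string[] args)
        {
            var sw = Stopwatch.StartNew();
            // Map the whole file and open a stream over the entire mapping.
            using (var mmf = MemoryMappedFile.CreateFromFile(@"f:\test.bin"))
            using (var stream = mmf.CreateViewStream())
            {
                for (int i = 0; i < 100000; i++)
                {
                    stream.ReadByte();
                }
            }
            Console.WriteLine(sw.Elapsed);
        }
    }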

f:\test.bin is a 100 GB zero-filled file that I generated for the purposes of this test. I'm able to create the MemoryMappedFile, then run CreateViewStream and read 100,000 bytes from it, in 3.7s total.
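
If it helps narrow things down, the same test can be split so that each phase is timed on its own. The following is only a sketch; `PhaseTimer` and `Measure` are hypothetical names, and the file path is taken from the first command-line argument rather than hard-coded.

```csharp
using System;
using System.Diagnostics;
using System.IO.MemoryMappedFiles;

static class PhaseTimer
{
    // Hypothetical helper: returns the elapsed time of each phase as
    // [0] CreateFromFile, [1] CreateViewStream, [2] the ReadByte loop.
    public static TimeSpan[] Measure(string path, int bytesToRead)
    {
        var phases = new TimeSpan[3];
        var sw = Stopwatch.StartNew();

        using (var mmf = MemoryMappedFile.CreateFromFile(path))
        {
            phases[0] = sw.Elapsed;
            sw.Restart();

            using (var stream = mmf.CreateViewStream())
            {
                phases[1] = sw.Elapsed;
                sw.Restart();

                for (int i = 0; i < bytesToRead; i++)
                {
                    stream.ReadByte();
                }
                phases[2] = sw.Elapsed;
            }
        }
        return phases;
    }

    static void Main(string[] args)
    {
        // args[0]: path of the file to map (an assumption of this sketch).
        TimeSpan[] t = Measure(args[0], 100000);
        Console.WriteLine("CreateFromFile:   " + t[0]);
        Console.WriteLine("CreateViewStream: " + t[1]);
        Console.WriteLine("ReadByte loop:    " + t[2]);
    }
}
```

If CreateViewStream really is the slow step, it should show up in the second figure rather than in the read loop.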

Please provide sample code that's exhibiting the behavior you've described and I'll be glad to pick it apart and see what's going on.

Upvotes: 4
