Andy

Reputation: 11482

alternative to MemoryStream for large data volumes

I'm having problems with out of memory exceptions when using a .Net MemoryStream if the data is large and the process is 32 bit.

I believe that the System.IO.Packaging API silently switches from memory to file-backed storage as the data volume increases, and on the face of it, it seems it would be possible to implement a subclass of MemoryStream that does exactly the same thing.

Does anyone know of such an implementation? I'm pretty sure there is nothing in the framework itself.
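
Roughly, the kind of thing I have in mind (just a sketch; the threshold and the temp-file handling are placeholders, and wrapping a Stream is simpler than actually subclassing MemoryStream):

    using System.IO;

    // Sketch only: starts in memory, spills to a delete-on-close temp file
    // once the data grows past a threshold. Not production code.
    public sealed class SpillOverStream : Stream
    {
        private Stream inner = new MemoryStream();
        private readonly long threshold;
        private bool spilled;

        public SpillOverStream(long threshold = 16 * 1024 * 1024)
        {
            this.threshold = threshold;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            if (!spilled && inner.Length + count > threshold)
            {
                var file = new FileStream(Path.GetTempFileName(), FileMode.Create,
                    FileAccess.ReadWrite, FileShare.None, 4096, FileOptions.DeleteOnClose);
                long pos = inner.Position;
                inner.Position = 0;
                inner.CopyTo(file);      // move the in-memory bytes to disk
                file.Position = pos;     // keep the caller's position intact
                inner.Dispose();
                inner = file;
                spilled = true;
            }
            inner.Write(buffer, offset, count);
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            return inner.Read(buffer, offset, count);
        }

        public override long Seek(long offset, SeekOrigin origin) { return inner.Seek(offset, origin); }
        public override void SetLength(long value) { inner.SetLength(value); }
        public override void Flush() { inner.Flush(); }
        public override bool CanRead { get { return inner.CanRead; } }
        public override bool CanSeek { get { return inner.CanSeek; } }
        public override bool CanWrite { get { return inner.CanWrite; } }
        public override long Length { get { return inner.Length; } }
        public override long Position
        {
            get { return inner.Position; }
            set { inner.Position = value; }
        }

        protected override void Dispose(bool disposing)
        {
            if (disposing) inner.Dispose();
            base.Dispose(disposing);
        }
    }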

Upvotes: 4

Views: 15581

Answers (4)

Steve M

Reputation: 629

We've run into similar obstacles on my team. Some commenters have suggested that developers need to be more okay with using files. If using the filesystem directly is an option, do that; but it isn't always.

If, as in our case, you want to pass data read from a file around your application, you can't just pass the FileStream object, because it can get disposed before you're done reading the data. We originally resorted to MemoryStreams to let us pass the data around easily, but ran into the same problem.

We've used a couple of different workarounds to mitigate the problem:

  • Implement a wrapper class that stores the data in multiple byte[] objects (since a single array is still limited to int.MaxValue entries) and exposes methods that let you treat them almost like a Stream; see the sketch after this list. We still try to avoid this at all costs.
  • Use some sort of "token" to pass a reference to the location of the data and wait to load the data "just in time" in the application.
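
A very stripped-down sketch of the first option (the chunk size is arbitrary and argument validation is omitted):

    using System;
    using System.Collections.Generic;
    using System.IO;

    // Sketch of the wrapper idea: data lives in fixed-size byte[] chunks, so no
    // single allocation ever needs to be contiguous or approach the 2 GB limit.
    public sealed class ChunkedStream : Stream
    {
        private const int ChunkSize = 1 << 20; // 1 MB per chunk, arbitrary
        private readonly List<byte[]> chunks = new List<byte[]>();
        private long length;
        private long position;

        public override void Write(byte[] buffer, int offset, int count)
        {
            while (count > 0)
            {
                int chunk = (int)(position / ChunkSize);
                int chunkOffset = (int)(position % ChunkSize);
                while (chunks.Count <= chunk) chunks.Add(new byte[ChunkSize]);
                int n = Math.Min(count, ChunkSize - chunkOffset);
                Buffer.BlockCopy(buffer, offset, chunks[chunk], chunkOffset, n);
                position += n; offset += n; count -= n;
                if (position > length) length = position;
            }
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            int total = 0;
            while (count > 0 && position < length)
            {
                int chunk = (int)(position / ChunkSize);
                int chunkOffset = (int)(position % ChunkSize);
                int n = (int)Math.Min(Math.Min(count, ChunkSize - chunkOffset), length - position);
                Buffer.BlockCopy(chunks[chunk], chunkOffset, buffer, offset, n);
                position += n; offset += n; count -= n; total += n;
            }
            return total;
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            if (origin == SeekOrigin.Begin) position = offset;
            else if (origin == SeekOrigin.Current) position += offset;
            else position = length + offset;
            return position;
        }

        public override void SetLength(long value) { length = value; }
        public override void Flush() { }
        public override bool CanRead { get { return true; } }
        public override bool CanSeek { get { return true; } }
        public override bool CanWrite { get { return true; } }
        public override long Length { get { return length; } }
        public override long Position
        {
            get { return position; }
            set { position = value; }
        }
    }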

Upvotes: 1

Andrew Long

Reputation: 11

I'd suggest checking out this project.

http://www.codeproject.com/Articles/348590/A-replacement-for-MemoryStream

I believe the problem with memory streams comes from the fact that underneath it all they are still a fancy wrapper around a single byte[], and so are still constrained by .NET's requirement that no object can be larger than 2 GB, even in 64-bit programs. The above implementation breaks that single byte[] into several smaller byte[]s.
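
For illustration, a loop like this on a 32-bit process will typically fail long before 2 GB, because the stream keeps doubling one contiguous buffer (the exact failure point depends on address-space fragmentation):

    using System.IO;

    class MemoryStreamGrowthDemo
    {
        static void Main()
        {
            // Each time a MemoryStream outgrows its capacity it allocates a
            // new buffer twice the size and copies, so the old and new byte[]
            // briefly coexist; on 32-bit this usually fails well before 2 GB.
            var ms = new MemoryStream();
            var block = new byte[64 * 1024];
            for (long written = 0; written < 1L << 30; written += block.Length)
                ms.Write(block, 0, block.Length); // eventually OutOfMemoryException
        }
    }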

Upvotes: -1

Hans Passant

Reputation: 941705

Programmers try too hard to avoid using a file. The difference between memory and a file is a very small one in Windows. Any memory you use for a MemoryStream in fact requires a file: the storage is backed by the paging file, c:\pagefile.sys. And the reverse is true as well, any file you use is backed by memory, since file data is cached in RAM by the file system cache. So if the machine has sufficient RAM then you will in fact only read and write from/to memory when you use a FileStream, and you get the perf you expect from memory. This is entirely free; you don't have to write any code to enable it, nor do you have to manage it.

If the machine doesn't have enough RAM then both approaches deteriorate the same way. When you use a MemoryStream, the paging file starts thrashing and you'll be slowed down by the disk. When you use a file, the data won't fit in the file system cache and you'll be slowed down by the disk.

You'll of course get the benefit of using a file: you won't run out of memory anymore. Use a FileStream instead.
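
A minimal way to do that (the temp-file naming and options here are just one reasonable choice):

    using System.IO;

    class TempFileStreamDemo
    {
        static void Main()
        {
            byte[] data = new byte[100 * 1024 * 1024]; // 100 MB payload

            // DeleteOnClose makes the temp file clean itself up; while the
            // machine has free RAM the file system cache serves the I/O, so
            // this behaves much like a MemoryStream without the 32-bit limit.
            using (var stream = new FileStream(Path.GetTempFileName(),
                FileMode.Create, FileAccess.ReadWrite, FileShare.None,
                4096, FileOptions.DeleteOnClose))
            {
                stream.Write(data, 0, data.Length);
                stream.Position = 0; // rewind and read back like a MemoryStream
            }
        }
    }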

Upvotes: 9

Swift

Reputation: 1881

This is expected behavior with MemoryStream, so you should implement your own logic or use an external class. Here is a post that explains the problems with MemoryStream and big data, and offers an alternative: A replacement for MemoryStream (http://www.codeproject.com/Articles/348590/A-replacement-for-MemoryStream).

Upvotes: 2
