Reputation: 5870
The MongoDB GridFS documentation says its big advantage is that it splits a large file into chunks, so you don't have to load the entire file into memory if you only want to see part of it. My confusion is that even when I open a big file from local disk, I can use the skip() API to load just the part of the file I want. I don't have to load the entire file at all. So why does MongoDB call that an advantage?
Upvotes: 4
Views: 525
Reputation: 110
Although cursor.skip() does not return the skipped documents, the server still has to read through them. It has to walk from the beginning of the collection or index to reach the offset (the skip position) before it starts returning results. This hardly matters when the collection is small, but as the offset increases, cursor.skip() becomes slower and more CPU intensive. With larger collections, cursor.skip() may become IO bound.
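To illustrate the cost difference described above, here is a toy model in plain Python. It is not MongoDB code: `skip_scan` and `index_seek` are hypothetical names, and a sorted list stands in for an index. The sketch only compares how many entries each approach touches before reaching the same position.

```python
def skip_scan(sorted_keys, offset):
    """Mimic skip(): step over `offset` entries one by one (toy model)."""
    examined = 0
    index = 0
    while index < offset:
        examined += 1  # every skipped entry still has to be visited
        index += 1
    return sorted_keys[index], examined


def index_seek(sorted_keys, start_key):
    """Mimic an indexed lookup: binary-search straight to `start_key`."""
    low, high = 0, len(sorted_keys)
    comparisons = 0
    while low < high:
        mid = (low + high) // 2
        comparisons += 1
        if sorted_keys[mid] < start_key:
            low = mid + 1
        else:
            high = mid
    return sorted_keys[low], comparisons


keys = list(range(0, 1_000_000, 10))  # 100,000 sorted keys
value_a, examined = skip_scan(keys, 50_000)
value_b, comparisons = index_seek(keys, 500_000)
print(value_a, examined)      # same key, 50,000 entries visited
print(value_b, comparisons)   # same key, only a handful of comparisons
```

The work done by the scan grows with the offset, while the indexed seek grows only with the logarithm of the collection size.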
GridFS, however, does not store a file in a single document. It divides the file into parts, or chunks, and stores each chunk as a separate document. This lets the user access arbitrary sections of a file, such as “skipping” to the middle of it (looking the file up by id or filename), without being CPU intensive.
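The chunk lookup can be sketched roughly as follows. This is a minimal in-memory simulation, not a driver API: the function names are hypothetical, and a dict keyed by `(files_id, n)` stands in for the chunks collection. It assumes the GridFS default chunk size of 255 KiB unless another size is passed, and it does not handle ranges that run past the end of the file.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size, in bytes


def split_into_chunks(files_id, payload, chunk_size=DEFAULT_CHUNK_SIZE):
    """Split `payload` into numbered chunks, keyed by (files_id, n)."""
    return {
        (files_id, n): payload[start:start + chunk_size]
        for n, start in enumerate(range(0, len(payload), chunk_size))
    }


def chunk_numbers_for_range(offset, length, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return the chunk numbers that cover bytes [offset, offset + length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


def read_range(chunks, files_id, offset, length, chunk_size=DEFAULT_CHUNK_SIZE):
    """Fetch only the chunks that overlap the requested byte range."""
    numbers = chunk_numbers_for_range(offset, length, chunk_size)
    data = b"".join(chunks[(files_id, n)] for n in numbers)
    start = offset - numbers[0] * chunk_size  # position inside the first chunk
    return data[start:start + length]


payload = bytes(range(256)) * 4                      # 1024-byte demo "file"
chunks = split_into_chunks("demo", payload, chunk_size=100)
needed = list(chunk_numbers_for_range(250, 120, chunk_size=100))
part = read_range(chunks, "demo", 250, 120, chunk_size=100)
print(needed)                       # [2, 3]
print(part == payload[250:370])     # True
```

Only two of the eleven chunks are touched here. With a real driver the same idea is exposed through the file object; in PyMongo, for instance, a `GridOut` supports `seek()` and `read()`.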
Official documentation: 1. cursor.skip() 2. GridFS.
Update:
About what Peter Brittain is suggesting:
There are many things to consider (infrastructure, expected usage, file size, etc.) when choosing between the filesystem and GridFS.
For example, if you have millions of files, GridFS tends to handle them better. You also need to consider filesystem limitations such as the maximum number of files per directory.
You might want to consider going through this article: Why use GridFS over ordinary Filesystem Storage?
Upvotes: 3