LeoMan

Reputation: 327

How does MongoDB update a document if the updated field is larger than the original one?

I would like to know how MongoDB handles an update to a field when the updated value is larger than the original one. Does it rewrite the whole document?

For example, if the field is a string of 10 chars and the update is 12 chars, how does MongoDB handle the update?

Thanks

Upvotes: 0

Views: 405

Answers (2)

Joe

Reputation: 28366

I think you are referring to update in place as described in this blog. Note that the blog post is from 11 years ago.

That only applies when MongoDB is using the memory-mapped storage engine (MMAPv1). With MMAPv1, MongoDB maps each uncompressed data file into memory, and the operating system manages paging those in and out as necessary. Each document is stored with some empty space at the end as padding for future updates.

When a document is updated, it can be updated in place as long as the new size does not exceed the old size plus the padding. The document only has to be relocated when it has grown enough to exhaust the padding.
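The padding rule can be sketched in a few lines. This is only a toy model of the check described above, not MongoDB's actual code; the function name `fits_in_place` and the byte counts are made up for illustration.

```python
def fits_in_place(old_size: int, padding: int, new_size: int) -> bool:
    """Toy check: does the grown document still fit in its allocated record?

    old_size and padding together stand for the space reserved for the
    document when it was first written (all sizes in bytes, hypothetical).
    """
    allocated = old_size + padding
    return new_size <= allocated


# A 100-byte document with 10 bytes of padding.
# Growing a string field by 2 bytes (10 chars -> 12 chars) still fits.
print(fits_in_place(old_size=100, padding=10, new_size=102))

# Growing past the padding means the document would have to be relocated.
print(fits_in_place(old_size=100, padding=10, new_size=111))
```

In the question's example, the 2 extra characters would be absorbed by the padding as long as earlier updates had not already used it up.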

In MMAPv1, indexes map the indexed field value to the on-disk (in-file) location of the document, so every time a document moves, every index that has an entry for that document must also be updated. This means it takes at least 2 writes to update a document when it is relocated. MongoDB permits up to 64 indexes per collection; with that many indexes, relocating the document would require 65 writes.
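The write count above is simple arithmetic: one write for the moved document plus one per index entry that points at it. A minimal sketch, with the hypothetical helper name `relocation_writes`:

```python
MAX_INDEXES_PER_COLLECTION = 64  # limit quoted in the answer


def relocation_writes(index_count: int) -> int:
    """Writes needed when a document moves under the model described above:
    one for the document itself, one for each index that references it."""
    if not 0 <= index_count <= MAX_INDEXES_PER_COLLECTION:
        raise ValueError("index_count must be between 0 and 64")
    return 1 + index_count


print(relocation_writes(1))   # only the default _id index: 2 writes
print(relocation_writes(64))  # the maximum number of indexes: 65 writes
```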

However, that was then.

MongoDB purchased WiredTiger a half dozen years ago or so. The WiredTiger storage engine has been the default since MongoDB 3.2. MMAP was deprecated in MongoDB 4.0, and support for it was removed in MongoDB 4.2.

The WiredTiger storage engine stores both documents and indexes using a modified B+ tree. The storage engine assigns an internal record identifier that is used as the key, and the document is stored as the value in the B-tree.

Writing a document will require at least 2 writes, one for the leaf page and one for the internal page of the B-tree, and might be as many as 5 for a deep tree.

WiredTiger uses multi-version concurrency control, which necessitates rewriting the entire document every time it is updated. However, indexes in WT map the indexed field value to the document's internal record identifier, which never changes, so when a field is updated, only the indexes that are actually affected by the modified field must be changed.

This makes a document update require a couple of writes for the document, and a couple of writes for each modified index. WiredTiger also has an internal cache, so if you have several updates that affect the same pages of the B-tree, the overall number of writes might be reduced.
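To make the difference concrete, here is a toy in-memory model of the idea: documents are stored under a stable record id, and each index maps a field value to record ids. It is an illustration only and does not reflect WiredTiger's real data structures; `ToyCollection` and its methods are hypothetical names, and indexed values are assumed to be hashable.

```python
from typing import Any, Dict, List, Set


class ToyCollection:
    """Toy model: record id -> document, plus field value -> record ids per index."""

    def __init__(self, indexed_fields: List[str]) -> None:
        self.records: Dict[int, dict] = {}
        self.indexes: Dict[str, Dict[Any, Set[int]]] = {f: {} for f in indexed_fields}
        self._next_id = 1

    def insert(self, doc: dict) -> int:
        record_id = self._next_id
        self._next_id += 1
        self.records[record_id] = dict(doc)
        for field, index in self.indexes.items():
            if field in doc:
                index.setdefault(doc[field], set()).add(record_id)
        return record_id

    def update(self, record_id: int, changes: dict) -> List[str]:
        """Store a whole new version of the document under the same record id.

        Returns the names of the indexes that had to be modified.
        """
        old = self.records[record_id]
        new = {**old, **changes}       # the entire document is rewritten
        self.records[record_id] = new  # the record id does not change
        touched = []
        for field, index in self.indexes.items():
            if old.get(field) == new.get(field):
                continue               # this index is unaffected
            if field in old:
                bucket = index[old[field]]
                bucket.discard(record_id)
                if not bucket:
                    del index[old[field]]
            if field in new:
                index.setdefault(new[field], set()).add(record_id)
            touched.append(field)
        return touched


coll = ToyCollection(indexed_fields=["name", "city"])
rid = coll.insert({"name": "0123456789", "city": "Oslo"})

# Growing "name" from 10 to 12 chars rewrites the document,
# but only the "name" index needs to change.
print(coll.update(rid, {"name": "0123456789ab"}))  # ['name']
```

Because the "city" index points at the record id rather than at a file location, it is left alone even though the document itself was rewritten.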

In short, MongoDB no longer uses in-place updates at all, but when it did, it intentionally included empty space at the end of each document to minimize how often it would need to be relocated due to growing larger.

Upvotes: 3

AlexZeDim

Reputation: 4352

It depends on various things, such as which driver you use, the options passed with the query, and so on.

Since MongoDB is NoSQL, it doesn't enforce a schema by default. Usually, a schema is implemented through the database driver or a library built on top of it. Let's take Mongoose as an example to look at your question.

Mongoose has separate Model methods, such as findOneAndReplace and findOneAndUpdate. As far as I know, Mongo will update each field of the document instead of replacing it, and if a field does not exist, it will create it.
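The difference between the two kinds of operation can be sketched with plain dictionaries standing in for documents. No database or Mongoose call is involved; `apply_set` and `apply_replace` are hypothetical helpers that only mimic a field-level update and a full replacement of top-level fields.

```python
def apply_set(doc: dict, fields: dict) -> dict:
    """Mimic a field-level update: change or create the named fields, keep the rest."""
    return {**doc, **fields}


def apply_replace(doc: dict, replacement: dict) -> dict:
    """Mimic a replacement: the new body wins, only _id is carried over."""
    return {"_id": doc["_id"], **replacement}


original = {"_id": 1, "name": "0123456789", "city": "Oslo"}

# Field-level update: "name" grows, "nick" is created, "city" is kept.
print(apply_set(original, {"name": "0123456789ab", "nick": "leo"}))

# Replacement: fields missing from the new body ("city") are gone.
print(apply_replace(original, {"name": "0123456789ab"}))
```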

Also, Mongo has various limits on every method; for example, here are the ones for the aggregation framework. Each document in MongoDB (no matter whether it is a single document in a collection or a result of an aggregation query) can't be larger than 16 MB.

So, in some cases, your documents won't even be saved to or returned from the database.

Some limits can be set manually by a DBA; others are fixed, like this 16 MB document size.

Upvotes: 0
