Reputation: 3615
I'm in the process of porting my application from an App Engine Datastore to a MongoDB backend and have a question regarding the consistency of "document updates." I understand that the updates on one document are all atomic and isolated, but is there a way to guarantee that they're "consistent" across different replica sets?
In our application, many users can (and will) be trying to update one document at the same time by inserting a few embedded documents (objects) into it during one single update. We need to ensure these updates occur in a logically consistent manner across all replicas, i.e. when one user "puts" a few embedded documents into the parent document, no other users can put their embedded documents in the parent document until we ensure they've read and received the first user's updates.
So what I mean by consistency is that we need a way to ensure that if two users attempt to perform an update on one document at exactly the same time, MongoDB only allows one of those updates to go through, and discards the other one (or at least prevents both from occurring). We can't use a standard "sharding" solution here, because a single update consists of more than just an increment or decrement.
What's the best way of guaranteeing the consistency of one particular document?
Upvotes: 17
Views: 7181
Reputation: 46311
MongoDB does not offer master-master replication or multi-version concurrency. In other words, writes always go to the same server in a replica set. By default, even reads from secondaries are disabled, so the default behavior is that you communicate only with one server at a time. Therefore, you do not need to worry about inconsistent results in safe mode if you use atomic modifiers (like $inc, $push, etc.).
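As a rough illustration of the atomic-modifier route, here is a minimal sketch that appends several embedded documents to one parent in a single update, using $push with $each. The helper name append_embedded and the array field "things" are assumptions for illustration, and collection is assumed to be a pymongo Collection (pymongo 3 or later, where update_one is available).

```python
def append_embedded(collection, doc_id, items):
    """Append several embedded documents to one parent in one atomic update.

    `collection` is assumed to be a pymongo Collection; the array field
    name "things" is a placeholder, not something from the question.
    """
    result = collection.update_one(
        {"_id": doc_id},
        # $each makes $push append every element of the list, not the list itself
        {"$push": {"things": {"$each": list(items)}}},
    )
    # matched_count is 1 when the parent document exists
    return result.matched_count == 1
```

Because the whole $push is applied to a single document, another writer sees either none or all of the appended items. This alone does not reject a concurrent writer; it only keeps each update atomic.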
If you don't want to restrict yourself to these atomic modifiers, compare and swap as recommended by dcrosta (and the mongo docs) looks like a good idea. All this is not related to replica sets or sharding, however - it would be the same in a single-server scenario.
If you need to ensure read consistency also in case of a database/node failure, you should make sure you're writing to the majority of servers in safe mode.
The two approaches behave differently if you allow unsafe reads: the atomic update operations would still work (but may give unexpected results), while the compare-and-swap approach would fail.
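For what it's worth, the compare-and-swap idea can be sketched as a read-modify-write loop. This is only a sketch under assumptions: the helper name add_things_with_cas, the "version" and "things" fields, and the max_attempts parameter are all made up for illustration, and collection is assumed to be a pymongo Collection with acknowledged writes (what older drivers called safe mode).

```python
def add_things_with_cas(collection, doc_id, new_things, max_attempts=5):
    """Compare-and-swap: apply the update only if the version is unchanged.

    Field names "version" and "things" are assumptions for illustration.
    Returns True if the update was applied, False otherwise.
    """
    for _ in range(max_attempts):
        doc = collection.find_one({"_id": doc_id})
        if doc is None:
            return False  # no such parent document
        result = collection.update_one(
            # the filter matches only if nobody bumped the version since our read
            {"_id": doc_id, "version": doc["version"]},
            {
                "$push": {"things": {"$each": list(new_things)}},
                "$inc": {"version": 1},
            },
        )
        if result.modified_count == 1:
            return True  # our write won
        # someone else updated first: re-read and try again
    return False
```

With max_attempts=1 the losing update is simply discarded, which is closer to what the question asks for; a higher value re-reads the document so the caller has seen the other user's changes before writing.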
Upvotes: 3
Reputation: 26258
There may be other ways to accomplish this, but one approach is to version your documents, and issue updates against only the version that the user had previously read (i.e., ensure that no one else has updated the document since it was last read). Here's a brief example of this technique using pymongo:
>>> db.foo.save({'_id': 'a', 'version': 1, 'things': []}, safe=True)
'a'
>>> db.foo.update({'_id': 'a', 'version': 1}, {'$push': {'things': 'thing1'}, '$inc': {'version': 1}}, safe=True)
{'updatedExisting': True, 'connectionId': 112, 'ok': 1.0, 'err': None, 'n': 1}
Note that in the result above, key "n" is 1, indicating that the document was updated.
>>> db.foo.update({'_id': 'a', 'version': 1}, {'$push': {'things': 'thing2'}, '$inc': {'version': 1}}, safe=True)
{'updatedExisting': False, 'connectionId': 112, 'ok': 1.0, 'err': None, 'n': 0}
Here, where we tried to update against the wrong version, key "n" is 0, so nothing was changed.
>>> db.foo.update({'_id': 'a', 'version': 2}, {'$push': {'things': 'thing2'}, '$inc': {'version': 1}}, safe=True)
{'updatedExisting': True, 'connectionId': 112, 'ok': 1.0, 'err': None, 'n': 1}
>>> db.foo.find_one()
{'things': ['thing1', 'thing2'], '_id': 'a', 'version': 3}
Note that this technique relies on using safe writes; otherwise we don't get an acknowledgement indicating the number of documents updated. A variation on this would use the findAndModify command, which will either return the document, or None (in Python) if no document matching the query was found. findAndModify allows you to return either the new (i.e. after updates are applied) or old version of the document.
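A minimal sketch of the findAndModify variation, assuming a newer pymongo (3 or later) where the command is exposed as find_one_and_update. The helper name push_if_version is hypothetical, and the field names follow the session above.

```python
def push_if_version(collection, doc_id, expected_version, thing):
    """Push `thing` only if the document is still at `expected_version`.

    `collection` is assumed to be a pymongo Collection. Returns the matched
    document, or None if the version no longer matches (or the document
    does not exist).
    """
    return collection.find_one_and_update(
        {"_id": doc_id, "version": expected_version},
        {"$push": {"things": thing}, "$inc": {"version": 1}},
    )
```

By default this returns the document as it was before the update; pymongo's return_document option (ReturnDocument.AFTER) asks for the updated version instead. A None result plays the same role as "n" being 0 above.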
Upvotes: 19