in mongo can I insert a record only if it's not already there

Question

I'm using immutable objects keyed with a unique hash. My client doesn't know whether the object is in the database or not. I want to do an insert only if the object does not already exist.

I know I can query and then insert but that involves an extra round trip.

I know I can just insert and generate a duplicate key exception (in my client) but that involves awkward exception processing.

I know I can upsert but that involves database operations that are totally unnecessary.

In SQL I'd use a stored procedure server-side. But I am not clear how (if at all) server-side JavaScript would work for this. Any suggestions?

Neil Lunn · Accepted Answer

I really think that if you look at the Bulk Operations API as implemented from MongoDB 2.6 onwards, you should see that an upsert does not have to touch the database content at all when there is nothing to change.

For example the MongoDB shell methods all now actually use the Bulk Operations API "under the hood". This is immediately evident from the WriteResult object returned in the shell, but you can also check the method signature definition to reveal this.

A basic test:

db.uptest.update(
    { "data": "test" },
    { "$set": { "data": "test" } },
    { "upsert": true }
)

WriteResult({
    "nMatched" : 0,
    "nUpserted" : 1,
    "nModified" : 0,
    "_id" : ObjectId("54f97d8ee43b0048f69c5d88")
})

So that was inserted, now when you issue again:

db.uptest.update(
    { "data": "test" },
    { "$set": { "data": "test" } },
    { "upsert": true }
)

WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 0 })

That did nothing, because the data sent in the update is the same as the data that already resides there. MongoDB has the smarts to know that now, and will not actually perform other I/O once that is determined.
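As a rough sketch, the counters in the two WriteResult outputs above are enough for the client to tell the cases apart without any exception handling. The helper name `describeUpsert` is hypothetical, and it assumes the result object exposes `nMatched`, `nUpserted` and `nModified` as in the shell output:

```javascript
// Hypothetical helper: classify the outcome of an upsert from the
// counters reported in a shell WriteResult-like object.
function describeUpsert(result) {
    if (result.nUpserted > 0) {
        return "inserted";   // nothing matched, so a new document was created
    }
    if (result.nModified > 0) {
        return "modified";   // a document matched and its content changed
    }
    if (result.nMatched > 0) {
        return "unchanged";  // a document matched, but nothing was written
    }
    return "none";           // nothing matched and nothing was upserted
}

// The two results shown above, written as plain objects:
var first = describeUpsert({ "nMatched": 0, "nUpserted": 1, "nModified": 0 });
var second = describeUpsert({ "nMatched": 1, "nUpserted": 0, "nModified": 0 });
```

Here `first` is "inserted" and `second` is "unchanged", which matches the two shell runs.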

Aside from that I'm not sure what you think you want to achieve. How would you possibly "know" what was in the database without asking it? The only other way to achieve this would be to "cache" the data in some way in your application, then ask the cache first before making the statement to the database. This is probably redundant and mostly unnecessary given that the calls should be pretty quick in their own right.

There is of course always the $setOnInsert modifier for the highly paranoid or for use in that specific purpose. This will absolutely not write anything unless the operation is specifically an "insert".
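For a single call, a minimal sketch of that idea using the same `update` form as the basic test might look like the following. `insertIfAbsent` is a made-up name, `collection` is assumed to be a shell collection such as `db.uptest`, and the return value relies on the `nUpserted` counter seen in the WriteResult earlier:

```javascript
// Hypothetical wrapper: insert `fields` only when no document matches `key`.
// Assumes collection.update() returns a WriteResult-like object with an
// nUpserted counter, as the mongo shell does.
function insertIfAbsent(collection, key, fields) {
    var result = collection.update(
        key,
        { "$setOnInsert": fields },  // applied only when the upsert inserts
        { "upsert": true }
    );
    return result.nUpserted === 1;   // true only if a new document was created
}
```

Called as `insertIfAbsent(db.uptest, { "data": "test" }, { "data": "test" })`, this should return `true` the first time and `false` on every later call, with no exception to handle in the client.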

In the longer form using the API directly the call would look like this:

var bulk = db.uptest.initializeOrderedBulkOp();
bulk.find({ "data": "test"})
    .upsert()
    .updateOne({ "$setOnInsert": { "data": "test" } });
bulk.execute();

The general principle is pretty consistent across all language driver implementations.

MongoDB 2.6 has been around for a while now, and most vendors are using it as their standard deployment. MongoDB 3.0 has just been released as of writing.

So the new write protocols available should be what you are using. The driver default methods are likely to default to this over time, at least doing the same type of "server version checking" as implemented in the shell methods.

So even for single calls (and you should probably be organizing bulk calls for anything but pure "one shot" updates), it makes sense to implement your logic using the newer API. Also make use of $setOnInsert where appropriate.
