Brad.Smith

Reputation: 1111

Importing only non-duplicate objects into a MongoDB collection after their _id has been removed

I have a .json file that I created using mongoexport, and I then removed the _id field from each object. I would like to import this file into another collection using mongoimport, but I want to skip any object that duplicates something already in the collection (ignoring _id, since it no longer exists in the data being imported). Is there a way to do this?

Upvotes: 3

Views: 2460

Answers (2)

tug

Reputation: 71

You can create a unique index on all key fields of the target collection and then run a regular mongoimport. Inserts that violate the index are rejected, so the duplicates are skipped for you.

For example, the imp collection contains 2 documents:

> db.imp.find()
{ "_id" : ObjectId("559eb4d112bc601a37ba6c0e"), "a" : 1, "b" : 1, "c" : 1, "d" : "first" }
{ "_id" : ObjectId("559eb4e512bc601a37ba6c0f"), "a" : 2, "b" : 2, "c" : 2, "d" : "second" }

a, b and c are the key fields, so create a unique index on them (on newer versions, createIndex replaces ensureIndex):

> db.imp.ensureIndex({a:1,b:1,c:1},{unique:true})

The json file (imp.json) contains duplicates of the two existing records, plus two records that duplicate each other on a:3, b:3 and c:3:

{ "a" : 1, "b" : 1, "c" : 1, "d" : "one" }
{ "a" : 2, "b" : 2, "c" : 2, "d" : "two"}
{ "a" : 3, "b" : 3, "c" : 3, "d" : "third"}
{ "a" : 3, "b" : 3, "c" : 3, "d" : "three"}

Run mongoimport. On mongo 3.0 you can add --maintainInsertionOrder to insert the documents in the order in which they appear in the input source:

$ mongoimport -d imp -c imp --file imp.json

The import output shows a duplicate key error on the index for each rejected document:

connected to: 127.0.0.1
2015-07-10T01:14:40.457+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1  dup key: { : 1, : 1, : 1 }
2015-07-10T01:14:40.458+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1  dup key: { : 2, : 2, : 2 }
2015-07-10T01:14:40.458+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1  dup key: { : 3, : 3, : 3 }
2015-07-10T01:14:40.459+0700 imported 4 objects

Finally, the imp collection looks like this:

> db.imp.find()
{ "_id" : ObjectId("559eb4d112bc601a37ba6c0e"), "a" : 1, "b" : 1, "c" : 1, "d" : "first" }
{ "_id" : ObjectId("559eb4e512bc601a37ba6c0f"), "a" : 2, "b" : 2, "c" : 2, "d" : "second" }
{ "_id" : ObjectId("559eba10394aeed912d00d31"), "a" : 3, "b" : 3, "c" : 3, "d" : "third" }

Hope this helps!

Upvotes: 1

ThrowsException

Reputation: 2636

No. You would have to write some kind of script in the Mongo shell or program that would go through and manually compare the items.
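A minimal sketch of what such a comparison script might look like, in plain JavaScript so that it runs without a database. The function names (canonical, contentKey, filterNewDocuments) are hypothetical, the existing and incoming arrays stand in for the target collection and the parsed json file, and the _id values are plain strings rather than real ObjectIds. It assumes documents hold only JSON-style values and that two documents count as duplicates when every field other than the top-level _id matches, regardless of field order.

```javascript
// Serialize a value with object keys sorted, so field order does not matter.
// Assumption: values are plain JSON types (no Date or ObjectId instances).
function canonical(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonical).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + canonical(value[k]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Comparison key for a document: everything except the top-level _id.
function contentKey(doc) {
  const { _id, ...rest } = doc;
  return canonical(rest);
}

// Keep only incoming documents whose content is not already present,
// also dropping duplicates that occur within the incoming data itself.
function filterNewDocuments(existingDocs, incomingDocs) {
  const seen = new Set(existingDocs.map(contentKey));
  const fresh = [];
  for (const doc of incomingDocs) {
    const key = contentKey(doc);
    if (!seen.has(key)) {
      seen.add(key);
      fresh.push(doc);
    }
  }
  return fresh;
}

// Stand-in for documents already in the target collection.
const existing = [
  { _id: "559eb4d112bc601a37ba6c0e", a: 1, b: 1, c: 1, d: "first" },
  { _id: "559eb4e512bc601a37ba6c0f", a: 2, b: 2, c: 2, d: "second" },
];

// Stand-in for the exported file with _id removed.
const incoming = [
  { a: 1, b: 1, c: 1, d: "first" },
  { d: "second", c: 2, b: 2, a: 2 },
  { a: 3, b: 3, c: 3, d: "third" },
  { a: 3, b: 3, c: 3, d: "third" },
];

const toInsert = filterNewDocuments(existing, incoming);
console.log(toInsert);
```

Here only one a:3 document ends up in toInsert. Against a real collection, the existing documents would have to be read from the database, and the remaining ones inserted afterwards. Holding every key in memory is only practical for collections of modest size.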

Upvotes: 1
