Reputation: 1111
I have a .json file that I created using mongoexport, and then I removed the _id fields from the objects. I would like to import this file into another collection using mongoimport, but I want to skip any objects that duplicate something already in the collection (ignoring _id, since it no longer exists in the data being imported). Is there a way to do this?
Upvotes: 3
Views: 2460
Reputation: 71
You can create a unique index on all key fields of the target collection and then use regular mongoimport. Documents that violate the index are rejected with a duplicate key error, so the duplicates are skipped for you.
For example, the imp collection contains 2 documents:
> db.imp.find()
{ "_id" : ObjectId("559eb4d112bc601a37ba6c0e"), "a" : 1, "b" : 1, "c" : 1, "d" : "first" }
{ "_id" : ObjectId("559eb4e512bc601a37ba6c0f"), "a" : 2, "b" : 2, "c" : 2, "d" : "second" }
a, b and c are the key fields, so create a unique index on those fields (ensureIndex is the older name for createIndex):
> db.imp.ensureIndex({a:1,b:1,c:1},{unique:true})
The JSON file (imp.json) contains duplicates of the two existing records, plus two documents that duplicate each other on a:3, b:3 and c:3:
{ "a" : 1, "b" : 1, "c" : 1, "d" : "one" }
{ "a" : 2, "b" : 2, "c" : 2, "d" : "two"}
{ "a" : 3, "b" : 3, "c" : 3, "d" : "third"}
{ "a" : 3, "b" : 3, "c" : 3, "d" : "three"}
Run mongoimport. On MongoDB 3.0 you can add --maintainInsertionOrder to insert the documents in the order in which they appear in the input source:
$ mongoimport -d imp -c imp --file imp.json
The import result, with a duplicate key error on the index for each rejected document:
connected to: 127.0.0.1
2015-07-10T01:14:40.457+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1 dup key: { : 1, : 1, : 1 }
2015-07-10T01:14:40.458+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1 dup key: { : 2, : 2, : 2 }
2015-07-10T01:14:40.458+0700 insertDocument :: caused by :: 11000 E11000 duplicate key error index: imp.imp.$a_1_b_1_c_1 dup key: { : 3, : 3, : 3 }
2015-07-10T01:14:40.459+0700 imported 4 objects
The log still reports 4 objects, but only one new document was actually inserted. Finally, the imp collection looks like this:
> db.imp.find()
{ "_id" : ObjectId("559eb4d112bc601a37ba6c0e"), "a" : 1, "b" : 1, "c" : 1, "d" : "first" }
{ "_id" : ObjectId("559eb4e512bc601a37ba6c0f"), "a" : 2, "b" : 2, "c" : 2, "d" : "second" }
{ "_id" : ObjectId("559eba10394aeed912d00d31"), "a" : 3, "b" : 3, "c" : 3, "d" : "third" }
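To see why only "third" survives, the unique-index behaviour can be modelled in plain JavaScript. This is only a sketch, not MongoDB code: `keyOf` and `importWithUniqueKey` are hypothetical names, and it assumes the documents are processed in file order (which is what --maintainInsertionOrder is for).

```javascript
// Plain JavaScript model of a unique compound index on {a, b, c}.
// Hypothetical helper names; nothing here talks to MongoDB.
function keyOf(doc, keyFields) {
  // Build a comparable key from the indexed fields only.
  return JSON.stringify(keyFields.map((f) => doc[f]));
}

function importWithUniqueKey(existing, incoming, keyFields) {
  const seen = new Set(existing.map((d) => keyOf(d, keyFields)));
  const result = existing.slice();
  const rejected = [];
  for (const doc of incoming) {
    const key = keyOf(doc, keyFields);
    if (seen.has(key)) {
      rejected.push(doc); // plays the role of an E11000 duplicate key error
    } else {
      seen.add(key);
      result.push(doc);
    }
  }
  return { result, rejected };
}

const existing = [
  { a: 1, b: 1, c: 1, d: "first" },
  { a: 2, b: 2, c: 2, d: "second" },
];
const incoming = [
  { a: 1, b: 1, c: 1, d: "one" },
  { a: 2, b: 2, c: 2, d: "two" },
  { a: 3, b: 3, c: 3, d: "third" },
  { a: 3, b: 3, c: 3, d: "three" },
];
const outcome = importWithUniqueKey(existing, incoming, ["a", "b", "c"]);
console.log(outcome.result.map((d) => d.d));
```

Three of the four incoming documents end up in `rejected`, matching the three E11000 lines in the log. Note that this approach treats documents as duplicates when only the key fields match, so "one" and "two" are dropped even though their d values differ from the stored ones.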
Hope this helps!
Upvotes: 1
Reputation: 2636
No. You would have to write a script in the Mongo shell, or a separate program, that goes through the items and compares them manually.
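A minimal sketch of what such a comparison could look like, in plain JavaScript without any MongoDB driver. The helper names (`canonical`, `withoutId`, `filterNew`) are made up for illustration. It assumes both sides are plain parsed JSON (no Date or ObjectId objects apart from _id, which is dropped), and it treats two documents as equal regardless of field order, which is a design choice rather than how MongoDB itself compares documents.

```javascript
// Hypothetical helpers for comparing documents while ignoring _id.
function canonical(value) {
  // Serialize with object keys sorted so field order does not matter.
  if (Array.isArray(value)) {
    return "[" + value.map(canonical).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const parts = Object.keys(value)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + canonical(value[k]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

function withoutId(doc) {
  // Copy the document without its _id field.
  const { _id, ...rest } = doc;
  return rest;
}

function filterNew(existingDocs, importedDocs) {
  const seen = new Set(existingDocs.map((d) => canonical(withoutId(d))));
  const fresh = [];
  for (const doc of importedDocs) {
    const key = canonical(withoutId(doc));
    if (!seen.has(key)) {
      seen.add(key); // also skips duplicates inside the import file itself
      fresh.push(doc);
    }
  }
  return fresh;
}

const existingDocs = [
  { _id: "559eb4d112bc601a37ba6c0e", a: 1, b: 1, c: 1, d: "first" },
];
const importedDocs = [
  { a: 1, b: 1, c: 1, d: "first" }, // same as the stored document
  { d: "first", c: 1, b: 1, a: 1 }, // same fields, different order
  { a: 1, b: 1, c: 1, d: "one" },   // genuinely new
  { a: 1, b: 1, c: 1, d: "one" },   // repeated within the file
];
const fresh = filterNew(existingDocs, importedDocs);
console.log(fresh);
```

The documents left in `fresh` could then be written out one JSON document per line and passed to mongoimport as usual. Loading the whole target collection into memory like this is only reasonable for small collections.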
Upvotes: 1