Reputation: 2018
I'm trying to insert documents in bulk. I have created a unique index on my collection and want to skip duplicate documents during the bulk insert. This can be accomplished with the native MongoDB function:
db.collection.insert(
    <document or array of documents>,
    {
        ordered: <boolean>
    }
)
I want to accomplish this with mongoengine. If anybody knows how to achieve this, please answer the question. Thanks.
Upvotes: 2
Views: 1148
Reputation: 790
For now I am using raw pymongo from mongoengine as a workaround. This is the 2nd workaround that @Alexey Smirnov mentioned. For a mongoengine Document class DocClass, you access the underlying pymongo collection and run the insert like this:
from pymongo.errors import BulkWriteError

try:
    # Convert mongoengine objects to something pymongo can understand
    doc_list = [doc.to_mongo() for doc in me_doc_list]
    # ordered=False lets the server keep inserting after a failed document
    DocClass._get_collection().insert_many(doc_list, ordered=False)
except BulkWriteError as bwe:
    print("Batch inserted with some errors. Maybe some duplicates were found and skipped.")
    print(f"Count is {DocClass.objects.count()}.")
except Exception as e:
    print({'error': str(e)})
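The except block above treats every BulkWriteError as "probably duplicates". A minimal sketch of how the error details could be checked instead: pymongo exposes them as a dict on bwe.details, with a "writeErrors" list whose entries carry a "code", and 11000 is the server's duplicate-key code. The helper name split_write_errors and the sample dict are made up for illustration; the sample is hand-written, not real server output.

```python
DUPLICATE_KEY_CODE = 11000  # MongoDB server error code for a unique-index violation


def split_write_errors(details):
    """Split a BulkWriteError.details dict into duplicate-key errors and all others."""
    duplicates, others = [], []
    for err in details.get("writeErrors", []):
        if err.get("code") == DUPLICATE_KEY_CODE:
            duplicates.append(err)
        else:
            others.append(err)
    return duplicates, others


# Hand-written sample shaped like BulkWriteError.details (assumption, not real output)
sample_details = {
    "nInserted": 1,
    "writeErrors": [
        {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error"},
        {"index": 2, "code": 121, "errmsg": "Document failed validation"},
    ],
}

duplicates, others = split_write_errors(sample_details)
print(f"{len(duplicates)} duplicate(s) skipped, {len(others)} other error(s)")
```

Inside the except block you would call split_write_errors(bwe.details) and re-raise if the second list is not empty, so that only duplicates are silently skipped.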
Upvotes: 1
Reputation: 2613
If you have a class like this:
class Foo(db.Document):
    bar = db.StringField()
    meta = {'indexes': [{'fields': ['bar'], 'unique': True}]}
And you have a list of Foo instances, foos = [Foo(bar='a'), Foo(bar='a'), Foo(bar='a')], then calling Foo.objects.insert(foos) raises mongoengine.errors.NotUniqueError.
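The 1st workaround would be to drop the index from MongoDB, insert the documents including duplicates, and then re-create the index with {unique: true, dropDups: true}. Note that the dropDups option was removed in MongoDB 3.0, so this only works on older servers.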
The 2nd workaround would be to use the underlying pymongo API for bulk operations: https://docs.mongodb.com/manual/reference/method/db.collection.initializeOrderedBulkOp/#db.collection.initializeOrderedBulkOp
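Whichever workaround you pick, duplicates inside the batch itself (like the three Foo instances with the same bar) can also be dropped in Python before anything is sent to the server. A rough sketch, assuming the documents are plain dicts, the unique index covers a single field, and its values are hashable; the helper name drop_batch_duplicates is made up for illustration. It does not catch documents that already exist in the collection.

```python
def drop_batch_duplicates(docs, key):
    """Return docs without later repeats of the same docs[i][key], keeping order."""
    seen = set()
    unique_docs = []
    for doc in docs:
        value = doc.get(key)
        if value in seen:
            continue  # an earlier document in this batch already uses this value
        seen.add(value)
        unique_docs.append(doc)
    return unique_docs


batch = [{"bar": "a"}, {"bar": "a"}, {"bar": "b"}, {"bar": "a"}]
unique_batch = drop_batch_duplicates(batch, "bar")
print(unique_batch)
```

This keeps the first occurrence of each value, which matches what an unordered bulk insert against a unique index would end up storing.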
Upvotes: 1