Reputation: 1325
I have a database with millions of Order
documents. I batch insert them with the following method:
Order.collection.insert([
{:_id=>BSON::ObjectId('5471944843687229cdfb0000'), :status=>"open", :name=> "Benny"},
{:_id=>BSON::ObjectId('5471944843687229cdfc0000'), :status=>"open", :name=> "Allan"}
])
I regularly need to update the status
attribute on the orders. It would be way too inefficient to update them individually with the update_attribute
method.
How do I bulk update multiple MongoDB documents?
The desired solution can best be described with the below "fictional" code:
# IMPORTANT: The exemplified upsert method does not exist
Order.collection.upsert([
{:_id=>BSON::ObjectId('5471944843687229cdfb0000'), :status=>"closed"},
{:_id=>BSON::ObjectId('5471944843687229cdfc0000'), :status=>"some_other_status"}
])
FYI, there might be a similar question/answer in this SO post, but in all honesty I don't follow the answer.
Upvotes: 4
Views: 7294
Reputation: 13949
The best answer in the referenced question can be simplified to:
id_status = [['5471944843687229cdfb0000', 'closed'], ...]
bulk_order = id_status.map do |id, status| # the block arguments destructure each [id, status] pair
  { update_one:
    {
      # the stored _ids are ObjectIds, so convert each hex string before filtering
      filter: { _id: BSON::ObjectId.from_string(id) },
      update: { '$set' => {
        status: status,
      } }
    }
  }
end
YourCollection.collection.bulk_write(bulk_order)
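The shape of the operations array can be checked without a database connection. A minimal sketch (the ids are the example strings from the question; in a real app each hex string would be converted to a `BSON::ObjectId` before filtering, as noted above):

```ruby
# Build the update_one operations list from [id, status] pairs and
# inspect its structure. No MongoDB connection is needed for this.
id_status = [
  ['5471944843687229cdfb0000', 'closed'],
  ['5471944843687229cdfc0000', 'some_other_status']
]

bulk_order = id_status.map do |id, status|
  {
    update_one: {
      filter: { _id: id }, # real app: BSON::ObjectId.from_string(id)
      update: { '$set' => { status: status } }
    }
  }
end

puts bulk_order.length                          # 2
puts bulk_order[0][:update_one][:filter][:_id]  # 5471944843687229cdfb0000
```

Passing this array to `collection.bulk_write` sends all updates in a single batch rather than one round-trip per document.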
Upvotes: 6
Reputation: 3042
The real problem here is the updates: an update is slow because the server has to read, modify, and rewrite each document.
I was stuck on the same problem for many days and found no solution on Stack Overflow or anywhere else, so I wrote my own. You may find it isn't very "clean", but it gives excellent time results.
The idea is to destroy each document and create it again. Destroys are very fast, and creating the new documents with a bulk "collection.insert" is super fast.
def get_orders(*conditions)
  Order.where(*conditions).asc(:id) # the actual query conditions are elided
end

namespace :my_collection_repairer do
  desc "Rebuild orders in bulk instead of updating them in place"
  task update: :environment do
    all_orders = get_orders # some conditions
    while all_orders.count > 0
      num_docs = all_orders.count
      group_size = 10000
      num_groups = (Float(num_docs) / group_size).ceil
      puts "#{num_docs} documents found. #{num_groups} groups calculated."
      1.upto(num_groups) do |group|
        updated_order_list = []
        order_group = all_orders.page(group).per(group_size)
        puts "group #{group}"
        order_group.each do |order|
          updated_order = update_order(order) # this represents your custom update method
          updated_order_list << updated_order.as_document
          order.destroy
        end
        Order.collection.insert(updated_order_list)
        puts "Group #{group} updated."
      end
      all_orders = get_orders # some conditions
    end
  end
end
Upvotes: 1
Reputation: 2295
Bulk.find.update() adds a multi update operation to a bulk operations list: it updates specific fields in all existing documents that match the Bulk.find() condition. (To update only a single document, see Bulk.find.updateOne().) In the mongo shell:
var bulk = db.collection.initializeUnorderedBulkOp();
bulk.find( { status: "closed" } ).update( { $set: { status: "some_other_status" } } );
bulk.execute();
Bulk.find.upsert() sets the upsert option to true for the update or replacement operation that follows it:
var bulk = db.collection.initializeUnorderedBulkOp();
bulk.find( { status: "closed" } ).upsert().update(
{
$set: { status: "some_other_status"}
}
);
bulk.execute();
Note:
With Bulk.find.upsert(), if no documents match the Bulk.find() query condition, the update operation inserts only a single document. Hope this helps.
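From Mongoid/Ruby, a comparable upsert can be expressed with `bulk_write` by setting `upsert: true` on an `update_many` operation. A sketch of the operation document, assuming an `Order` model; building and inspecting it needs no database, so the actual `bulk_write` call is left commented out:

```ruby
# Ruby-driver analogue of the shell's bulk.find(...).upsert().update(...):
# an update_many operation carrying upsert: true.
ops = [
  {
    update_many: {
      filter: { status: 'closed' },
      update: { '$set' => { status: 'some_other_status' } },
      upsert: true
    }
  }
]

# With a live connection you would then run:
# Order.collection.bulk_write(ops)
puts ops.first[:update_many][:upsert]  # true
```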
Upvotes: -1
Reputation: 2081
Firstly, you'll need to filter Orders for only those matching the ids in orders_to_update.
You filter them with the any_in Criteria method, then update all of them in bulk with update_all.
Like so:
orders_to_update = [BSON::ObjectId('5471944843687229cdfb0000'), BSON::ObjectId('5471944843687229cdfc0000')]
Order.any_in(id: orders_to_update).update_all(status: "closed")
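Under the hood this is a single multi-document update: one `$in` filter plus one `$set` modifier, sent in one server round-trip. A sketch of those two documents (plain strings stand in for the `BSON::ObjectId` values, which would require the bson gem):

```ruby
# The single filter/update pair that the bulk status change boils down to.
orders_to_update = ['5471944843687229cdfb0000', '5471944843687229cdfc0000']

filter = { _id: { '$in' => orders_to_update } }
update = { '$set' => { status: 'closed' } }

# Roughly what Mongoid sends to the server, comparable to:
# Order.collection.update_many(filter, update)
puts filter[:_id]['$in'].length  # 2
```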
Upvotes: 3