Reputation: 899
I have a collection c1 with a value like {'Race': 'blck'}
and I want to use another collection c2 with fields {'raw': 'blck', 'mapped_race': 'black'}
to update the document in c1 with a new field like {'Race_Standardized': 'black'}
. This would be accomplished by matching the value of Race in c1 to the document in c2 on the raw value.
Update would make c1 document have fields {'Race': 'blck', 'Race_Standardized': black'}
.
How do I go about doing this in an aggregation pipeline? (I'm working in PyMongo.)
Upvotes: 1
Views: 454
Reputation: 28356
It might be more efficient to query the c2 collection and iterate the documents on the client side to build a bulk write operation for c1:
updates = []
for doc in db.c2.find({}):
updates.append(pymongo.UpdateMany({'Race':doc.get('raw')},{'$set':{'Race_Standardized':doc.get('mapped_race')}}))
result = db.c1.bulk_write(updates)
Upvotes: 2
Reputation: 5679
this should do it:
db.c1.aggregate([
{
$lookup: {
from: "c2",
localField: "Race",
foreignField: "raw",
as: "Race_Standardized"
}
},
{
$set: {
Race_Standardized: {
$first: "$Race_Standardized.mapped_race"
}
}
},
{
$out: "c1"
}
])
but keep in mind that the $out stage will overwrite the c1 collection.
Upvotes: 1
Reputation: 161
Since Mongo is a nosql DB there is no join like we have in relational db. However this is overcome by the $lookup feature within the aggregate pipeline. I am yet to try this out within a pymongo framework but in mongo you will have to use a combination of $lookup , $unwind and the $out to update the field. The $lookup will be like a left out join in SQL world and that returns an array - we have to use $unwind to get the specific field and then $out to update back or write to a new collection. I found this link helpful [https://developer.mongodb.com/community/forums/t/update-a-collection-field-based-on-another-collection/4875]
Upvotes: 0