Reputation: 460
Using PyMongo, I'm inner joining MongoDB documents in one collection (refer_docs) with documents in another collection (test), and I want to find those documents in which the values from a field in each document of the join don't match, and I want to update those values. The refer_docs document should say that the test document it points to has the correct document type.
I can find those documents, but I'd like to do the update without looping through a list of the aggregation results and running updates one at a time.
Is there a Mongo/PyMongo method to use an aggregate pipeline as a query within an update, or to use $set within aggregate()? Or?
The pipeline starts as a join between collections with documents like the following examples:
import pymongo
mongo_client = MongoClient('localhost:27017')
osm_db = mongo_client.osm
refer_docs_col = osm_db["refer_docs"]
test_col = osm_db["test"]
# test collection:
test_col.insert_many(
[
{'_id': '611868136', 'doc_type': 'way'},
{'_id': '5792648632', 'doc_type': 'node'},
{'_id': '611868133', 'doc_type': 'node'},
{'_id': '1', 'doc_type': 'node'}
]
)
# refer_docs collection:
refer_docs_col.insert_many(
[
{'_id': '8483444',
'refs': [{'ref': '611868136', 'ref_type': 'way'},
{'ref': '5792648632', 'ref_type': 'node'},
{'ref': '611868133', 'ref_type': 'way'}],
'doc_type': 'relation'}
]
)
Now, here's the pipeline and one attempt at updating the collection, which is essentially looping through the results, and which doesn't work:
pipeline = [
{ "$unwind" : "$refs" },
{
"$lookup" : {
"from" : "test",
"localField" : "refs.ref",
"foreignField" : "_id",
"as" : "ref_doc"
}
},
{ "$match" : { "ref_doc" : { "$ne" : [] } } },
{ "$unwind" : "$ref_doc"},
{ "$project" : { "_id" : 1, "refs" : 1, "ref_doc.doc_type" : 1,
"cmp" : { "$cmp" : [ "$refs.ref_type",
"$ref_doc.doc_type" ] } } },
{ "$match" : { "cmp" : { "$ne" : 0 } } },
]
result = [
refer_docs_col.find_one_and_update( doc,
{ "$set" : { "refs.ref_type" : "$ref_doc.doc_type" } } ) \
for doc in refer_docs_col.aggregate(pipeline)
]
refer_docs_col.find_one( { "_id" : "8483444" } )
This doesn't work, but I want to see that the document has been updated so that the ref_type of ref "611586133" is now "node":
{'_id': '8483444',
'refs': [{'ref': '611868136', 'ref_type': 'way'},
{'ref': '5792648632', 'ref_type': 'node'},
{'ref': '611868133', 'ref_type': 'node'}],
'doc_type': 'relation'}
Upvotes: 0
Views: 292
Reputation: 460
This worked.
Here, I used "refs" : mismatch["refs"]
to find the array element in the filter, and "refs.$"
to set the found array element.
for mismatch in result: # Loop aggregation result docs
filtr = { "_id" : mismatch["_id"] ,
"refs" : mismatch["refs"]}
update = { "$set" : { "refs.$" : {
"ref" : mismatch["ref_doc"]["_id"],
"ref_type" : mismatch["ref_doc"]["doc_type"] } } }
doc = refer_docs_col.update_one(filtr, update)
refer_docs_col.find_one( { "_id" : "8483444" } )
Output:
{'_id': '8483444',
'refs': [{'ref': '611868136', 'ref_type': 'way'},
{'ref': '5792648632', 'ref_type': 'node'},
{'ref': '611868133', 'ref_type': 'node'}],
'doc_type': 'relation'}
Upvotes: 0