Reputation: 99
So I have a collection looking like this:
[
{"url":"website.com/test", "links":[ {"url": "www.something.fr/page.html","scoreDiff": 0.44} ], "justUpdated": true, "score": 0.91},
{"url":"domain.com/", "links":[], "justUpdated": true, "score": 0.81},
{"url":"www.something.fr/page.html", "links":[], "justUpdated": false, "score": 0.42}
]
The goal here is to get the third document, because its url appears in the "links" array of a document where "justUpdated" is true (the first one here).
To achieve that, I tried:
I first find all the documents where "justUpdated" is true, then in Node.js concatenate all the urls in their "links" arrays (let's call this array urlsOfInterest), and finally run a second query to find all the documents whose url is in urlsOfInterest.
The problem is that it takes some time to do the first query then process the result and do the second query.
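For reference, the logic of this two-step approach can be sketched on in-memory data. This is plain JavaScript standing in for the two queries, not driver code; the docs array and the linkedDocs name are illustrative only.

```javascript
// In-memory stand-in for the collection shown above (illustrative data only).
const docs = [
  { url: "website.com/test", links: [{ url: "www.something.fr/page.html", scoreDiff: 0.44 }], justUpdated: true, score: 0.91 },
  { url: "domain.com/", links: [], justUpdated: true, score: 0.81 },
  { url: "www.something.fr/page.html", links: [], justUpdated: false, score: 0.42 },
];

// Step 1: the equivalent of find({ justUpdated: true }),
// followed by flattening every "links" array into a de-duplicated list of urls.
const urlsOfInterest = [...new Set(
  docs
    .filter((doc) => doc.justUpdated)
    .flatMap((doc) => doc.links.map((link) => link.url))
)];

// Step 2: the equivalent of find({ url: { $in: urlsOfInterest } }).
const linkedDocs = docs.filter((doc) => urlsOfInterest.includes(doc.url));

console.log(urlsOfInterest);
console.log(linkedDocs.map((doc) => doc.url));
```

Against a real database, each step is a separate round trip, which is where the extra time goes.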
So I thought maybe I could do it all at once in an aggregate query. I use $group (with $cond to check whether justUpdated is true) to collect all the "links" arrays into one new field named urlsOfInterest. At this point it is an array of arrays of objects, so I then use $project with $reduce to flatten all these {url: "...", scoreDiff: X} objects into one big array. Finally I use $project and $map to keep only the url, as the scoreDiff value doesn't interest me here.
So I get an output looking like this:
_id:"urlsOfInterest",
urlsOfInterest: ["www.something.fr/page.html"]
Which is pretty great, but I am stuck: now I just need to get the documents where url is in this urlsOfInterest array, except I can't, because all my documents have "disappeared" after the $group stage.
Please help me find a way to perform this final query :) Or, if this isn't the right way to do it, please point me in the right direction!
PS: the real goal here is to update, for every document where justUpdated is true, each scoreDiff value in its links array. For our example, we compute abs(0.91 - 0.42) = 0.49, so we replace the scoreDiff value of 0.44 with 0.49 (0.91 being the score of the document where justUpdated is true and 0.42 the score of the document whose url is www.something.fr/page.html, which is why I need to fetch this last document). I don't believe there is a way of doing all of this at once, but if there is, please tell me!
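To make the intended calculation concrete, here is a small sketch of the update on in-memory data. It is plain JavaScript, not a MongoDB update; refreshScoreDiffs and scoreByUrl are hypothetical names, and leaving scoreDiff untouched when the linked url is not in the collection is an assumption.

```javascript
// In-memory stand-in for the collection shown above (illustrative data only).
const docs = [
  { url: "website.com/test", links: [{ url: "www.something.fr/page.html", scoreDiff: 0.44 }], justUpdated: true, score: 0.91 },
  { url: "domain.com/", links: [], justUpdated: true, score: 0.81 },
  { url: "www.something.fr/page.html", links: [], justUpdated: false, score: 0.42 },
];

// Index every document's score by its url so linked scores can be looked up.
const scoreByUrl = new Map(docs.map((doc) => [doc.url, doc.score]));

// Recompute scoreDiff for each link of a document that was just updated.
function refreshScoreDiffs(doc) {
  if (!doc.justUpdated) return doc;
  return {
    ...doc,
    links: doc.links.map((link) => {
      const linkedScore = scoreByUrl.get(link.url);
      // Assumption: keep the old scoreDiff when the linked url is unknown.
      if (linkedScore === undefined) return link;
      return { ...link, scoreDiff: Math.abs(doc.score - linkedScore) };
    }),
  };
}

const updated = docs.map(refreshScoreDiffs);

// Rounded for display, since floating-point subtraction is not exact.
console.log(updated[0].links[0].scoreDiff.toFixed(2)); // "0.49"
```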
Upvotes: 0
Views: 78
Reputation: 37048
You can use $lookup to get all matching documents in an array:
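db.collection.aggregate([
  // Keep only the documents that were just updated.
  {
    "$match": {
      "justUpdated": true
    }
  },
  // Self-join: fetch every document whose url appears in links.url.
  {
    "$lookup": {
      "from": "collection",
      "localField": "links.url",
      "foreignField": "url",
      "as": "result"
    }
  },
  // Drop the documents for which the lookup found nothing (empty array).
  {
    "$match": {
      "result": {
        "$gt": []
      }
    }
  }
])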
Then either $unwind and $replaceRoot the result array to get the documents as a cursor and do the math at the application level, or do the calculations in the same pipeline, e.g. with $reduce.
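A minimal sketch of the first option, assuming the collection is literally named "collection" as in the query above. The pipeline is written as a plain array (the pipeline variable name is only for illustration) so it can be passed to aggregate in the shell or in a Node.js driver call.

```javascript
// Pipeline that returns the linked documents themselves, one per match.
const pipeline = [
  // Keep only the documents that were just updated.
  { $match: { justUpdated: true } },
  // Self-join: fetch every document whose url appears in links.url.
  {
    $lookup: {
      from: "collection",
      localField: "links.url",
      foreignField: "url",
      as: "result",
    },
  },
  // One output document per matched element; by default $unwind drops
  // documents whose "result" array is empty, so no extra $match is needed.
  { $unwind: "$result" },
  // Promote the matched document to the top level.
  { $replaceRoot: { newRoot: "$result" } },
];
```

In the shell this would be run as db.collection.aggregate(pipeline). Note that a document linked from several updated documents comes back once per link, so the application may need to de-duplicate by url.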
The "PS: the real goal" part is not quite clear, as it is based on one particular example, but if you play a little with it in the playground, I am sure you can calculate the numbers as per your requirements.
Upvotes: 2