Reputation: 681
i am trying to make a $lookup over 2 collections where the 1st collection has single points, in the 2nd collection are polygons... now i want to query all points within the polygon and build an avg value from the 1st collection.
my actual aggregation:
'aggregation': {
'pipeline': [
# must match the field location from my first collection
{ "$match": {
"location" : "$loc"
} },
# lookup to join location from 1st collection and geometry from 2nd collection
{ "$lookup": {
"from": "polygons",
"localField": "location",
"foreignField": "geometry",
"as": "join" } },
# output grouped
{ "$group": {
"_id": "Average",
"Geometry": "$join.geometry" ,
"AvgVal": { "$avg": "$myVal" } ,
"count": {"$sum": 1} } },
]
}
i hope anybody is able to understand what i mean ;-)
edit: document 1:
{ id: "ABC", location: {type: "point", coordinates: [0,1]}, myVal: 1 },
{ id: "DEF", location: {type: "point", coordinates: [2,3]}, myVal: 2 }
{ id: "GHI", location: {type: "point", coordinates: [9,8]}, myVal: 3 }
{ id: "JKL", location: {type: "point", coordinates: [7,6]}, myVal: 4 }
{ id: "MNO", location: {type: "point", coordinates: [5,4]}, myVal: 5 }
document 2:
{ id: "Vienna Part1", geometry: {type: "polygon", coordinates: [[[0,1],[1,2],[2,3],[0,1]]] } },
{ id: "Vienna Part2", geometry: {type: "polygon", coordinates: [[[9,8],[7,6],[5,4],[9,8]]] } },
now i send a query with a polygon(map on my screen): $geoIntersect polygon.. [0,1]...[9,8]
expected:
{ id: "Vienna Part1", AvgVal: 1.5, geometry: {type: "polygon", coordinates: [[[0,1],[1,2],[2,3],[0,1]]] }
{ id: "Vienna Part2", AvgVal: 4, geometry: {type: "polygon", coordinates: [[[9,8],[7,6],[5,4],[9,8]]] }
lg Harald
Upvotes: 0
Views: 1074
Reputation: 50436
You don't seem to be understanding that the concept of $lookup
is that for a field to "match" on lookup it must have the same data. Since the data contained in one document is "double nested" in an array in "Polygon" notation, it is necessary to $unwind
the array content "twice" in order to get the "match".
Setting this up as a whole example, let's create one collection with your "point" documents:
db.geo1.insertMany([
{ _id: "ABC", location: {type: "point", coordinates: [0,1]}, myVal: 1 },
{ _id: "DEF", location: {type: "point", coordinates: [2,3]}, myVal: 2 },
{ _id: "GHI", location: {type: "point", coordinates: [9,8]}, myVal: 3 },
{ _id: "JKL", location: {type: "point", coordinates: [7,6]}, myVal: 4 },
{ _id: "MNO", location: {type: "point", coordinates: [5,4]}, myVal: 5 }
])
And another including the "polygon" documents, deliberately including one that will not match:
db.geo2.insertMany([
{ _id: "Vienna Part1", geometry: {type: "polygon", coordinates: [[[0,1],[1,2],[2,3],[0,1]]] } },
{ _id: "Vienna Part2", geometry: {type: "polygon", coordinates: [[[9,8],[7,6],[5,4],[9,8]]] } },
{ _id: "Vienna Part3", geometry: {type: "polygon", coordinates: [[[10,1],[10,3],[3,10],[10,1]]] } }
])
In order to inspect the geo2
collection for items in geo1
that "intersect" via the same point coordinates then you must do:
db.geo2.aggregate([
{ "$unwind": "$geometry.coordinates" },
{ "$unwind": "$geometry.coordinates" },
{ "$lookup": {
"from": "geo1",
"localField": "geometry.coordinates",
"foreignField": "location.coordinates",
"as": "geo1"
}},
{ "$group": {
"_id": "$_id",
"coordinates": {
"$push": "$geometry.coordinates"
},
"matches": {
"$push": { "$ne": [ "$geo1", [] ] }
}
}},
{ "$redact": {
"$cond": {
"if": { "$anyElementTrue": "$matches" },
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$project" : {
"geometry": {
"type": { "$literal": "polygon" },
"coordinates": ["$coordinates"]
}
}},
{ "$sort": { "_id": 1 } }
])
So the first thing is the double $unwind
to just get the "point" data out of the arrays. Then you can do the $lookup
.
{ "_id" : "Vienna Part1", "geometry" : { "type" : "polygon", "coordinates" : [ 0, 1 ] }, "geo1" : [ { "_id" : "ABC", "location" : { "type" : "point", "coordinates" : [ 0, 1 ] }, "myVal" : 1 } ] }
{ "_id" : "Vienna Part1", "geometry" : { "type" : "polygon", "coordinates" : [ 1, 2 ] }, "geo1" : [ ] }
{ "_id" : "Vienna Part1", "geometry" : { "type" : "polygon", "coordinates" : [ 2, 3 ] }, "geo1" : [ { "_id" : "DEF", "location" : { "type" : "point", "coordinates" : [ 2, 3 ] }, "myVal" : 2 } ] }
{ "_id" : "Vienna Part1", "geometry" : { "type" : "polygon", "coordinates" : [ 0, 1 ] }, "geo1" : [ { "_id" : "ABC", "location" : { "type" : "point", "coordinates" : [ 0, 1 ] }, "myVal" : 1 } ] }
{ "_id" : "Vienna Part2", "geometry" : { "type" : "polygon", "coordinates" : [ 9, 8 ] }, "geo1" : [ { "_id" : "GHI", "location" : { "type" : "point", "coordinates" : [ 9, 8 ] }, "myVal" : 3 } ] }
{ "_id" : "Vienna Part2", "geometry" : { "type" : "polygon", "coordinates" : [ 7, 6 ] }, "geo1" : [ { "_id" : "JKL", "location" : { "type" : "point", "coordinates" : [ 7, 6 ] }, "myVal" : 4 } ] }
{ "_id" : "Vienna Part2", "geometry" : { "type" : "polygon", "coordinates" : [ 5, 4 ] }, "geo1" : [ { "_id" : "MNO", "location" : { "type" : "point", "coordinates" : [ 5, 4 ] }, "myVal" : 5 } ] }
{ "_id" : "Vienna Part2", "geometry" : { "type" : "polygon", "coordinates" : [ 9, 8 ] }, "geo1" : [ { "_id" : "GHI", "location" : { "type" : "point", "coordinates" : [ 9, 8 ] }, "myVal" : 3 } ] }
{ "_id" : "Vienna Part3", "geometry" : { "type" : "polygon", "coordinates" : [ 10, 1 ] }, "geo1" : [ ] }
{ "_id" : "Vienna Part3", "geometry" : { "type" : "polygon", "coordinates" : [ 10, 3 ] }, "geo1" : [ ] }
{ "_id" : "Vienna Part3", "geometry" : { "type" : "polygon", "coordinates" : [ 3, 10 ] }, "geo1" : [ ] }
{ "_id" : "Vienna Part3", "geometry" : { "type" : "polygon", "coordinates" : [ 10, 1 ] }, "geo1" : [ ] }
The resulting data is showing the "matched" element(s) from the geo1
collection within the new array element "geo1"
in the documents. This is a kind of "left join", so there is either an array with any match(es) or an empty array.
Your next point is we want to $group
back to something more like the original document form, and here we test to see if the result of $lookup
produced an empty array or not. This means that the corresponding document was "not found" in the other collection:
{
"_id" : "Vienna Part3",
"coordinates" : [ [ 10, 1 ], [ 10, 3 ], [ 3, 10 ], [ 10, 1 ] ],
"matches" : [ false, false, false, false ]
}
{
"_id" : "Vienna Part2",
"coordinates" : [ [ 9, 8 ], [ 7, 6 ], [ 5, 4 ], [ 9, 8 ] ],
"matches" : [ true, true, true, true ]
}
{
"_id" : "Vienna Part1",
"coordinates" : [ [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 0, 1 ] ],
"matches" : [ true, false, true, true ]
}
What you should see here is that one of the documents has a resulting "matches"
array where every element is false
. This is important for the following $redact
condition, which does a test on that array with $anyElementTrue
.
So where the array is seen in the document where all elements are false
, this means that there was "no intersection" for this particular set of points. As such, that document will be discarded via $$PRUNE
where the other documents containing "at least one" true
would be kept.
The only thing remaining is a little cosmetic tranformation to give you back the desired result:
{
"_id" : "Vienna Part1",
"geometry": {
"type": "polygon",
"coordinates" : [[ [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 0, 1 ] ]]
}
}
{
"_id" : "Vienna Part2",
"geometry": {
"type": "polygon",
"coordinates" : [[ [ 9, 8 ], [ 7, 6 ], [ 5, 4 ], [ 9, 8 ] ]]
}
}
Of course, different geometry types complicates the process a little more, but that is still the basic process.
In order to $lookup
you must transform the data into a form that would match the target collection field being inspected.
When looking at results in a "set", you can test each element for a logical result and then evaluate with $anyElementTrue
. This will tell you if something matched without destroying the individual elements.
So as well as the first condition, the second condition makes sure that your "Vienna Part1"
document which contains points that do not match "all" of the documents in the other collection is also a match because "at least one" element had a true
evaluation for a match.
Those are the rules. So I hope you understand it better now.
Upvotes: 2