Reputation: 167
We need to calculate the minimum bounding rectangle (MBR) on Geospatial data.
In oracle, we have SDO_AGGR_MBR
function, is there any similar function in MongoDB.
"coord" : {
"type" : "Polygon",
"coordinates" : [
[
[
25.5377574375611,
42.8545750237221
],
[
47.7803203666229,
42.8545750237221
],
[
47.7803203661319,
52.0987759993153
],
[
25.5377574370701,
52.0987759993153
],
[
25.5377574375611,
42.8545750237221
]
]
]
}
We have the geometry data like above but the coordinates length may vary. So, we need to find the minx
, miny
, maxx
and maxy
from these data.
Upvotes: 2
Views: 329
Reputation: 14698
I don't think there is a built-in function for it. But you can simply do the following;
db.collection.aggregate([
{
$unwind: "$coordinates"
},
{
$unwind: "$coordinates"
},
{
$group: {
_id: null,
minX: {
$min: { $arrayElemAt: [ "$coordinates", 0 ] }
},
maxX: {
$max: { $arrayElemAt: [ "$coordinates", 0 ] }
},
minY: {
$min: { $arrayElemAt: [ "$coordinates", 1 ] }
},
maxY: {
$max: { $arrayElemAt: [ "$coordinates", 1 ] }
}
}
}
])
Where first unwraps the coordinates
array with $unwind
(twice for extra [ ]
block), so that aggregate pipeline
can iterate on it. Then we just use a $group
with _id: null
which is a special operation to evaluate min/max values for all elements of the array.
which will get you a response you request;
[
{
"_id": null,
"maxX": 47.7803203666229,
"maxY": 52.0987759993153,
"minX": 25.5377574370701,
"minY": 42.8545750237221
}
]
check on mongoplayground
Upvotes: 1
Reputation: 151142
The most efficient way is to employ the $map
and $reduce
operators along with $let
. This allows you to process each document by manipulating the array's inline and then use $min
and $max
operators to obtain the bounding values:
db.collection.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$let": {
"vars": {
"m": {
"$map": {
"input": {
"$reduce": {
"input": "$coord.coordinates",
"initialValue": [],
"in": {
"$concatArrays": [ "$$value", "$$this"]
}
}
},
"in": {
"x": { "$arrayElemAt": [ "$$this", 0 ] },
"y": { "$arrayElemAt": [ "$$this", 1 ] }
}
}
}
},
"in": {
"_id": "$_id",
"coord": "$coord",
"minX": { "$min": "$$m.x" },
"minY": { "$min": "$$m.y" },
"maxX": { "$max": "$$m.x" },
"maxY": { "$max": "$$m.y" }
}
}
}
}}
])
And the output, would be like:
{
"_id" : ObjectId("5d9330e95994eb7018f59218"),
"coord" : {
"type" : "Polygon",
"coordinates" : [
[
[
25.5377574375611,
42.8545750237221
],
[
47.7803203666229,
42.8545750237221
],
[
47.7803203661319,
52.0987759993153
],
[
25.5377574370701,
52.0987759993153
],
[
25.5377574375611,
42.8545750237221
]
]
]
},
"minX" : 25.5377574370701,
"minY" : 42.8545750237221,
"maxX" : 47.7803203666229,
"maxY" : 52.0987759993153
}
Note the usage of the $replaceRoot
aggregation pipeline stage, as this will allow the nested expressions with $let
to essentially provide global variables to the document to produce that can be utilized in any output property.
The $reduce
here basically serves the function to flatten the array from the standard GeoJSON form into just an array of coordinate pairs, without the additional bounding array.
This then feeds input to the $map
which employs $arrayElemAt
in order to re-map each coordinate pair into an object with x
and y
keys. This makes things much more simple for the actual execution or output section of the $let
.
Note: An alternate approach to using
$arrayElemAt
against each key within the$map
might well be to use$zip
and$arrayToObject
:"in": { "$arrayToObject": { "$zip": { "inputs": [ ["x","y"], "$$this" ] } } }
It has the same principle in the overall output, but takes advantage of
$zip
producing "paired" arrays, which also happens to be valid input for$arrayToObject
to produce the final object form.
In the final part, we now basically have an array of objects with the named keys x
and y
. MongoDB allows a convenient way to remap simply the values for those named keys with notation like "$$m.x"
where the "$$m
expression related to the named variable of the $let
and is our array of objects, and the .x
part of course simply means only the values of x
. This is a basic shorthand for a $map
statement in itself, which suits this special usage case.
These arrays of values for specific properties can now be applied to the $min
and $max
operators, and this is how you get you min
and max
coordinates for a bounding rectangle.
Note that inline operators for arrays should always be preferred to $unwind
.
The $unwind
aggregation pipeline stage was an old introductory way of dealing with array elements by essentially flattening them into separate documents.
Though necessary when you actually want to group on a value as a key that comes from within an array, most operations which don't actually need that ( like this one ) can be done with more modern approaches.
The usage of $unwind
is actually a huge performance penalty, due to it's function essentially being to replicate the entire parent document content of the field containing the array into it's own new document. Particularly in large datasets this has a very negative effect on performance, due to much increases I/O and memory usage.
The main lesson being, unless it's necessary to the operation being performed ( and here it is not ) then you should not be using $unwind
in an aggregation pipeline. It might look easier to understand, but your code is actually hurting the system it's running on by including it.
Note also that if you don't actually need these results for any further aggregation processing, then it's probably a lot cleaner to code in the client as each document is processed.
For example, here's a plain JavaScript version for the shell:
db.collection.find().map(({ _id, coord }) =>
(({ coordinates }) =>
({
_id,
coord,
...(m =>
({
minX: Math.min(...m.map(({ x }) => x)),
minY: Math.min(...m.map(({ y }) => y)),
maxX: Math.max(...m.map(({ x }) => x)),
maxY: Math.max(...m.map(({ y }) => y))
})
)(
((c) => c.reduce((o,e) => [ ...o, ...e ],[]).map(([x,y]) => ({ x, y })) )(coordinates)
)
})
)(coord)
)
That has exactly the same output and is not nearly as unwieldy as the BSON operator statements required for an aggregation pipeline.
Upvotes: 1