Reputation: 3393
I'm collecting location information from different sources and storing everything in a MongoDB collection. Apart from point locations with a single lat/lng coordinate pair, I'm also storing areas.
Now, one data source gives me the location information as GeometryCollection, but with all elements being Polygons. Another data source gives me the location as MultiPolygon. While I'm actually considering having a collection for each data source, I'm wondering which approach is better on the whole.
GeometryCollection is certainly more flexible, but maybe MultiPolygon shows better query performance (given that I always create a 2dsphere index over the location field). Is it worth converting one representation into the other?
Upvotes: 3
Views: 3785
Reputation: 7578
Good news: query performance and indexability are the same in MongoDB for all supported GeoJSON types.
The main driver in your decision should be whether your information architecture for the geo field, and the software that consumes it, needs to contain more types than just polygons. You say you're storing point locations? If you want to hold all geo data in a single field, e.g. location (likely with a 2dsphere index on it), then you will need a GeometryCollection into which you can put the Point and the MultiPolygon. The GeoJSON spec (https://www.rfc-editor.org/rfc/rfc7946#page-9) recommends not nesting GeometryCollection objects, so for those data sources giving you a GeometryCollection, you would iterate over the contents and populate your own GeometryCollection, which also holds your Points etc.
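The "iterate the contents" step could look roughly like the following. This is only a sketch on plain Python dicts, with no database involved; the helper names flatten_geometries and build_location and the sample coordinates are made up for illustration.

```python
from typing import Any, Dict, Iterable, List

Geometry = Dict[str, Any]


def flatten_geometries(geometries: Iterable[Geometry]) -> List[Geometry]:
    """Return the geometries with every nested GeometryCollection expanded."""
    flat: List[Geometry] = []
    for geom in geometries:
        if geom.get("type") == "GeometryCollection":
            # Recurse so collections nested more than one level deep are expanded too.
            flat.extend(flatten_geometries(geom.get("geometries", [])))
        else:
            flat.append(geom)
    return flat


def build_location(*geometries: Geometry) -> Geometry:
    """Build one flat GeometryCollection from points, polygons and source collections."""
    return {"type": "GeometryCollection", "geometries": flatten_geometries(geometries)}


# Hypothetical sample data: a point plus a source that delivers a GeometryCollection.
point = {"type": "Point", "coordinates": [8.54, 47.37]}
source_areas = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        {"type": "Polygon", "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]},
    ],
}
location = build_location(point, source_areas)
```

The resulting location value is a single, non-nested GeometryCollection that could be stored in the one indexed field.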
If you are storing points separately, e.g. eventCenter as separate from eventAreasEffected, then eventCenter can be just a Point and eventAreasEffected can be a single MultiPolygon; no need for GeometryCollection. It is perfectly fine to have geo data in more than one field, and to have or not have a 2dsphere index on each of these fields. Starting in MongoDB 4.0, you can use $geoNear on a collection that has more than one 2dsphere index by including the key option.
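As a rough illustration of the key option, the pipeline below is built as a plain Python structure. The function name geo_near_pipeline, the output field distanceMeters, and the sample coordinates are assumptions for the example; only the $geoNear stage and its near, key, distanceField, maxDistance and spherical options come from MongoDB itself.

```python
from typing import Any, Dict, List


def geo_near_pipeline(key: str, lng: float, lat: float,
                      max_distance_m: float) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline whose first stage is $geoNear on one indexed field."""
    return [
        {
            "$geoNear": {
                # GeoJSON positions are [longitude, latitude], in that order.
                "near": {"type": "Point", "coordinates": [lng, lat]},
                # Names the indexed field to use when several geo indexes exist.
                "key": key,
                # Hypothetical name of the field that receives the computed distance.
                "distanceField": "distanceMeters",
                "maxDistance": max_distance_m,
                "spherical": True,
            }
        }
    ]


pipeline = geo_near_pipeline("eventCenter", -73.99, 40.73, 5000)
# With PyMongo this would be passed to something like: db.events.aggregate(pipeline)
# (db.events is a hypothetical collection name.)
```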
Here's an unofficial but reasonable definitional approach: a MultiPolygon is not an arbitrary collection of Polygons but rather a single "shape concept" that happens to consist of disjoint polygons. The United States can be described in a single MultiPolygon that has Alaska, Hawaii, the continental US, maybe Puerto Rico, etc. To this end, you'll note that it is a little trickier to store data relevant to each member of the MultiPolygon, because coordinates can only be an array of arrays of points. Information about the third polygon, for example, has to be carried in a peer field to the single top-level coordinates field. But a discrete array of Polygons or a GeometryCollection of Polygons can store extra information in each shape. Note that neither GeoJSON nor MongoDB restricts you from adding fields in addition to type and coordinates for each shape.
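If you do decide to convert between the two representations, the structural difference is small. Here is a minimal sketch on plain dicts; the function names, the per-shape name field and the toy coordinates are invented for the example. Note that any extra per-shape fields are lost when going back to a MultiPolygon, which is exactly the limitation described above.

```python
from typing import Any, Dict, List, Optional

Geometry = Dict[str, Any]


def multipolygon_to_collection(multi: Geometry,
                               extras: Optional[List[Dict[str, Any]]] = None) -> Geometry:
    """Split a MultiPolygon into a GeometryCollection of Polygons.

    extras[i], if given, is merged into the i-th Polygon as additional fields.
    """
    if multi.get("type") != "MultiPolygon":
        raise ValueError("expected a MultiPolygon")
    polygons = []
    for i, rings in enumerate(multi["coordinates"]):
        extra = extras[i] if extras is not None and i < len(extras) else {}
        # type and coordinates are written last so extras cannot overwrite them.
        polygons.append({**extra, "type": "Polygon", "coordinates": rings})
    return {"type": "GeometryCollection", "geometries": polygons}


def collection_to_multipolygon(collection: Geometry) -> Geometry:
    """Merge a GeometryCollection that contains only Polygons into one MultiPolygon."""
    if collection.get("type") != "GeometryCollection":
        raise ValueError("expected a GeometryCollection")
    coordinates = []
    for geom in collection["geometries"]:
        if geom.get("type") != "Polygon":
            raise ValueError("only Polygon members can be merged into a MultiPolygon")
        coordinates.append(geom["coordinates"])
    return {"type": "MultiPolygon", "coordinates": coordinates}


# Toy data: two disjoint triangles standing in for a mainland and an island.
country = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[0, 0], [1, 0], [1, 1], [0, 0]]],
        [[[5, 5], [6, 5], [6, 6], [5, 5]]],
    ],
}
as_collection = multipolygon_to_collection(country, [{"name": "mainland"}, {"name": "island"}])
round_trip = collection_to_multipolygon(as_collection)
```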
A more subtle issue is the design and semantics of a GeometryCollection of Polygons vs. a MultiPolygon. To complicate matters further, there is the issue of explicit holes defined in the Polygon vs. a collection of implicitly "layered" Polygons that are post-processed outside of the DB by geo software.
Upvotes: 3
Reputation: 1058
The problem with this subject is that there isn't a single good answer; it all depends on what you prefer or need. Here is a great answer written on Stack Exchange:
Polygon vs MultiPolygon https://gis.stackexchange.com/questions/225368/understanding-difference-between-polygon-and-multipolygon-for-shapefiles-in-qgis
I don't know about GeometryCollection, so I can't tell you anything about that, but this link will give you a lot of information.
Upvotes: 0