Reputation: 4010
From this question, I'd like to decide whether I should use GeoDjango, or roll my own with Python to filter Points within a certain radius of another Point.
There are two excellent answers that take different approaches to the question of how to perform such a calculation here: Django sort by distance
One of them uses GeoDjango to perform the distance calculation in PostGIS. I'm guessing that the compute would be done on the RDS instance?
The other uses a custom manager to implement the Great Circle distance formula. The compute would obviously be done on the EC2 instance.
I would imagine that the PostGIS implementation is more efficient because it's likely that people much smarter than I have optimized it. To what extent have they optimized it? Is there anything special about their implementation?
Assuming I am correct that GeoDjango performs the distance computation in PostGIS on the RDS instance, I would imagine that RDS is not suited to heavy compute tasks and may end up being slower or more expensive. Are my assumptions correct?
What if I don't need a precise distance, and an octagon or even a square would suffice? In the case of a square, it would simply be a matter of filtering Points with latitude and longitude within a certain range. Is GeoDjango/PostGIS able to perform estimates like this?
If I do need a precise distance, I could calculate the furthest bounds that can be reached with the given radius, and only perform precise distance calculations on Points within those bounds. Does GeoDjango/PostGIS do this?
Upvotes: 0
Views: 203
Reputation: 3484
I'll try to address your questions:
One of them uses GeoDjango to perform the distance calculation in PostGIS. I'm guessing that the compute would be done on the RDS instance?
If you load two Django model instances into memory and do the calculation in Django, such as
model_a = Foo.objects.get(id=1)
model_b = Bar.objects.get(id=1)
distance = model_a.geometry.distance(model_b.geometry)
This will be done in the Python process, using GEOS.
There are also distance lookups in Django, such as (with D imported from django.contrib.gis.measure)
foos = Foo.objects.filter(geometry__distance_lte=(Point(0, 0, srid=4326), D(km=1)))
This calculation will be done by the database backend (PostGIS in your case).
The other uses a custom manager to implement the Great Circle distance formula. The compute would obviously be done on the EC2 instance.
I would imagine that the PostGIS implementation is more efficient because it's likely that people much smarter than I have optimized it. To what extent have they optimized it? Is there anything special about their implementation?
Django has ways to use great-circle distance (GCD) in queries. On PostGIS this requires geography columns, so if you have a geometry field you need to convert it to a geography field. Only EPSG:4326 is supported for now. If that's all you need, I bet the PostGIS implementation is good enough for almost all applications (if not all).
Assuming I am correct that GeoDjango performs the distance computation in PostGIS on the RDS instance, I would imagine that RDS is not suited to heavy compute tasks and may end up being slower or more expensive. Are my assumptions correct?
I don't know much about Amazon products, but without an estimate of size (number of rows, types of calculations such as cross-products, etc.), it's hard to help further.
What if I don't need a precise distance, and an octagon or even a square would suffice? In the case of a square, it would simply be a matter of filtering Points with latitude and longitude within a certain range. Is GeoDjango/PostGIS able to perform estimates like this?
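For comparison, here is a minimal sketch of the great-circle calculation that the custom-manager approach would run in Python on the application server. It uses the haversine formula on a spherical Earth; the function names and the mean-radius constant are my own choices for illustration, not code from either linked answer.

```python
import math

# Mean Earth radius in km. Treating the Earth as a sphere is an assumption;
# it ignores ellipsoidal flattening, so results are approximate.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def within_radius(points, center, radius_km):
    """Keep the (lat, lon) tuples that lie within radius_km of center."""
    c_lat, c_lon = center
    return [p for p in points
            if haversine_km(c_lat, c_lon, p[0], p[1]) <= radius_km]
```

Note that this loops over every point in Python, so each row has to be fetched from the database first, whereas a distance lookup lets the database discard rows before they are sent over the wire.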
What kind of data do you have? There are several components in calculating distances and areas, mainly the spatial reference that you use (datum, ellipsoid, projection).
If you need accurate distance measurements between two distant sides of the globe, the geography type is more precise and will yield good results. If you do that kind of measurement in a Cartesian plane, you will get bad results.
If your data is local, covering only a few sq km, consider using a more local spatial reference. WGS84 (EPSG:4326) is more suitable for global data. Local spatial references can give you precise results, but only over much smaller extents.
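As for the square estimate from the question, a rough sketch of how the latitude and longitude ranges could be derived from a radius is shown below. It assumes a spherical Earth and a search area that crosses neither a pole nor the antimeridian; the function names are hypothetical, and this is not a description of what PostGIS does internally.

```python
import math

# Mean Earth radius in km (spherical approximation, an assumption).
EARTH_RADIUS_KM = 6371.0088


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the circle.

    Assumes the circle does not reach a pole or cross the antimeridian.
    """
    angular = radius_km / EARTH_RADIUS_KM  # radius as an angle in radians
    dlat = math.degrees(angular)
    # A degree of longitude shrinks with latitude, so the longitude range
    # must widen away from the equator to still enclose the circle.
    dlon = math.degrees(math.asin(math.sin(angular) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def in_box(points, lat, lon, radius_km):
    """Keep the (lat, lon) tuples that fall inside the bounding box."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    return [(p_lat, p_lon) for p_lat, p_lon in points
            if min_lat <= p_lat <= max_lat and min_lon <= p_lon <= max_lon]
```

The box always contains the circle, so it also keeps some points near the corners that are farther away than the radius. If plain latitude and longitude columns exist on the model, the same four bounds could be passed to ordinary range filters in the ORM.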
If I do need a precise distance, I could calculate the furthest bounds that can be reached with the given radius, and only perform precise distance calculations on Points within those bounds. Does GeoDjango/PostGIS do this?
I think you are optimizing too early. I know your question is a bit old, but this is something you should only care about when it starts to hurt. PostGIS and Django have been grinding through a lot of data for me for a long time, in a government system that checks land registry parcels and runs tons of queries to check several parameters. It has been working for a few years without a hitch.
Upvotes: 1