Reputation: 6097
I have a list of 1 million (slowly) moving points on the globe (stored as latitude and longitude). Every now and then, each point requests a list of the 100 nearest other points (with a configurable max range, if that helps).
Unfortunately, SELECT * SORT BY compute_geodetic_distance() LIMIT 100
is too slow to be done by each point over and over again. So my question: how should I handle this efficiently? Are there better algorithms/datastructures/... known for this? Or is this the only way and should I look into distributing server load?
(Note: this is for an Android app and the points are users, so in case I'm missing an android-specific solution, feel free to say so!)
Upvotes: 1
Views: 1433
Reputation: 12592
Instead of r-tree or a quadtree, I.e spatial index you can also use a quadkey and a monster curve. This curve reduce the dimension and completetly fills the space. You can download my php class hilbert curve from phpclasses.org. You can use a simple varchar column for the quadkey and search the levels from left to right. A good explanation is from Microsoft Bing maps quadkey website.
Upvotes: 0
Reputation: 11532
You have to divide the earth into zones and then use an interior point algorithm to figure out what zones the phone is in. Each possible subset of zones will uniquely determine the 100 closest nodes to a fair approximation. You can get an exact set of 100 nodes by checking distance one by one against the candidate nodes, which (once again) are determined by the subset of zones.
Upvotes: 0
Reputation: 28737
For your task geo spatial databases have been invented.
There is Oracle Spatial (expensive) and PostGres (free).
These databases store your millions points in a geographical index, a quad tree (Oracle).
Such a query needs nearly no time.
Some people, like me prefer to leave the database away and build up the quadtree themselfs.
The operations search and insert are easy to implement. Update/delete can be more complex.(Cheapest related to implementation effort, is to build up a new quadtree evry minute)
Using a quadtree you can perform hundreds or thousansds of such nearest 100 points within a second.
Upvotes: 1
Reputation: 4591
Architecturally I would arrange for each "point" to phone home to a server with their location when it changes more than a certain amount. On the server you can do the heavy lifting of calculating the distance between the point that moved and each of the other points, and for each of the other points updating their list of the 100 closest points if required. You can then push changes to a point's closest 100 list as they happen (trivial if you are using App Engine, Android push is supported).
This reduces the amount of work involved to an absolute minimum:
There are algorithms that you can use to make this super-efficient, and the problem has a fork/join feel to it as well, allowing you to throw horsepower at the problem.
Upvotes: 0