Stride
Stride

Reputation: 351

Algorithmic solution to find closest city based on lat/lon

Note there are other similar questions but 1) I don't want to rely on an online service, 2) I'm looking for a clean algorithmic solution.

I have a database of cities and their latitudes/longitudes. I'm looking for a way that, given an arbitrary lat/lon, finds the closest city.

Solutions I can think of so far:

  1. The obvious brute force solution is, of course, to calculate all the possible distances using the great-circle distance formula. This also takes a long time and is O(n).

  2. A modification of the KD-Tree algorithm may work, but I'm at a loss as to how to modify this algorithm to work in non-cartesian coordinates, as is the case for lat/lon. We can assume Mercator projection if this helps.

  3. Use a geo-database such as PostgreSQL. This doesn't work for me, period.

Any insights?

Upvotes: 5

Views: 6119

Answers (6)

mrtipale
mrtipale

Reputation: 959

I thought to make it a huge data structure in form of Balanced binary tree with balanced nodes and map.

Let's say root/head starts at London (chosen as center in horizontal usual map of world, GMT) : right: center city in right half side world map. left: center city in left half side world map.

Now propogate in same way to include all cities to include. x,y will be part of data. While iterating through each node, we can compare ourCityX and ourCityY coordinates.

I suppose, it would easy way to find closest left with left-right nodes travalsal along the tree downwards. I havn't implemented this but seems to good solution.


                  London
                /      \
            New York     Singapore
          /        \      /      \
      Florida     Cuba  Mumbai   Melborne

and so on...

Its based on distance and NOT based on diff of latitude OR longitude. Lets see if works.

Upvotes: 0

Adrian McCarthy
Adrian McCarthy

Reputation: 47954

You could think of it as a 3D problem. Convert (lat, lon) coordinates to (x, y, z) coordinates. This only needs to be done once for your database.

For each test point, convert to (x, y, z) and compute the squares of the chord distances (for speed and simplicity) to each city. Choose the closest one. I believe the closest city in 3-space will also be the closest city in terms of the great circle distance.

If you need the great circle distance, you can compute it for just the closest city.

With (x, y, z)-space, there's probably a spatial partitioning structure you can use to limit which cities you actually have to check. I don't think there's one that will help directly in (lat, lon)-space. But O(N) really isn't that bad. Also, the problem vectorizes pretty well.

Upvotes: 4

daroczig
daroczig

Reputation: 28622

I think you cannot save on computing the distance matrix between the cities, just use the Haversine formula formula for it. Computing the matrix is to be done only once, and you can use it later every time you need it without any complicated cast.

You might compute distance with MySQL also if you have no access to PostgreSQL, e.g. see this article on Google Code for details. The part, dealing with your problem could be summarized in the following SQL query:

SELECT id, ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM cities ORDER BY distance LIMIT 1;

Upvotes: 3

David Weiser
David Weiser

Reputation: 5195

Instead of using the features latitude and longitude, what about using latitude, longitude and distance from 0,0? Could you use a KD-tree, then?

Granted, it'd take time to compute the distance of every city to the origin, but you'd only have to do that once.

Upvotes: 0

Michael Borgwardt
Michael Borgwardt

Reputation: 346260

I've had this exact problem and solved it straightforwardly using an R-Tree, i.e. treating latitude/longitude as if they were cartesian coordinates.

This works reasonably well since there is a marked lack of cities along the International Date Line and around the poles, and my application can tolerate sometimes not getting exactly the nearest city.

Still I didn't like this imprecision and asked on SO. Someone suggested a solution that I think might work, though I have not found the time to implement it:

  • There should be a way to transform the coordinates such that the varying distances between the meridians is compensated for - this leaves the discontinuities at the 180° meridian and the poles to deal with.
  • Use two indexes with different coordinate systems: one regular one, and one that is rotated such that its 180° meridian is perpendicular and opposite of the one in the main coordinate system. This way, the discontinuities in one coordinate system are perfectly regular points in the other.

Upvotes: 0

Nikita Rybak
Nikita Rybak

Reputation: 68006

About 2: Just start at the range [-90..90, -180, 180].

The only nuance is, when you split whole range above into four smaller rectangles and calculate distance from the current point to each of them, you need to consider the possibility of 'overflow'. (that points (0, -179) and (0, 179) are closer than it seems from their longitudes)

I'm also assuming you don't have cities close to south/north pole.

Upvotes: 0

Related Questions