AJ.
AJ.

Reputation: 2569

How to store distance between cities and towns in DB efficiently

I need to be able to display distance to n cities/towns from a particular location chosen by user. Its like clicking on a map and getting all destinations within 100 miles, only that it wont be a map but a link on webpage.

I need to choose a solution that would scale-up from within a state to a country to globally potentially - which means from thousand to hundred thousand locations.

I though of storing CITY1_ID, CITY2_ID & DISTANCE in a Relational DB table, but I doubt if it would scale well for a web application (million of rows).

Could this be done more efficiently using a NoSQL Database or Graph DB ? Or is RDBMS good enough for this problem with proper design?

Added: If I do not store in DB then how will I get something like: Get me all cities within 100 miles of San Jose?

Upvotes: 5

Views: 3930

Answers (7)

Geek Num 88
Geek Num 88

Reputation: 5312

Instead of calculating the distance between the 2 cities calculate a bounding box of 100 miles then you have 4 float variables to plug into your database - float comparision is a lot faster than distance calculations in the database. Downside is you get a little bit more distance in the corners.

PHP function to calculate bounding box

function getBoundingBox($lat_degrees,$lon_degrees,$distance_in_miles)
{
       $radius = 3963.1; // of earth in miles

        // bearings
        $due_north = 0;
        $due_south = 180;
        $due_east = 90;
        $due_west = 270;

        // convert latitude and longitude into radians
        $lat_r = deg2rad($lat_degrees);
        $lon_r = deg2rad($lon_degrees);

        // find the northmost, southmost, eastmost and westmost corners $distance_in_miles away
        // original formula from
        // http://www.movable-type.co.uk/scripts/latlong.html

        $northmost  = asin(sin($lat_r) * cos($distance_in_miles/$radius) + cos($lat_r) * sin ($distance_in_miles/$radius) * cos($due_north));
        $southmost  = asin(sin($lat_r) * cos($distance_in_miles/$radius) + cos($lat_r) * sin ($distance_in_miles/$radius) * cos($due_south));

        $eastmost = $lon_r + atan2(sin($due_east)*sin($distance_in_miles/$radius)*cos($lat_r),cos($distance_in_miles/$radius)-sin($lat_r)*sin($lat_r));
        $westmost = $lon_r + atan2(sin($due_west)*sin($distance_in_miles/$radius)*cos($lat_r),cos($distance_in_miles/$radius)-sin($lat_r)*sin($lat_r));

        $northmost = rad2deg($northmost);
        $southmost = rad2deg($southmost);
        $eastmost = rad2deg($eastmost);
        $westmost = rad2deg($westmost);

        //return 2 points NW corner and SE corner
        return array($northmost,$westmost,$southmost,$eastmost);
}

then your SQL is

SELECT * FROM table WHERE latitude <= $northmost AND longitude >= $westmost AND latitude >= $southmost AND longitude <= $eastmost

Upvotes: 2

Sameer
Sameer

Reputation: 4389

Here is a solution using RDBMS. Keep two tables

  • CityByLat { latitude, city_id } with clustered index on latitude and
  • CityByLng { logitude, city_id } with clustered index on longitude

When you need to find cities within a certain radius from a given latitude and longitude you can do an efficient range query on the two tables to get cities within a certain range of latitude and longitude. You can then calculate the actual distance from only the cities thus retrieved.

Upvotes: 1

nickhar
nickhar

Reputation: 20853

You could, as others have noted, store the Lat/Long coords for each entry and compute the distance using something similar to the following at runtime, which provides km/miles distance output:

function distance($lat1, $lng1, $lat2, $lng2, $miles = true)
{
        $pi80 = M_PI / 180;
        $lat1 *= $pi80;
        $lng1 *= $pi80;
        $lat2 *= $pi80;
        $lng2 *= $pi80;

        $r = 6372.797; // mean radius of Earth in km
        $dlat = $lat2 - $lat1;
        $dlng = $lng2 - $lng1;
        $a = sin($dlat / 2) * sin($dlat / 2) + cos($lat1) * cos($lat2) * sin($dlng / 2) * sin($dlng / 2);
        $c = 2 * atan2(sqrt($a), sqrt(1 - $a));
        $km = $r * $c;

        return ($miles ? ($km * 0.621371192) : $km);
}

EDIT: This is not suitable for n matches within a radius search. Given the density of towns/cities within a given radius, better to move the distance calculations into SQL as its far faster and you can match against those within x km/miles.

Upvotes: 0

JayC
JayC

Reputation: 7141

A simple solution I've used multiple times (but not with mysql) is create a user defined function some_distance_function with four parameters latitude1,longitude1,latitude2,longitude2 which returns the distance and then just test everything against that distance function and see for every item, whether or not the distance is less than or equal to a given value. If you are only going to have a few thousand locations, this is quite fine and efficient.

If you need to run this query against millions of records, you might want to see what GIS (Geography Information Systems) extensions are available for your database of choice, as there are better (at least in terms of search-ability) persistent data structures for searching though vast numbers of locations.

Edit: To give an example of how Microsoft does it, see http://technet.microsoft.com/en-us/library/bb964712(v=sql.105).aspx

It looks like MySQL supports spatial extensions in general:

http://dev.mysql.com/doc/refman/5.0/en/gis-introduction.html
http://dev.mysql.com/doc/refman/5.0/en/spatial-extensions.html

Edit II:

Looks like this question might also be helpful.

Find the distance between two points in MYSQL. (using the Point Datatype)

Upvotes: 1

stealthjong
stealthjong

Reputation: 11093

Don't store it, calculate it runtime with longitude and latitude. Extremely scalable, contrary to saving all distances between cities.

You have a reference point (San Jose) and loop through all your city records and calculating it runtime (in case of many records, have this calculation done by the client, probably with javascript or something, because if you have the server do it, it'll cost its toll all too soon). The JavaScript might look something like this:

var R = 6371; // Radius of the earth in km
var dLat = (lat2-lat1).toRad();  // Javascript functions in radians
var dLon = (lon2-lon1).toRad(); 
var a = Math.sin(dLat/2) * Math.sin(dLat/2) +
        Math.cos(lat1.toRad()) * Math.cos(lat2.toRad()) * 
        Math.sin(dLon/2) * Math.sin(dLon/2); 
var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a)); 
var d = R * c; // Distance in km

Above code comes from here

Note: It's in kilometers as I'm Dutch and thus using the metric system

Upvotes: 0

kasi
kasi

Reputation: 774

I am using Neo4J for something similar, it scales really well for any type of data that can be represented as a graph.

Upvotes: 0

Randy
Randy

Reputation: 16677

you should store city_id, latitude, longitude one for each city - then calculate the distances based on runtime input.

Upvotes: 4

Related Questions