jcvd
jcvd

Reputation: 443

Query on distance

Not sure how to build this so opening up to the experts:

I have a list of customers in db that are looking for events within a radius of their location. I can store their zipcode (or lat/lng) and the max distance they will go for an event. So columns lat, lng, distance (ex: lat = '22.7447858', lng = '-82.1398589', distance = 25).

Events are posted all day and their zipcode/lat/lng is stored.

I want to run a query (once a day) that gets customers for a event. I was looking at CyberJunkies post Mysql within distance query but the problem is that I am running the query the opposite direction. I need to find customers whose 'circle distance' covers the current event, not the other way around. Not sure how to store the circle distance (are the 3 columns above good enough or is there a better way to store the data for this type of query)? Not sure how to query for customers per event.

Thanks in advance!

Upvotes: 1

Views: 558

Answers (5)

Qoopido
Qoopido

Reputation: 21

In addition to the other answers you might think about using cartesian coordinates (x, y and z) instead of lat/lng for db storage because the resulting query expressions are more simple in respect to load/time on db server than possible queries for lat/lng distance.

An example for a PHP implementation can be found under:

http://headers-already-sent.com/geodistance/

The method "getCartesian" will convert lat/lng to cartesian coordinates and the method "getDistanceByCartesian" shows how to calculate the actual distance. What you need to do is transfer this distance calculation from PHP to the SQL query (which should not be that complicated).

Edit, as I found the time to give a more practical example

Based on the class you can find under the above link I set up 2 demo-tables for my companies locations and all MC Donalds restaurants in our vicinity and converted lat/lng from Google Maps to cartesian x, y, z:

CREATE TABLE `locations` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(255) NOT NULL DEFAULT '',
  `lat` double NOT NULL,
  `lng` double NOT NULL,
  `x` double NOT NULL,
  `y` double NOT NULL,
  `z` double NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `locations` (`id`, `title`, `lat`, `lng`, `x`, `y`, `z`)
VALUES
    (1,'Ida-Ehre-Platz 10, 20095 Hamburg',53.55053,9.99949,3727600.05477,657242.251356,5124712.81705),
    (2,'Kieler Straße 191-193, 22525 Hamburg',53.57731,9.93686,3725956.4981,652753.812254,5126481.40905),
    (3,'Reeperbahn 42, 20359 Hamburg',53.549951,9.964937,3728046.74189,655003.113578,5124674.56664),
    (4,'Theodor-Heuss-Platz 3, 20354 Hamburg',53.56083,9.99038,3726797.15378,656489.722425,5125393.17725),
    (5,'Mundsburger Damm 67, 22087 Hamburg',53.57028,10.02642,3725550.98379,658686.623655,5126017.24553),
    (6,'Paul-Nevermann-Platz 1, 22765 Hamburg',53.552602,9.936678,3728135.78521,653123.397726,5124849.69505),
    (7,'Friedrich-Ebert-Damm 101, 22047 Hamburg',53.58753,10.08958,3723303.02881,662522.688778,5127156.05819),
    (8,'Amsinckstraße 73, 20097 Hamburg',53.54271,10.02654,3727978.07563,659123.791421,5124196.16112),
    (9,'Eiffestraße 440, 20537 Hamburg',53.55214,10.04638,3726919.13256,660267.521487,5124819.17553);


CREATE TABLE `user` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL DEFAULT '',
  `lat` double NOT NULL,
  `lng` double NOT NULL,
  `x` double NOT NULL,
  `y` double NOT NULL,
  `z` double NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `user` (`id`, `name`, `lat`, `lng`, `x`, `y`, `z`)
VALUES
    (1,'Ministry.BBS, Cremon 36, 20457 Hamburg',53.545943,9.988761,3728127.10678,656615.385203,5124409.77226),
    (2,'BBS, Dorotheenstraße 60, 22301 Hamburg',53.583231,10.008315,3724617.80169,657307.963226,5126872.28974);

Based on these two tables the SQL-Query to find all locations (restaurants) within a certain distance (2000, in meter in this example) to every user (our companies offices) would be:

SELECT locations.*,
    2 * 6371000.785 *
        asin(
            sqrt(
                pow(locations.x - user.x, 2)
                + pow(locations.y - user.y, 2)
                + pow(locations.z - user.z, 2)
            ) / (2 * 6371000.785)
        ) AS distance
    FROM locations, user
    HAVING distance < 2000 
    ORDER BY distance ASC

If you need something else than "meter" than you would have to change the earth radius of approx. 6371000.785 (in meter) to whatever you need and also change the desired distance of 2000 to whatever you like or is stored in your user table for each user.

Upvotes: 0

silvio
silvio

Reputation: 899

If your distance calculations are similar to ones in this solution, then you could do something like this:

select id1 from Distances 
    join EventTable on id2=EventTable.eventid 
    join UserTable on id1=UserTable.userid 
where type2=<EVENT_TYPE> and type1=<USER_TYPE> 
    and geodistance_km_by_obj(id1,<USER_TYPE>,id2,<EVENT_TYPE>) < UserTable.max_distance

Upvotes: 0

ean5533
ean5533

Reputation: 8994

I think there's two main ways of doing this: calculating distances on the fly, and pre-calculating distances once and then storing them in a lookup table.

Option 1, calculating on the fly. Tom van der Woerdt's answer does a good job explaining how you would do this. The pseudocode query is something like this:

SELECT * FROM customer, event WHERE (<calc distance>) < customer.distance

Option 2, pre-calculate all distances. You would create a table (call it distance for this example) that stores the distance between every customer and every event. It would have three columns: customerid, eventid, and miles (or whatever distance metric you want). Loop through every customer calculating the distance to every event and store each one in distance. Every time you add a new customer or event, you'd add corresponding records to the distance table. Once this structure is in place, finding events would be as simple as:

SELECT * FROM distance WHERE miles < [[some number you pick]]

So which one is better? It's a tradeoff between CPU time and disk space, so the answer depends on your resources. Option 1 (on-the-fly calculations) are going to require more work by the DBMS (more CPU time). As the amount of people and events increases, that query will take longer to run. Option 2 (pre-calculating distances) will make the lookups VERY fast, but the tradeoff is that you have to store all those pre-calculated distances on disk. You also need to be diligent about making sure your lookup table is up-to-date. Every time a customer or event is added, deleted, or has their lat/long changed then you need to update your lookup table correspondingly. Triggers can help you make this process automatic; just make sure you try testing every scenario (adding, deleting, moving) to make sure that the lookup table gets updated like it's supposed to.

Short answer: pick option 1 (calculate on the fly) if you have very small load on your database and/or you're limited on disk space. Pick option 2 if you have a heavy load but disk space is abundant. Option 2 is the more likely scenario, and it is much more scalable.

Upvotes: 2

Marc B
Marc B

Reputation: 360562

Distance from point A to point B is going to be the same as distance from point B to point A (unless you're dealing with road directions and different paths).

Basically you'd be doing (in sql pseudo-code)

SELECT distance(event_loc, user_loc) <= user_max_distance

Upvotes: 0

Tom van der Woerdt
Tom van der Woerdt

Reputation: 29965

What you want is:

SELECT * FROM customer, event WHERE (<calc distance>) < customer.distance

This will simply get all customers and all events, combine them together to get all possible combinations (100 customers and 10 events gives 1000 combinations) and then check whether they're in range. *

I'd personally recommend making a DISTANCE(customer,event) function that calculates it for you. It's easier to manage the query that way, and you can re-use it.

*Not necessarily in that order

Upvotes: 0

Related Questions