user6611764
user6611764

Reputation:

How to improve query performance that compute haversine formula?

Given a table of locations with latitudes and longitudes, which of those locations are nearest to a given location?

Of course, finding distances on the surface of the earth means using Great Circle distances, worked out with the Haversine formula, also called the Spherical Cosine Law formula.

I have the following code:

 SELECT zip, latitude, longitude, distance
 FROM (
  SELECT z.zip,
         z.latitude, z.longitude,
         p.radius,
         p.distance_unit
             * DEGREES(ACOS(COS(RADIANS(p.latpoint))
             * COS(RADIANS(z.latitude))
             * COS(RADIANS(p.longpoint - z.longitude))
             + SIN(RADIANS(p.latpoint))
             * SIN(RADIANS(z.latitude)))) AS distance
  FROM zip AS z
   JOIN (   /* these are the query parameters */
    SELECT  42.81 AS latpoint, -70.81 AS longpoint,
            50.0  AS radius,  111.045 AS distance_unit
        ) AS p ON 1=1
  WHERE z.latitude
    BETWEEN p.latpoint  - (p.radius / p.distance_unit)
        AND p.latpoint  + (p.radius / p.distance_unit)
   AND z.longitude
    BETWEEN p.longpoint - (p.radius / (p.distance_unit * COS(RADIANS(p.latpoint))))
        AND p.longpoint + (p.radius / (p.distance_unit * COS(RADIANS(p.latpoint))))
) AS d
WHERE distance <= radius

Is there any way to improve a performance of this query?

Is it necessary to use PostGIS to improve it or it's just a wrapper to my haversine formula?

Upvotes: 8

Views: 1757

Answers (3)

l.lijith
l.lijith

Reputation: 432

This query will never be especially fast. However, there are some ways it can be improved.

First: The Haversine formula is not necessary here. The corrections which it applies are only necessary when the curvature of the earth is a significant factor, or very near the poles. Neither of these are the case here -- the largest distance which needs to be calculated accurately is 12 miles, which is barely even over the horizon. On this scale, the earth is effectively flat, so the Pythagorean Theorem is good enough for calculating distances.

One degree of latitude is about 69 miles, and at 52°N (around where the Netherlands are), a degree of longitude is cos(52°) x 69 = 42.5 miles, so the formula becomes:

sqrt(pow(69*(lat - $latitude), 2) + pow(42.5*(lng - $longitude), 2))

Second: we can use a "scissor test" for latitude and longitudes. If a point is more than 12 miles in any cardinal direction from your target point, it certainly cannot be within a 12-mile circle of that point. We can use this fact to perform a fast comparison on the latitude and longitude, skipping the distance calculation entirely. Using the figures for one degree of latitude/longitude we derived above, we have:

WHERE (lat BETWEEN ($latitude - 12/69.0) AND ($latitude + 12/69.0)) AND (lng BETWEEN ($longitude - 12/42.5) AND ($longitude + 12/42.5))

Note that this doesn't replace the full distance check! It's simply a first step to quickly throw out points that can't possibly be within the right radius. With an index in place on lat or lng, this will allow the database server to avoid examining many of the rows in the database.

Upvotes: 5

Rick James
Rick James

Reputation: 142433

The expressions are not the slow part. The problem with "finding nearest" is the difficulty in using an index to limit how many of rows to look at.

If you don't already have these on z, then they will help:

INDEX(latitude),
INDEX(longitude)

If you already had them, then make sure that one of them was actually used by the subquery.

The next step will be more drastic (and more fruitful): http://mysql.rjweb.org/doc.php/latlng

Upvotes: 0

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125444

I guess the planner will work this query rewrite all out by himself but worth a try. At least it is neater.

select zip, latitude, longitude, distance
from (
    select z.zip,
           z.latitude, z.longitude,
           p.radius,
           p.distance_unit
               * p.degrees_acos_cos_radians_latpoint
               * cos(radians(z.latitude))
               * cos(radians(p.longpoint - z.longitude))
               + p.sin_radians_latpoint
               * sin(radians(z.latitude)))) as distance
    from
        zip z
        cross join (
            select
                latpoint, longpoint, radius, distance_unit,
                latpoint - radius / distance_unit as lat0,
                latpoint + radius / distance_unit as lat1,
                longpoint - radius / distance_unit * cos(radians(latpoint)) as long0,
                longpoint + radius / distance_unit * cos(radians(latpoint)) as long1,
                sin(radians(latpoint)) as sin_radians_latpoint,
                degrees(acos(cos(radians(latpoint)) as degrees_acos_cos_radians_latpoint
            from (
                values (42.81, -70.81, 50.0, 111.045)
            ) v (latpoint, longpoint, radius, distance_unit)
        ) p
    where
        z.latitude between lat0 and lat1
        and
        z.longitude between long0 and long1
) d
where distance <= radius

Upvotes: 1

Related Questions