SteveEdson
SteveEdson

Reputation: 2495

Improving performance of spatial MySQL query

I have a query that returns all records, ordered by distance from a fixed point, compared to a POINT field in my MySQL 5.7 database.

For a simple example, lets say it looks like this:

SELECT shops.*, st_distance(location, POINT(:lat, :lng)) as distanceRaw 
FROM shops 
ORDER BY distanceRaw
LIMIT 50

My actual query also has to do a few joins to get additional data for the results.

The issue is, that in order to sort the data by distance, it needs to calculate the distance over every single record in the database (currently around 100,000 records).

I can't cache the query, as it would only be specific to those original coordinates.

Is there anyway to limit the data that has to be calculated? E.g a reliable rough calculation for nearby shops, say +/- 3 degrees for the lat + lng? So that it only has to process a subset of the data?

If anyone has any experience in this sort of optimisation, I'd love some advice, thanks.

Upvotes: 5

Views: 2110

Answers (1)

Shadow
Shadow

Reputation: 34304

Yes, you can use some simple approximation in where criteria to filter out those locations that are obvioulsy out of the radius. This great blog post titled "Fast nearest-location finder for SQL (MySQL, PostgreSQL, SQL Server)" describes such optimizations:

Remember, from our background information earlier in this article, that a degree of latitude is 111.045 km. So, if we have an index on our latitude column, we can use a SQL clause like this to eliminate the points that are too far north or too far south to possibly be within 50 km.

latitude BETWEEN latpoint - (50.0 / 111.045) AND latpoint + (50.0 / 111.045)

This WHERE clause lets MySQL use an index to omit lots of latitude points before computing the haversine distance formula. It allows MySQL to perform a range scan on the latitude index.

Finally, we can use a similar but more complex SQL clause to eliminate points that are too far east or west. This clause is more complex because degrees of longitude are smaller distances the further away from the equator we move. This is the formula.

longitude BETWEEN longpoint - (50.0 / (111.045 * COS(RADIANS(latpoint)))) AND longpoint + (50.0 / (111.045 * COS(RADIANS(latpoint))))

So, putting it all together, this query finds the neareast 15 points that are within a bounding box of 50km of the (latpoint,longpoint).

The above describes the theoretical background to bounding rectangles.

Upvotes: 5

Related Questions