Searay330
Searay330

Reputation: 143

Optimizing an sql query for efficiency

I have a query dealing with geo distances. The query is super fast, returning in roughly .1175 seconds on my 2.24 million row table. However, I only need the lowest distance, and using the built-in order by is way too slow.

Is there any way to just keep track of a running minimum and just give me that?

For example, if I have these results:

city a - 45km
city b - 48km
city c - 12km 

can I just have it give me the 12km, keeping in mind that all the distance values are calculated?

Here is the query that sorts:

SELECT 
City, 
( 
    6371 * 
    acos(
        cos(radians(-60.61384878636903)) * 
        cos(radians(st_x(location))) * 
        cos(radians(st_y(location)) - 
        radians(112.80061386895574)) + 
        sin(radians(-60.61384878636903)) * 
        sin(radians(st_x(location))))
    ) as distance 
FROM table_name 
HAVING distance <  5 
ORDER BY distance ASC LIMIT 1 

The table structure is as follows:

id - int(12)
location - Point()
City - varchar(255) 

The problem lies in that the order by flag takes way too long to sort the data and get the lowest. Can it just keep a running minimum and then just give me that without a major performance hit?

the table contain,

2227851 - rows
spatial index on location

the run times i am getting are around 14 seconds if i use order by if i don't use order by i am getting times of .1 seconds which is my desired run time or close to it

Upvotes: 1

Views: 102

Answers (2)

spencer7593
spencer7593

Reputation: 108420

No, there's not really a way to keep a running minimum, at the query level.

The basic problem is that the minimum distance is going to be different for different values of latitude and longitude, the search coordinates being supplied as literals in the query.

One option is to keep a table of previous search coordinates, ones you've looked up before, and then use that to short circuit the need to do another query. First do a search of the table of previous searches, and get the result from there.

Of course, if you add a row to table_name, you'd probably need to re-evaluate the saved search coordinates against the new row, and determine if the new row is a shorter distance than what you have saved. (Or just invalidate the entire store, and repopulate for each search you do.)


The basic problem is that the "great-circle distance" expression has to be evaluated for every row in table_name.

The result of that expression is going to differ for different search values (latitude and longitude).

There's no getting around doing that calculation for every row, and out of all those results, finding the lowest value. That's gonna be a "Using filesort" operation. With the LIMIT 1, we'd hope that MySQL doesn't have to sort the entire set, and that it can just take one pass to identify the smallest value.

If you can limit the number of rows from table_name that need to be evaluated, and you can exclude them efficiently using an index... that would speed up the query.

One way to limit the number of rows would be to define a "bounding box", based on the search latitude and longitude. And specify that in the WHERE clause. And have MySQL use an appropriate index. The roughest bounding box could be defined as +/-dx degrees of latitude and +/-dy degrees of longitude from the search coordinates... e.g.

 WHERE  t.lat BETWEEN  -60.613848 -4 AND -60.613848 +4 
   AND  t.lon BETWEEN  120.800613 -8 AND 120.800613 +8 

That's not an ideal bounding box, because degrees of longitude are much longer distance at the equator than they are near the poles.


As far as keeping "running minimum"... can't be done with your current query. And it can't be done without some other data store that is somehow keyed to the search parameters.

Upvotes: 1

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48197

Recomendations:

But if you dont want use it

  • Precalculate constants Set A = cos(radians(-60.61384878636903)) cos function are very slow.
  • Filter your sample data. If your point of origin is X,Y you can create a square X +- 5, Y +- 5 and use regular index on X,Y

Upvotes: 3

Related Questions