ProgrammerGirl
ProgrammerGirl

Reputation: 3223

How to make ORDER BY in JOIN query faster? Nothing I have tried has worked

I have the following JOIN query:

SELECT
    table1.*, 
    table2.*
FROM 
    Table1 AS table1 
LEFT JOIN 
    Table2 AS table2 
USING 
    (col1)
LEFT JOIN 
    Table3 as table3 
USING 
    (col1) 
WHERE 
    3963.191 * 
    ACOS(
    (SIN(PI() * $usersLatitude / 180) * SIN(PI() * table3.latitude / 180)) 
    +
    (COS(PI() * $usersLatitude / 180) * COS(PI() * table3.latitude / 180) * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180))
    ) <= 10 
AND 
    table1.col1 != '1' 
AND 
    table1.col2 LIKE 'A' 
AND 
    (table1.col3 LIKE 'X' OR table1.col3 LIKE 'X-Y') 
AND 
    (table2.col4 = 'Y' OR table2.col5 = 'Y') 


// Data Types of all columns in the query:
// col1: int(11)
// col2: char(1)
// col3: varchar(3)
// col4: char(1)
// col5: char(1)
// col6: int(11)
// latitude: varchar(25)
// longitude: varchar(25)

// All 3 tables (table1, table2, and table3) are `MyISAM`.

It executes in under 0.15 seconds.

However, if I simply add:

ORDER BY 
    table1.col6 DESC 

It executes in over 3 seconds.

All columns in the query are indexed, including the table1.col6 used in the ORDER BY.

Here are the results of EXPLAIN EXTENDED WITHOUT ORDER BY:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  table1  ALL PRIMARY,col2,col3   NULL    NULL    NULL    140101  72.61   Using where
1   SIMPLE  table2  eq_ref  PRIMARY,col4,col5   PRIMARY 4   table1.col1 1   100 Using where
1   SIMPLE  table3  eq_ref  PRIMARY PRIMARY 4   table1.col1 1   100 Using where

And here are the results of EXPLAIN EXTENDED WITH ORDER BY:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  table1  ALL PRIMARY,col2,col3   NULL    NULL    NULL    140101  72.61   Using where; Using filesort
1   SIMPLE  table2  eq_ref  PRIMARY,col4,col5   PRIMARY 4   table1.col1 1   100 Using where
1   SIMPLE  table3  eq_ref  PRIMARY PRIMARY 4   table1.col1 1   100 Using where

What's strange is that I use ORDER BY DESC in several other queries on this site, and it doesn't slow it down anywhere near as much as with this particular query. There is something specific to this query that is causing it to significantly slow down with the ORDER BY.

I also did an ANALYZE TABLE on all 3 tables, and they all reported OK. I then replaced every LIKE in the query with = and it actually made the query WITHOUT the ORDER BY go from 0.2 seconds to 3 seconds. In other words, replacing LIKE with = makes the original query take just as long as adding ORDER BY! How is that possible, considering LIKE does more work than =? Perhaps therein lies the clue as to why the ORDER BY takes so long?

HERE'S WHAT I'VE TRIED SO FAR (UNSUCCESSFULLY):

1) Instead of SELECT table1.*, table2.*, I tried just SELECT table1.col1 and it still took over 3 seconds to complete.

2) I tried adding a composite index on col1, col2, col3, and col6 in Table1, but it did not improve the execution speed.

3) I tried this solution of making the query as a sub-query and then wrapping the ORDER BY outside it at the end, but it did not improve the execution speed.

4) I tried the following version of the query, but it did NOT improve anything, and actually made the query take over 3 seconds WITHOUT even the ORDER BY added to it (perhaps that provides yet another clue):

SELECT STRAIGHT_JOIN
      T1.*, 
      T2.*
   FROM 
      Table1 AS T1
         JOIN Table2 AS T2
            ON T1.Col1 = T2.Col1
            AND ( T2.Col4 = 'Y' OR T2.Col5 = 'Y' )
         JOIN Table3 as T3
            ON T1.Col1 = T3.Col1
            AND 3963.191 
               * ACOS(  (SIN(PI() * $usersLatitude / 180) * SIN(PI() * T3.latitude / 180)) 
                                + (  COS(PI() * $usersLatitude / 180) * COS(PI() * T3.latitude / 180) 
                                   * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180)
                        )   
                     ) <= 10 
   WHERE
          T1.Col2 LIKE 'A'
      AND ( T1.col3 LIKE 'X' OR T1.col3 LIKE 'X-Y') 
      AND T1.Col1 != '1'
   ORDER BY
      T1.Col6

// With the following composite indexes:
// On Table 1, index on ( Col2, Col3, Col1, Col6 )
// On Table 2, index on ( Col1, Col4, Col5 )

// Remember, all individual columns are already indexed.

...

How can I get this stubborn query to run fast WITH the ORDER BY? Or is that just not possible?


EDIT:

Results of SHOW CREATE TABLE for all 3 Tables:

CREATE TABLE `Table1` (
 `col1` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `col100` varchar(25) CHARACTER SET utf8 DEFAULT NULL,
 `col101` varchar(60) COLLATE utf8_bin DEFAULT NULL,
 `col102` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
 `col103` varchar(10) COLLATE utf8_bin DEFAULT '00000000',
 `col104` date NOT NULL,
 `col105` int(3) DEFAULT NULL,
 `col106` varchar(25) COLLATE utf8_bin DEFAULT NULL,
 `col107` varchar(20) COLLATE utf8_bin DEFAULT 'Blah',
 `col108` varchar(2) COLLATE utf8_bin DEFAULT 'No',
 `col109` varchar(15) COLLATE utf8_bin DEFAULT 'Blah',
 `col2` enum('A','B') COLLATE utf8_bin DEFAULT NULL,
 `col3` enum('A','B','A-B') COLLATE utf8_bin DEFAULT NULL,
 `col110` decimal(10,7) NOT NULL DEFAULT '0.0000000',
 `col111` decimal(10,7) NOT NULL DEFAULT '0.0000000',
 `col112` char(1) COLLATE utf8_bin DEFAULT 'N',
 `col113` char(1) COLLATE utf8_bin DEFAULT 'N',
 `col114` int(11) DEFAULT NULL,
 `col115` varchar(15) COLLATE utf8_bin DEFAULT 'Blah',
 `col6` int(11) DEFAULT NULL,
 `col117` varchar(45) COLLATE utf8_bin DEFAULT NULL,
 `col118` varchar(2) COLLATE utf8_bin NOT NULL,
 `col119` tinyint(2) NOT NULL,
 `col120` int(6) NOT NULL,
 `col121` varchar(7) COLLATE utf8_bin NOT NULL,
 `col122` varchar(6) COLLATE utf8_bin NOT NULL,
 `col123` char(1) COLLATE utf8_bin NOT NULL DEFAULT 'A',
 `col124` varchar(200) COLLATE utf8_bin NOT NULL,
 `col125` tinyint(4) NOT NULL,
 `col126` tinyint(1) NOT NULL,
 `col127` varchar(1) COLLATE utf8_bin NOT NULL DEFAULT 'A',
 `col128` tinyint(1) NOT NULL DEFAULT '0',
 `col129` smallint(5) unsigned NOT NULL,
 `col130` varchar(1) COLLATE utf8_bin NOT NULL DEFAULT 'A',
 `col131` int(11) NOT NULL,
 `col132` tinyint(1) NOT NULL,
 `col133` tinyint(1) NOT NULL,
 `col134` varchar(1) COLLATE utf8_bin NOT NULL,
 `col135` varchar(200) COLLATE utf8_bin NOT NULL,
 `col136` int(11) NOT NULL,
 `col137` int(10) unsigned NOT NULL,
 `col138` int(11) NOT NULL,
 `col139` tinyint(1) NOT NULL,
 `col140` tinyint(1) NOT NULL,
 `col141` tinyint(4) NOT NULL,
 `col142` varchar(25) COLLATE utf8_bin NOT NULL,
 `col143` varchar(25) COLLATE utf8_bin NOT NULL,
 `col144` tinyint(1) unsigned NOT NULL,
 `col145` tinyint(4) NOT NULL,
 PRIMARY KEY (`col1`),
 KEY `col2` (`col2`),
 KEY `col3` (`col3`),
 KEY `CompositeIndex0` (`col1`,`col2`,`col3`,`col6`),
 KEY `CompositeIndex1` (`col2`,`col3`,`col1`,`col6`),
 KEY `idx01` (`col1`,`col2`,`col3`)
 [19 other indexes that do not involve col1, col2, col3, or col6...]
) ENGINE=MyISAM AUTO_INCREMENT=160640 DEFAULT CHARSET=utf8 COLLATE=utf8_bin

//*******************************************************//

CREATE TABLE `Table2` (
 `col1` int(11) unsigned NOT NULL DEFAULT '0',
 `col201` varchar(45) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT 'Blah',
 `col202` varchar(45) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT 'Blah',
 `col203` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col204` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col205` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col206` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col207` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col208` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col209` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col210` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col211` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col212` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col213` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col214` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col215` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col216` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col217` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col218` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col219` varchar(45) COLLATE utf8_bin DEFAULT 'Blah',
 `col220` varchar(255) COLLATE utf8_bin DEFAULT 'Blah',
 `col221` varchar(255) COLLATE utf8_bin DEFAULT 'Blah',
 `col222` varchar(255) COLLATE utf8_bin DEFAULT 'Blah',
 `col223` varchar(255) COLLATE utf8_bin DEFAULT 'Blah',
 `col224` varchar(45) COLLATE utf8_bin DEFAULT ‘Blah’,
 `col225` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 `col4` char(1) COLLATE utf8_bin DEFAULT 'A',
 `col226` char(1) COLLATE utf8_bin DEFAULT 'A',
 `col227` varchar(5) COLLATE utf8_bin DEFAULT 'Blah',
 `col228` char(1) COLLATE utf8_bin NOT NULL,
 `col229` text COLLATE utf8_bin,
 `col5` char(1) COLLATE utf8_bin DEFAULT 'A',
 `col230` varchar(255) COLLATE utf8_bin DEFAULT 'Blah',
 `col231` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 `col232` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 `col233` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 PRIMARY KEY (`col1`),
 KEY `col4` (`col4`),
 KEY `col5` (`col5`),
 KEY `CompositeIndex1` (`col1`,`col4`,`col5`),
 [4 other indexes not involving col1, col4, col5...]
 FULLTEXT KEY `col220` (`col220`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin

//*******************************************************//

CREATE TABLE `Table3` (
 `col1` int(11) unsigned NOT NULL DEFAULT '0',
 `col300` varchar(255) COLLATE utf8_bin DEFAULT NULL,
 `latitude` varchar(25) COLLATE utf8_bin NOT NULL DEFAULT '0',
 `longitude` varchar(25) COLLATE utf8_bin NOT NULL DEFAULT '0',
 `col301` int(11) DEFAULT NULL,
 `static2` float(18,16) DEFAULT '0.0000000000000000',
 `static3` float(18,16) DEFAULT '0.0000000000000000',
 PRIMARY KEY (`col1`),
 KEY `latitude` (`latitude`),
 KEY `longitude` (`longitude`),
 KEY `static2` (`static2`),
 KEY `static3` (`static3`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin

EDIT 2:

Below is my MySQL Configuration File. Among other things, please notice how the sort-buffer-size is set to 1M. According to this, it should not be set above 256K or it can actually slow things down by "37x". Could that be part of the problem?

# The MySQL database server configuration file.

[mysqld]

open-files-limit                = 20000

thread-cache-size               = 16
table-open-cache                = 2048
table-definition-cache          = 512

query-cache-type                = 1
query-cache-size                = 32M
query-cache-limit               = 1M

sort-buffer-size                = 1M
read-buffer-size                = 1M
read-rnd-buffer-size            = 8M
join-buffer-size                = 1M

tmp-table-size                  = 64M 
max-heap-table-size             = 64M

back-log                        = 100
max-connections                 = 200
max-connect-errors              = 10000
max-allowed-packet              = 16M
interactive-timeout             = 600
wait-timeout                    = 180
net_read_timeout        = 30
net_write_timeout       = 30

back_log            = 128

myisam-sort-buffer-size         = 128M

innodb-buffer-pool-size         = 320M
innodb-log-buffer-size          = 4M

innodb-log-file-size           = 128M
innodb-log-files-in-group      = 2

innodb-file-per-table           = 1

[mysqldump]
max-allowed-packet      = 16M

On a different matter, here are the RESULTS OF EXPLAIN EXTENDED ON LATEST QUERY FROM IVAN:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  T1  ref PRIMARY,col2,col3,col1,CompositeIndex1,idx01    CompositeIndex1 2   const   92333   Using where; Using filesort
1   SIMPLE  T3  eq_ref  PRIMARY PRIMARY 4   T1.col1 1   Using where
1   SIMPLE  T2  eq_ref  PRIMARY,CompositeIndex1,idx_static1 PRIMARY 4   T1.col1 1   Using where

On a different matter, here is something VERY strange. The following version of the query WITH ORDER BY completes in just 0.2 seconds:

SELECT STRAIGHT_JOIN T1 . * , T2 . * 
FROM Table3 AS T3
JOIN Table2 AS T2 ON T3.col1 = T2.col1
AND (
T2.col4 = 'Y'
OR T2.col5 = 'Y'
)
JOIN Table1 AS T1 ON T3.col1 = T1.col1
AND 3963.191 * ACOS( (
SIN( PI( ) * - 87.8819594 /180 ) * SIN( PI( ) * T3.latitude /180 ) ) + ( COS( PI( ) * - 87.8819594 /180 ) * COS( PI( ) * T3.latitude /180 ) * COS( PI( ) * T3.longitude /180 - PI( )* 37.1092162 /180 ) )
) <=10
WHERE T1.col2 LIKE 'A'
AND (
T1.col3 LIKE 'X'
OR T1.col3 LIKE 'X-Y'
)
AND T1.col1 != '1'
ORDER BY T1.col6 DESC

Basically, this version of the query does a FROM Table3 AS T3 and JOIN tables 1 and 2 whereas the original query does FROM Table1 AS T1 and JOIN tables 2 and 3.

Here is the EXPLAIN EXTENDED for the query above:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  T3  ALL PRIMARY NULL    NULL    NULL    141923  100 Using where; Using temporary; Using filesort
1   SIMPLE  T2  eq_ref  PRIMARY,col4,col5,CompositeIndex1   PRIMARY 4   T3.col1 1   100 Using where
1   SIMPLE  T1  eq_ref  PRIMARY,col2,col3,col1,CompositeIndex1,idx01    PRIMARY 4   T2.col1 1   100 Using where

Notice how this query actually does BOTH a filesort and a temporary vs. just a filesort on the original and new queries from Ivan. How can that be 10x faster?

Even stranger, switching the order of the JOIN does not seem to improve neither the original query nor the newer queries from Ivan. Why is that?

Upvotes: 21

Views: 2758

Answers (5)

Ivan Buttinoni
Ivan Buttinoni

Reputation: 4145

Well, I suggest you some restyles of the query:

  1. put in where conditions not join related, see 2nd query:

    AND ( T1.col3 LIKE 'X' OR T1.col3 LIKE 'X-Y')

  2. avoid OR use IN

  3. avoid like use =

    AND T1.col3 IN ( 'X' , 'X-Y')

  4. avoid computation in where

create some new columns to store:

SIN(PI() * T3.latitude / 180)
COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180)
COS(PI() * T3.latitude / 180) 
  1. pre-evaluate

    SIN(PI() * $usersLatitude / 180) COS(PI() * $usersLatitude / 180)

  2. if all these "tricks" can't avoid file-sort force the indexes

mysql query index hint

FURTHER ADD

in order to remove:

( T2.Col4 = 'Y' OR T2.Col5 = 'Y' )

in this case you can't use IN, so create a new column that is the result of this expression.

alter table table2 add static1 bit default 0;
alter table add index idx_static1(static1);
update table2 t2 set static1=1 where ( T2.Col4 = 'Y' OR T2.Col5 = 'Y' );

alter table table3 add static2 float(18,16) default 0;
update table3 set static2=SIN(PI() * T3.latitude / 180) where 1

alter table table3 add static3 float(18,16) default 0;
update table3 set static3 = COS(PI() * T3.latitude / 180) * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180)   where 1

If table1.col2 has few values

alter table table1 change col2 col2 enum('A','B','C');

If table1.col3 has few values

alter table table1 change col3 col3 enum('X','Y','X-Y');

Create an unique index for all the columns involved in where alter table add index idx01 (col1,col2,col3)

SELECT STRAIGHT_JOIN
      T1.*, 
      T2.*
   FROM 
      Table1 AS T1
         JOIN Table2 AS T2 ON T1.Col1 = T2.Col1
         JOIN Table3 as T3 ON T1.Col1 = T3.Col1

   WHERE static1=1 AND
          T1.Col2 = 'A'
      AND T1.col3 IN ( 'X', 'X-Y') 
      AND T1.Col1 != 1
      AND ACOS(  
                 ( 
                   $usersLatitude_sin_pi_fract180  * t3.static2 
                   + $usersLatitude_cos_pi_fract180  * t3.static3 
                 )   
               ) <= 0,00252321929476 -- this's 10/3963.191
      ORDER BY T1.Col6

Your comment suggest me that you've a different collation in the query (col1 is latin1_swedish and col2 is utf8) or your connection use a different collation (your connection is utf-8 and you query a latin1_german column) so when you query:

t1.col2 = 'A'

Mysql has to convert from utf-8 to latin1 each value.

See also the collate section of mysql documentation.

A quick way is to convert all (column, table, server, connection, client) to the same collation singel byte will be better if you don't need utf-8.

Be carefull to my type error or syntax error I could done.

FURTHER ADD 2

I recreated the tables on a test DB and I fixed this columns: t1.col2, t2.col3 must not nullable, t1.col1 is primary and can't be null.

Index "t1.CompositeIndex1" should index only: col2, col3, col1; Index an "order by" column is unusefull or worst.

I create static1 and I create an index on t2.col1 and t2.static1 but with my 6 rows in DB is not used (see explain later). t2.static1 must not nullable too.

I also adapt the query to the collate of the columns:

SELECT  T1.*, T2.* 
FROM Table1 AS T1
         JOIN Table2 AS T2   ON ( T1.Col1 = T2.Col1   )
         JOIN Table3 as T3 ON T1.Col1 = T3.Col1
   WHERE  
         (  T1.Col2 =    'A'   collate utf8_bin  AND T1.col3 IN  ( 'X' collate utf8_bin , 'X-Y'  collate utf8_bin )   AND T1.Col1 != 1 )
and T2.static1=1
      AND ACOS(  (   2.3  * T3.static2  + 1.2 * T3.static3  ) ) <= 0.00252321929476 
      ORDER BY T1.Col6

Here follows the explain extended

+----+-------------+-------+--------+-----------------------------------+-----------------+---------+----------------+------+----------+-----------------------------+
| id | select_type | table | type   | possible_keys                     | key             | key_len | ref            | rows | filtered | Extra                       |
+----+-------------+-------+--------+-----------------------------------+-----------------+---------+----------------+------+----------+-----------------------------+
|  1 | SIMPLE      | T1    | ref    | PRIMARY,col2,col3,CompositeIndex1 | CompositeIndex1 | 1       | const          |    1 |   100.00 | Using where; Using filesort |
|  1 | SIMPLE      | T2    | eq_ref | PRIMARY,CompositeIndex1           | PRIMARY         | 4       | testdb.T1.col1 |    1 |   100.00 | Using where                 |
|  1 | SIMPLE      | T3    | eq_ref | PRIMARY                           | PRIMARY         | 4       | testdb.T1.col1 |    1 |   100.00 | Using where                 |
+----+-------------+-------+--------+-----------------------------------+-----------------+---------+----------------+------+----------+-----------------------------+ 

Is it the same for the colums: select_type, table, type, key, ref, filtered, Extra?

My optimization goals are: - fit the where conditions in few index - avoid calculations - avoid collate conversion - avoid OR - avoid NULLs in where condition

Now the bad news It seems that in the tables you've ~140K records, and query using order can imply the using of the filesort approach if the query involve a lot of rows, so the final answer can be increase the memsort buffer as suggest by @mavroprovato.

FURTHER ADD 3

To evauate the adequacy of key_buffer_size see on http://dba.stackexchange.com

FURTHER ADD 4

I think that only someone in Oracle can say exactly what happens, but I've my idea.

I think that this query is peculiar:

  1. all tables (t1,t2,t3) are joinded by primary key
  2. other conditions depends only by calcs (t3.colX )
  3. some conditions depend only by index (t1.colX )

Because of 1 from_table_rows >= join1_table_rows >= join2_table_rows, so less rows return the from table quickest will be the 2 other JOINs

The optimizer to evaluate the effort will calc a similar equation:

effort = num_rows*key_size/index_cardinality

(index_cardinality is show by phpmyadmin next each index)

Because of 2 effort is >= num_rows

My query because of 3 the the table1 (from table) returns 92333 rows, table3 (join1_table) reduce to 1(!) row, table2 keep 1 row (effort ~ 3).

Your query because of 2 you should have an effort = 140000, but luckylly for you the calc return just 1 results so your query is extremelly quick

Demostration

In your query changing from "<=10" (in join condition) to "<=1000" or more you will see a exponential decrease of performances.

In my query changing from "<=10" (in join condition) to "<=1000" or more you will see a linear/logarithm decrease of performances.

FURTHER ADD 5

answer to the question: is sort-buffer-size too big?

standing on article, yes, try some tune, may be you can solve the problem

answer to the question: is impossible to do a quick query?

IMHO no, it's possible (even if sort-buffer-size don't resolve).

My idea is quite simple, and it can resume in this mot: "cirlce is nice but square is better".

At the moment the biggest cardinality is on the coordinates in table3, but because of formula no index is applicable. So, instead of search all points inside a radius, you can search all points inside a "square"

FROM table3
...
WHERE (t3.latitude-0.15) < $usersLatitude AND  $usersLatitude < t3.latitude+0.15  
AND t3.longitue - 0.15 < $usersLongitude AND   $usersLongitude < t3.longitue + 0.15

so you can create an index in (t3.latitude,t3.longitue).

0.15 degrees should be 10 miles. Of course you should fix the calcs near the day-change meridian and next to the poles

If you need strictly a radius you can re-join table3 with the radius formula (see example below) or if possible carry out(/elaborate) the formula until you can compare directly values with columns.

FROM table3 t3
JOIN table3 t3bis ON t3.id=t3bis.id
...
WHERE (t3.latitude-0.15) < $usersLatitude AND  $usersLatitude < t3.latitude+0.15  
AND t3.longitue - 0.15 < $usersLongitude AND   $usersLongitude < t3.longitue + 0.15
AND 
3963.191 
* ACOS(  (SIN(PI() * $usersLatitude / 180) * SIN(PI() * t3bis.latitude / 180)) 
+ (  COS(PI() * $usersLatitude / 180) * COS(PI() * t3bis.latitude / 180) 
* COS(PI() * t3bis.longitude / 180 - PI() * 37.1092162 / 180)
)   
) <= 10 

FURTHER ADD 6

topic: compiled functions do it better

use of RADIANS() function

degree * PI / 180 == radians(degree)

Use of GIS extension of mysql

See this article about MySql GIS extension

Upvotes: 5

RolandoMySQLDBA
RolandoMySQLDBA

Reputation: 44343

There are three(3) things I see you can do:

1) Refactor the query

2) Apply the ORDER BY on Table1 earlier in the Query

3) Index Table1 to support refactor

Perhaps this...

ALTER TABLE Table1 ADD INDEX col2_col6_ndx (col2,col6);
SELECT
    table1.*, 
    table2.*
FROM 
    (
        SELECT * FROM Table1
        WHERE col2='A' AND 
        ORDER BY col6 DESC
    ) AS table1
LEFT JOIN 
    (
        SELECT * FROM Table2
        WHERE (col4='Y' OR col5='Y')
    ) AS table2 
USING 
    (col1)
LEFT JOIN 
    Table3 as table3 
USING 
    (col1) 
WHERE 
    table1.col1 != '1' AND
    table1.col3 IN ('X','X-Y') AND
    3963.191 * 
    ACOS(
    (SIN(PI() * $usersLatitude / 180) * SIN(PI() * table3.latitude / 180)) 
    +
    (COS(PI() * $usersLatitude / 180) * COS(PI() * table3.latitude / 180) * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180))
    ) <= 10 
;

Here is another variation that attempts to refactor the query so that only keys are gathered first (creating much smaller temp tables) and then JOINS are applied:

ALTER TABLE Table1 ADD INDEX col2613_ndx (col2,col6,col1,col3);
ALTER TABLE Table2 ADD INDEX col4_col1_ndx (col4,col1);
ALTER TABLE Table2 ADD INDEX col5_col1_ndx (col5,col1);
SELECT 
    table1.*, 
    table2.*
FROM
(
    SELECT table1.col1,table3.latitude,table3.longitude 
    FROM 
        (
            SELECT col1 FROM Table1 WHERE col2='A' AND
            AND col3 IN ('X','X-Y') ORDER BY col6 DESC
        ) AS table1
    LEFT JOIN 
        (
            SELECT col1 FROM Table2 WHERE col4='Y' UNION
            SELECT col1 FROM Table2 WHERE col5='Y'
        ) AS table2 
    USING (col1)
    LEFT JOIN Table3 as table3 USING (col1)
) col1_keys
LEFT JOIN Table1 table1 USING (col1)
LEFT JOIN Table2 table2 USING (col1)
WHERE 
    3963.191 * 
    ACOS(
    (SIN(PI() * $usersLatitude / 180) * SIN(PI() * col1_keys.latitude / 180)) 
    +
    (COS(PI() * $usersLatitude / 180) * COS(PI() * col1_keys.latitude / 180)
    * COS(PI() * col1_keys.longitude / 180 - PI() * 37.1092162 / 180))
    ) <= 10 
;

Give it a Try !!!

Upvotes: 0

ProgrammerGirl
ProgrammerGirl

Reputation: 3223

After much trial and error, I finally found the solution to my question.

If we put the entire WHERE clause -- except the part that calculates the radius -- outside the original query, then we get a very fast query that does not use temporary like changing the order of the JOIN does:

SELECT * FROM
{
    SELECT
        col1, col2, col3, col4, col5, col6
    FROM 
        Table1 AS table1 
    LEFT JOIN 
        Table2 AS table2 
    USING 
        (col1)
    LEFT JOIN 
        Table3 as table3 
    USING 
        (col1) 
    WHERE 
        3963.191 * 
        ACOS(
        (SIN(PI() * $usersLatitude / 180) * SIN(PI() * table3.latitude / 180)) 
        +
        (COS(PI() * $usersLatitude / 180) * COS(PI() * table3.latitude / 180) * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180))
        ) <= 10 
) AS sub
WHERE
    col1 != '1' 
AND 
    col2 LIKE 'A' 
AND 
    (col3 LIKE 'X' OR col3 LIKE 'X-Y') 
AND 
    (col4 = 'Y' OR col5 = 'Y') 
ORDER BY 
    col6 DESC 

Essentially, this query first gets the JOIN results of all 3 tables based on the radius, and only then applies the rest of the filters to get the results we need. This version of the query returns the exact same results as my original query, yet executes in just 0.2 seconds vs. over 3 seconds for my original query.

Here is the EXPLAIN EXTENDED for it:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   PRIMARY <derived2>  ALL NULL    NULL    NULL    NULL    43  100 Using where; Using filesort
2   DERIVED T3  ALL PRIMARY NULL    NULL    NULL    143153  100 Using where
2   DERIVED users   eq_ref  PRIMARY,col1,idx01  PRIMARY 4   T3.col1 1   100 
2   DERIVED userProfile eq_ref  PRIMARY,CompositeIndex1 PRIMARY 4   users.col1  1   100 

I wanted to thank Ivan Buttinoni for his excellent work on this. He found several clever ways of making this query even faster.

Moral of the Story: it's not just the ORDER BY clause that can be made faster by putting it outside the main query, you can also get a faster query by placing part of the WHERE clause outside it as well in situations such as this one.

Upvotes: 0

csiz
csiz

Reputation: 19

Try the first query:

...
FROM 
    Table1 AS table1 USE INDEX (col6)
LEFT JOIN 
    Table2 AS table2   
...

Upvotes: 1

Alexander Taver
Alexander Taver

Reputation: 474

What is the type of table1.col6? What its field diameter (max value length)?

BTW pre-calculate values, which do not depend on field values, like Ivan Buttinoni suggested. It will not help with ordering, but will make query faster

Upvotes: 0

Related Questions