T. Brian Jones
T. Brian Jones

Reputation: 13522

Why do these MySQL Queries take such drastically different amounts of time to process when they seem equivalent?

Below are three MySQL Queries. The first returns everything I'm looking for in one query, while the second two return the same aggregate results, but in two queries.

I'm wondering why the single query takes 2-100s of times longer to run, when it seems like process time should be equivalent between all three queries. Is there a way to optimize the single query to run as fast as the individual queries? Adding more OR's inside the WHERE statement of the single query does not increase process time, but I have cases where I need to do many more ORs and eventually the single query would be as fast as running ten individual queries.

The single query also seems to get cached after it's run, and can take minutes to run the first time, while the single queries always complete within the same timeframe.

Would a multi-column index make a big difference here?

Of note is that the table does not have an ID field as a primary index. Is this causing this undesired behavior?

It's hard to run tests here as the table has a hundred million rows and adding columns and indexes takes close to a day.

SINGLE QUERY (4.2s)

SELECT name_id
FROM staging_company_search
WHERE
    (name_word_0 = 'the' AND name_word_1 = 'glazier')
    OR (name_word_0 = 'bridgewaters' AND name_word_1 = '');

EQUIVALENT AGGREGATE QUERIES (0.8s each)

SELECT name_id
FROM staging_company_search
WHERE name_word_0 = 'the' AND name_word_1 = 'glazier';

SELECT name_id
FROM staging_company_search
WHERE name_word_0 = 'bridgewaters' AND name_word_1 = '';

EXPLAIN ON THESE QUERIES

id     select_type    table                     type    possible_keys              key          key_len  ref    rows     extra
1      SIMPLE         staging_company_search    range   name_word_0,name_word_1    name_word_0  102      NULL   2197605  Using index condition; Using where
1      SIMPLE         staging_company_search    ref     name_word_0,name_word_1    name_word_1  102      const  128      Using index condition; Using where
1      SIMPLE         staging_company_search    ref     name_word_0,name_word_1    name_word_0  102      const  33       Using index condition; Using where

DATABASE SCHEMA

CREATE TABLE `staging_company_search` (
  `name_id` int(11) unsigned NOT NULL DEFAULT '0',
  `name_word_0` varchar(100) NOT NULL,
  `name_word_1` varchar(100) NOT NULL,
  KEY `name_id` (`name_id`),
  KEY `name_word_0` (`name_word_0`),
  KEY `name_word_1` (`name_word_1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Upvotes: 5

Views: 76

Answers (3)

Ismael Padilla
Ismael Padilla

Reputation: 5566

I believe that this is because of how MySQL handles the indexes where it must check for several conditions ("OR"). This is seen in the "explain query", where the first query is expected to need to check more rows before returning the result.

I believe that the union of two small results should achieve a better performance. Can you try the following?

SELECT name_id
FROM staging_company_search
WHERE (name_word_0 = 'the' AND name_word_1 = 'glazier')
UNION ALL    
    SELECT name_id
    FROM staging_company_search
    WHERE (name_word_0 = 'bridgewaters' AND name_word_1 = '');

Upvotes: 2

e4c5
e4c5

Reputation: 53734

This is because mysql uses only one index for a simple query.

If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows (the most selective index).

However mysql can and will use two indexes since version 5.0 through Index merge optimization. Unfortunately it doesn't always happen and even when it does happen the results aren't all that great.

The first explain output shows that index merge optimization isn't being used in your query with the OR clause. It's only using the index on name_word_0

name_word_0 is great for WHERE name_word_0 = 'the' AND name_word_1 = 'glazier'; but as the output for the third explain shows, it's not at all suited for WHERE name_word_0 = 'bridgewaters' AND name_word_1 = '';

Therefore the combined query is really slow. YOu can overcome that by creating a composite index that spans name_word_0 and name_word_1. I noticed that your key len is really long. YOu can create a partial index and probably speed things up even further.

CREATE INDEX word01 ON staging_company_search (name_word_0(20), name_word_1(20))

Upvotes: 2

O. Jones
O. Jones

Reputation: 108651

Your use of an OR clause in the first query defeats the use of your index. MySQL is a little bit dumb that way. It's doing a full table scan: looking at every row. You're much better off using AND-only queries UNIONed together.

To make your AND-only queries even faster, create a compound index on (name_word_0, name_word_1, name_id). Your queries can be completely satisfied by a random access to that index, and should run subsecond with a two megarow table.

Upvotes: 2

Related Questions