Arthur Kushman
Arthur Kushman

Reputation: 3609

Optimize sql statement with OR in where clause

running this query on 5Gb sized messages table. The problem is that the execution takes > 3 minutes.

SELECT m.id FROM messages m 
          LEFT JOIN dialog d on m.id=d.mid 
          WHERE (SELECT count(*) 
          FROM dialog 
          WHERE (m.from_id=uid1 and m.user_id=uid2) 
          OR (m.from_id=uid2 and m.user_id=uid1))=0 && read_state=0             
          LIMIT 100            

I understand, that this is a bad practice to search by NESTED SELECT IN WHERE CLAUSE, but yet didn`t find another way to select such rows. Tried to split OR to 2 UNION statements, but it was long either.

messages table structure:

+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| id         | int(11)     | NO   | PRI | NULL    | auto_increment |
| from_id    | int(11)     | NO   | MUL | NULL    |                |
| user_id    | int(11)     | NO   | MUL | NULL    |                |
| group_id   | int(11)     | NO   |     | NULL    |                |
| to_number  | varchar(30) | NO   | MUL | NULL    |                |
| msg        | text        | NO   |     | NULL    |                |
| image      | varchar(20) | NO   |     | NULL    |                |
| date       | bigint(20)  | NO   |     | NULL    |                |
| read_state | tinyint(1)  | NO   |     | 0       |                |
| removed    | tinyint(1)  | NO   |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+

dialog table structure

+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(11)          | NO   | PRI | NULL    | auto_increment |
| uid1      | int(11)          | NO   | MUL | NULL    |                |
| uid2      | int(11)          | NO   | MUL | NULL    |                |
| mid       | int(11)          | NO   |     | NULL    |                |
| anonym_id | int(10) unsigned | NO   |     | NULL    |                |
+-----------+------------------+------+-----+---------+----------------+

mysql> show index from messages;

+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| messages |          0 | PRIMARY   |            1 | id          | A         |    12560908 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | to_number |            1 | to_number   | A         |      161037 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | from_id   |            1 | from_id     | A         |      157011 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | from_id   |            2 | to_number   | A         |      169742 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2 |            1 | user_id     | A         |      314022 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2 |            2 | read_state  | A         |      380633 |     NULL | NULL   |      | BTREE      |         |               |
| messages |          1 | user_id_2 |            3 | removed     | A         |      392528 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

mysql> show index from dialog;

+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| dialog |          0 | PRIMARY  |            1 | id          | A         |     3125615 |     NULL | NULL   |      | BTREE      |         |               |
| dialog |          1 | uid1     |            1 | uid1        | A         |      520935 |     NULL | NULL   |      | BTREE      |         |               |
| dialog |          1 | uid1     |            2 | uid2        | A         |     3125615 |     NULL | NULL   |      | BTREE      |         |               |
| dialog |          1 | uid2     |            1 | uid2        | A         |     1562807 |     NULL | NULL   |      | BTREE      |         |               |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

EXPLAIN EXTENDED

+----+--------------------+--------+-------+---------------+------+---------+------+----------+----------+--------------------------+
| id | select_type        | table  | type  | possible_keys | key  | key_len | ref  | rows     | filtered | Extra                    |
+----+--------------------+--------+-------+---------------+------+---------+------+----------+----------+--------------------------+
|  1 | PRIMARY            | m      | ALL   | NULL          | NULL | NULL    | NULL | 22190398 |   100.00 | Using where              |
|  1 | PRIMARY            | d      | ALL   | NULL          | NULL | NULL    | NULL |  3125621 |   100.00 |                          |
|  2 | DEPENDENT SUBQUERY | dialog | index | uid1,uid2     | uid1 | 8       | NULL |  3125621 |   100.00 | Using where; Using index |
+----+--------------------+--------+-------+---------------+------+---------+------+----------+----------+--------------------------+

Upvotes: 3

Views: 102

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269445

The first thought is to change the subquery query to use not exists instead of count(*). The second is to split this into two separate subqueries. The third is to add indexes:

create index idx_messages_read_state_4 on messages_read_state(user_id, from_id, user_id, id);
create index idx_dialog_2 on dialog(uid1, uid2)

And the fourth is to remove the left join to dialog in the outer query. No fields from dialog are being used and the left join means it is not being used for filtering.

The query is then:

select m.id
from messages m
where m.read_state = 0 and
      not exists (select 1
                  from dialog d
                  where m.from_id = d.uid1 and m.user_id = d.uid2
                 ) and
      not exists (select 1
                  from dialog d
                  where m.from_id = d.uid2 and m.user_id = d.uid1
                 )
limit 100;

Upvotes: 4

Related Questions