Reputation: 309
Having terrible problems with the below query crippling performance of a script, with it often taking 10-30 seconds to complete. Wondering if anyone might have any optimization advice, specific or general is fine--I'm no querysmith.
Tinkering with types and indexes as well as the query itself is certainly doable.
SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
LEFT JOIN TABLE_1 AS t2
ON t1.column_1 = t2.column_1
AND t1.column_3 = t2.column_3
AND t2.int_value = 1
AND t2.column_4 = 'test_string_1'
WHERE t1.column_5 = 'text_string_2';
Size of TABLE_1 ~ 6 million rows
Structure of TABLE_1:
+--------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| column_1 | bigint(12) | YES | MUL | NULL | |
| column_4 | varchar(100) | YES | MUL | NULL | |
| column_5 | varchar(140) | YES | | NULL | |
| column_2 | varchar(15) | YES | MUL | NULL | |
| int_value | int(1) | YES | MUL | NULL | |
| last_updated | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+--------------+--------------+------+-----+-------------------+-----------------------------+
Upvotes: 0
Views: 50
Reputation: 1269773
For this query, you want the correct indexes:
SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1 LEFT JOIN
TABLE_1 AS t2
ON t1.column_1 = t2.column_1 AND
t1.column_3 = t2.column_3 AND
t2.int_value = 1 AND
t2.column_4 = 'test_string_1'
WHERE t1.column_5 = 'text_string_2';
As written, this would be: TABLE_1(column_5, column_1, column3, column_2)
and TABLE_2(column_1, column_3, int_value, column_4)
.
However, I think the query can be greatly simplified. A LEFT JOIN
keeps all rows in the first table, regardless of whether or not the conditions match. The WHERE
condition is only on the first table and the columns are only from the first table, so the query should be equivalent to:
SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
WHERE t1.column_5 = 'text_string_2';
It is possible that the DISTINCT
is not necessary. But the best index for this simplified version is TABLE_1(column_5, column_1, column_2)
.
Note: If you made a mistake in writing the query in the question, please ask another question instead of invalidating this answer.
Upvotes: 1
Reputation: 27
Get rid of the DISTINCT and try the HAVING statement: maybe this will be faster:
SELECT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
LEFT JOIN TABLE_1 AS t2
ON t1.column_1 = t2.column_1
AND t1.column_3 = t2.column_3
HAVING t1.column_5 = 'text_string_2' AND t2.column_4 = 'test_string_1' AND t2.int_value = 1 ;
Upvotes: 0