Reputation: 2398
I am using mysql-workbench
and mysql
server in ubunt 18
machine with 16 GB RAM.
I have a schema named ips
, and two tables, say: table1
and table2
.
In table1
and table2
there are two fields: ip
and description
, bit are of type string. I have a lot of record. table1
has 779938 records and table2
has 136657 records.
I need to make a joint query to find the number of ip
s in table2
that has a description
starts with str1%
and does not contains str2
and does not contains str3
. In the same time, those ip
s has a description in table1
that does not start with str1%
, and contains either str2
or str3
.
This is my query:
SELECT COUNT(`table2`.`ip`)
FROM `ips`.`table2`, `ips`.`table1`
WHERE `table2`.`ip` = `table1`.`ip`
AND (LOWER(`table1`.`description`) NOT LIKE 'str1%'
AND (LOWER(`tabl1`.`description`) LIKE '%-str2-%'
OR LOWER(`table1`.`description`) LIKE '%-str3-%'
)
)
AND (LOWER(`table2`.`description`) LIKE 'str1%'
AND LOWER(`table2`.`description`) NOT LIKE '%-str2-%'
AND LOWER(`table2`.`description`) NOT LIKE '%-str3-%'
);
However, the query never ends. The duration has ?
and I never get result. Can you please help?
EDIT:
Here are the SHOW CREATE TABLE and
1) SHOW CREATE TABLE
ips.
table2;
CREATE TABLE `table2` (
`ip` varchar(500) DEFAULT NULL,
`description` varchar(500) DEFAULT NULL,
`type` varchar(500) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
2) SHOW CREATE TABLE
ips.
table1;
CREATE TABLE `table1` (
`ip` varchar(500) DEFAULT NULL,
`description` varchar(500) DEFAULT NULL,
`type` varchar(500) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
3) EXPLAIN <query>
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, table2, , ALL, , , , , 136109, 100.00, Using where
1, SIMPLE, table1, , ALL, , , , , 786072, 10.00, Using where; Using join buffer (Block Nested Loop)
EDIT 2:
The data for ip
field are string in this format: str.str.str.str
The decription
field is in this format: str1-str2-str3-str4
Upvotes: 0
Views: 263
Reputation: 2398
The previous answer regarding Indexing might optimise the query. It might be correct. But I am sorry that I have to check the answer I used to solve the problem. Thanks to @Raymond Nijland for being first to point the indexing issue which reminded me of the primary keys.
The source of the problem is that both tables in the query did not have primary key. The primary key must be for a key that is unique and not null. In my case I already have the ip
field ready to server as the primary key. Since I use mysql- workbench
I right click the tables, click Alter Table
then check the primary key for the approperiate field as follows:
That solved my problem.
Upvotes: 1
Reputation: 48850
You are getting the ALL
operator in the execution plan because the SQL planner is not using any index. It's performing a Full Table Scan on both tables.
A Full Table Scan can be optimal when you are selecting more than 5% of the rows. In your case this could be good if your string prefix "str1" had a single letter. If it has more than one character, then the usage on an index could greatly improve the performance.
Now, the comparisong you are performing is not a simple one. You are not comparing the value of a column, but the result of an expression: LOWER(table1.description)
. Therefore you need to create virtual columns and index them if you want this query to be fast. This is available on MySQL 5.7 and newer:
alter table table1 add lower_desc varchar(50)
generated always as (LOWER(description)) virtual;
create index ix1 on table1 (lower_desc);
alter table table2 add lower_desc varchar(50)
generated always as (LOWER(description)) virtual;
create index ix2 on table2 (lower_desc);
These indexes will make your queries faster when the prefix has two or more characters. Get the execution plan again. Now, the operators ALL
should not be there anymore (INDEX
operators should show up in their place now).
Incidentally, I think your missed a join in the query. I think it should look like (I added the third line):
SELECT COUNT(`table2`.`ip`)
FROM `ips`.`table2`
JOIN `ips`.`table1` on `ips`.`table1`.ip = `ips`.`table2`.ip
WHERE `table2`.`ip` = `table1`.`ip`
AND (LOWER(`table1`.`description`) NOT LIKE 'str1%'
AND (LOWER(`tabl1`.`description`) LIKE '%-str2-%'
OR LOWER(`table1`.`description`) LIKE '%-str3-%'
)
)
AND (LOWER(`table2`.`description`) LIKE 'str1%'
AND LOWER(`table2`.`description`) NOT LIKE '%-str2-%'
AND LOWER(`table2`.`description`) NOT LIKE '%-str3-%'
);
Also, to optimize the join performance you'll need one (or both) of the indexes shown below:
create index ix3 on table1 (ip);
create index ix4 on table2 (ip);
Upvotes: 0