user9371654
user9371654

Reputation: 2398

Simple SQL query lasts forever

I am using mysql-workbench and mysql server in ubunt 18 machine with 16 GB RAM.

I have a schema named ips, and two tables, say: table1 and table2. In table1 and table2 there are two fields: ip and description, bit are of type string. I have a lot of record. table1 has 779938 records and table2 has 136657 records.

I need to make a joint query to find the number of ips in table2 that has a description starts with str1% and does not contains str2 and does not contains str3. In the same time, those ips has a description in table1 that does not start with str1%, and contains either str2 or str3.

This is my query:

SELECT COUNT(`table2`.`ip`)
FROM `ips`.`table2`, `ips`.`table1`
WHERE `table2`.`ip` = `table1`.`ip`
       AND (LOWER(`table1`.`description`) NOT LIKE 'str1%' 
             AND (LOWER(`tabl1`.`description`) LIKE '%-str2-%' 
                   OR LOWER(`table1`.`description`) LIKE '%-str3-%'
                 )
            )
       AND (LOWER(`table2`.`description`) LIKE 'str1%'
            AND LOWER(`table2`.`description`) NOT LIKE '%-str2-%' 
            AND LOWER(`table2`.`description`) NOT LIKE '%-str3-%'
           );

However, the query never ends. The duration has ? and I never get result. Can you please help?

EDIT:

Here are the SHOW CREATE TABLE and

1) SHOW CREATE TABLEips.table2;

CREATE TABLE `table2` (
  `ip` varchar(500) DEFAULT NULL,
  `description` varchar(500) DEFAULT NULL,
  `type` varchar(500) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

2) SHOW CREATE TABLEips.table1;

CREATE TABLE `table1` (
  `ip` varchar(500) DEFAULT NULL,
  `description` varchar(500) DEFAULT NULL,
  `type` varchar(500) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

3) EXPLAIN <query>

# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, table2, , ALL, , , , , 136109, 100.00, Using where
1, SIMPLE, table1, , ALL, , , , , 786072, 10.00, Using where; Using join buffer (Block Nested Loop)

EDIT 2:

The data for ip field are string in this format: str.str.str.str The decription field is in this format: str1-str2-str3-str4

Upvotes: 0

Views: 263

Answers (2)

user9371654
user9371654

Reputation: 2398

The previous answer regarding Indexing might optimise the query. It might be correct. But I am sorry that I have to check the answer I used to solve the problem. Thanks to @Raymond Nijland for being first to point the indexing issue which reminded me of the primary keys.

The source of the problem is that both tables in the query did not have primary key. The primary key must be for a key that is unique and not null. In my case I already have the ip field ready to server as the primary key. Since I use mysql- workbench I right click the tables, click Alter Table then check the primary key for the approperiate field as follows:

enter image description here

That solved my problem.

Upvotes: 1

The Impaler
The Impaler

Reputation: 48850

You are getting the ALL operator in the execution plan because the SQL planner is not using any index. It's performing a Full Table Scan on both tables.

A Full Table Scan can be optimal when you are selecting more than 5% of the rows. In your case this could be good if your string prefix "str1" had a single letter. If it has more than one character, then the usage on an index could greatly improve the performance.

Now, the comparisong you are performing is not a simple one. You are not comparing the value of a column, but the result of an expression: LOWER(table1.description). Therefore you need to create virtual columns and index them if you want this query to be fast. This is available on MySQL 5.7 and newer:

alter table table1 add lower_desc varchar(50) 
  generated always as (LOWER(description)) virtual;
create index ix1 on table1 (lower_desc);

alter table table2 add lower_desc varchar(50) 
  generated always as (LOWER(description)) virtual;
create index ix2 on table2 (lower_desc);

These indexes will make your queries faster when the prefix has two or more characters. Get the execution plan again. Now, the operators ALL should not be there anymore (INDEX operators should show up in their place now).

Incidentally, I think your missed a join in the query. I think it should look like (I added the third line):

SELECT COUNT(`table2`.`ip`)
FROM `ips`.`table2`
JOIN `ips`.`table1` on `ips`.`table1`.ip = `ips`.`table2`.ip
WHERE `table2`.`ip` = `table1`.`ip`
       AND (LOWER(`table1`.`description`) NOT LIKE 'str1%' 
             AND (LOWER(`tabl1`.`description`) LIKE '%-str2-%' 
                   OR LOWER(`table1`.`description`) LIKE '%-str3-%'
                 )
            )
       AND (LOWER(`table2`.`description`) LIKE 'str1%'
            AND LOWER(`table2`.`description`) NOT LIKE '%-str2-%' 
            AND LOWER(`table2`.`description`) NOT LIKE '%-str3-%'
           );

Also, to optimize the join performance you'll need one (or both) of the indexes shown below:

create index ix3 on table1 (ip);
create index ix4 on table2 (ip);

Upvotes: 0

Related Questions