Vit Kos
Vit Kos

Reputation: 5755

How to speed up MySQL query that is a carthesian

Suppose having a table named contact with next structure:

id INT -- primary key,autoincrement,index 
firstname VARCHAR (255),
lastname VARCHAR(255),
type ENUM

Executing such query:

SELECT c1.id AS c1_id, c2.id AS c2_id
FROM contact c1
INNER JOIN contact c2 ON c1.firstname = c2.firstname AND c1.lastname = c2.lastname
WHERE c1.id <> c2.id AND c1.type=c2.type

is ok on a small ammount of records... but when the record count grows from 30-40 to 1000 this query is very slow.Need to abstract from the record count and speed up this query maximum as possible.Any suggestions?

Upvotes: 1

Views: 75

Answers (3)

Strawberry
Strawberry

Reputation: 33945

Except for the "<>", your query's fine (In fact, it may even be fractionally faster than the alternatives put forward!). It's just your indexes that need some work...

 CREATE TABLE contact
 (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
 ,firstname VARCHAR(255) NOT NULL
 ,lastname VARCHAR(255) NOT NULL
 ,type ENUM('small','medium','large') NOT NULL
 );

 INSERT INTO contact VALUES
 (NULL,'John','Brown','small'),
 (NULL,'Bill','Red','small'),
 (NULL,'Paul','Orange','medium'),
 (NULL,'Mike','Green','large'),
 (NULL,'John','Scarlet','small'),
 (NULL,'John','Cyan','medium'),
 (NULL,'Fiona','Brown','large'),
 (NULL,'John','Brown','small'),
 (NULL,'Chris','Copper','medium'),
 (NULL,'Steve','Silver','large');

 INSERT INTO contact SELECT NULL,x.firstname, y.lastname, z.type FROM contact x, contact y, contact z;

SELECT COUNT(*) FROM contact;
+----------+
| COUNT(*) |
+----------+
|     1010 |
+----------+
1 row in set (0.01 sec) 

 SELECT c1.id c1_id
      , c2.id c2_id
   FROM contact c1
   JOIN contact c2 
     ON c1.firstname = c2.firstname 
    AND c1.lastname = c2.lastname
    AND c1.type=c2.type
  WHERE c1.id < c2.id; 

  ...
  ...
  |  1006 |  1008 |
  +-------+-------+
  5634 rows in set (0.16 sec)

So, now let's add an index on (firstname,lastname,type)...

 DROP TABLE IF EXISTS contact;

 CREATE TABLE contact
 (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
 ,firstname VARCHAR(255) NOT NULL
 ,lastname VARCHAR(255) NOT NULL
 ,type ENUM('small','medium','large') NOT NULL
 ,INDEX(firstname,lastname,type)
 );

 INSERT INTO contact VALUES
 (NULL,'John','Brown','small'),
 (NULL,'Bill','Red','small'),
 (NULL,'Paul','Orange','medium'),
 (NULL,'Mike','Green','large'),
 (NULL,'John','Scarlet','small'),
 (NULL,'John','Cyan','medium'),
 (NULL,'Fiona','Brown','large'),
 (NULL,'John','Brown','small'),
 (NULL,'Chris','Copper','medium'),
 (NULL,'Steve','Silver','large');

 INSERT INTO contact SELECT NULL,x.firstname, y.lastname, z.type FROM contact x, contact y, contact z;

 SELECT c1.id c1_id
      , c2.id c2_id
   FROM contact c1
   JOIN contact c2 
     ON c1.firstname = c2.firstname 
    AND c1.lastname = c2.lastname
    AND c1.type = c2.type
    AND c1.id < c2.id; 

  |   775 |   776 |
  ...
  |  1006 |  1008 |
  +-------+-------+
  5634 rows in set (0.05 sec)

Upvotes: 1

fthiella
fthiella

Reputation: 49089

You can try with something like this:

SELECT GROUP_CONCAT(DISTINCT id), firstname, lastname
FROM contact
GROUP BY firstname, lastname
HAVING COUNT(DISTINCT id)>1

this will return all duplicated names. If you want the IDs you could then use a JOIN:

SELECT
  contact.id
FROM
  contact INNER JOIN (SELECT firstname, lastname
                      FROM contact
                      GROUP BY firstname, lastname
                      HAVING COUNT(DISTINCT id)>1) dup
  ON contact.firstname=dup.firstname AND contact.lastname=dup.lastname

Upvotes: 4

Damien Overeem
Damien Overeem

Reputation: 4529

Small deviation from fthiella's answer (just incase this is what you need):

SELECT group_concat(id) as ids, firstname, lastname
FROM contact
GROUP BY firstname, lastname

The query above will fill column ids with a comma seperated list of id's for each firstname+lastname combo.

Upvotes: 1

Related Questions