tkooser
tkooser

Reputation: 125

MySQL Delete with COUNT in INNER JOIN not deleting all records

I'm trying to delete records from a table if there are more than 15 from the same company (as in, if there are more than 15 records from a company, all of that companies records are deleted).

The query I'm using is as follows:

DELETE l.* FROM literature_request l 
INNER JOIN 
    (SELECT literature_request_id 
     FROM literature_request 
     GROUP BY company 
     HAVING COUNT(*) > 15) lr 
ON l.literature_request_id = lr.literature_request_id;

This should delete somewhere around 25,000 rows, but strangely only deletes around 500 each time I run it:

mysql> DELETE l.* FROM literature_request l INNER JOIN (SELECT literature_request_id FROM literature_request GROUP BY company HAVING COUNT(*) > 15) lr ON l.literature_request_id = lr.literature_request_id;
Query OK, 564 rows affected (0.39 sec)

mysql> DELETE l.* FROM literature_request l INNER JOIN (SELECT literature_request_id FROM literature_request GROUP BY company HAVING COUNT(*) > 15) lr ON l.literature_request_id = lr.literature_request_id;
Query OK, 547 rows affected (2.24 sec)

mysql> DELETE l.* FROM literature_request l INNER JOIN (SELECT literature_request_id FROM literature_request GROUP BY company HAVING COUNT(*) > 15 LIMIT 0,30000) lr ON l.literature_request_id = lr.literature_request_id;
Query OK, 533 rows affected (1.27 sec)

mysql> DELETE l.* FROM literature_request l INNER JOIN (SELECT literature_request_id FROM literature_request GROUP BY company HAVING COUNT(*) > 15 LIMIT 0,30000) lr ON l.literature_request_id = lr.literature_request_id;
Query OK, 523 rows affected (0.43 sec)

I'm not adding new rows in between runs, so I'm wondering why it deletes ~500 each time I run it. Shouldn't it delete all of them the first time, and then not affect any rows with subsequent runs?

Is there an error on my part here, or is there a better way to accomplish what I'm trying to do?

Upvotes: 2

Views: 1427

Answers (1)

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115530

If I understand correctly, you want to join on the company:

DELETE l FROM literature_request l 
INNER JOIN 
    (SELECT company 
     FROM literature_request 
     GROUP BY company 
     HAVING COUNT(*) > 15) lr 
ON l.company = lr.company 

The reason behind the weird behaviour is that in the subquery, you were grouping by company but you were selecting the literature_request_id. But which one - of the 15 or more - should the SQL engine be selecting? It was choosing one (more or less randomly), so only that row (1 row from every company) was deleted.

Upvotes: 2

Related Questions