Reputation: 507
I have a few million database rows in MySQL InnoDB database which need to get cleaned up. Look at this example.
SELECT archiveid, clearingid, pickdate
FROM tblclearingarchive
WHERE clearingid = 30978729
ORDER BY pickdate;
+-----------+------------+---------------------+
| archiveid | clearingid | pickdate |
+-----------+------------+---------------------+
| 34328367 | 30978729 | NULL | *
| 34333844 | 30978729 | 2015-10-27 15:55:30 | <- keep only this row with oldest date
| 34438038 | 30978729 | 2016-03-01 10:34:25 | *
| 34481472 | 30978729 | 2016-04-20 13:44:19 | *
+-----------+------------+---------------------+
4 rows in set (0.01 sec)
So I know the clearingid value(s) of the affected field(s) and want to remove the one with no pickdate (null) and the two lines which are redundant (later, after first pick). In the example above, the ones marked with * should get deleted.
Any hints about how such SQL update/delete might look like?
There are about 30M rows and about 250K rows (known clearingid's) to clean up.
With the initial idea of Matias Barrios I found this solution to verify. It perfectly lists the rows I want to delete:
SELECT archiveid, clearingid, pickdate
FROM tblclearingarchive
WHERE clearingid = 30978729
AND (pickdate NOT IN (SELECT MIN(pickdate)
FROM tblclearingarchive
WHERE clearingid = 30978729 )
OR pickdate is NULL)
ORDER BY pickdate;
+-----------+------------+---------------------+
| archiveid | clearingid | pickdate |
+-----------+------------+---------------------+
| 34328367 | 30978729 | NULL |
| 34438038 | 30978729 | 2016-03-01 10:34:25 |
| 34481472 | 30978729 | 2016-04-20 13:44:19 |
+-----------+------------+---------------------+
3 rows in set (0.20 sec)
But I fail to delete using this sort of query:
DELETE FROM tblclearingarchive
WHERE clearingid = 30978729
AND (pickdate NOT IN (SELECT MIN(pickdate)
FROM tblclearingarchive
WHERE clearingid = 30978729 )
OR pickdate is NULL);
ERROR 1093 (HY000): You can't specify target table 'tblclearingarchive' for update in FROM clause
Upvotes: 2
Views: 877
Reputation: 5056
This should do as you intend.
DELETE FROM tblclearingarchive
WHERE archiveid IN (
SELECT * FROM ( SELECT archiveid, clearingid, pickdate
FROM tblclearingarchive
WHERE clearingid = 30978729
AND pickdate NOT IN ( SELECT max(pickdate)
FROM tblclearingarchive LIMIT 1)
ORDER BY pickdate) tmpTable
) OR pickdate IS NULL
Let me know if it works.
Upvotes: 0
Reputation: 9083
You will need to do this:
Create "assistance table" so you can do the delete
create table test as
select * from tblclearingarchive
Then do the delete:
delete from tblclearingarchive t1
where t1.pickdate <> (select min(t.pickdate)
from test t
group by clearingid)
or t1.pickdate is null;
Upvotes: 1
Reputation: 1269503
If you want the first, then you can filter:
select t.*
from t
where t.pickdate = (select min(t2.pickdate)
from t t2
where t2.clearingid = t.clearingid
);
Or, if you just want one row:
select t.*
from t
where t.pickdate is not null
order by t.pickdate
limit 1;
EDIT:
If you want to actually modify the table:
delete t
from t join
(select clearingid, min(pickdate) as min_pickdate
from t
group by clearingid
) c
on t.clearingid = c.clearingid
where t.pickdate > c.min_pickdate or t.pickdate is null;
Upvotes: 0