Reputation: 1537
TLDR
I started deleting rows from a massive InnoDB table about 8 hours ago. When I checked 45 minutes ago, about 80M rows had been deleted; when I checked again a few minutes ago, the count was back to where I started, yet rows are still being deleted. Why?
The issue: my team inherited an app a few weeks ago, a reliable internal app (so not a showstopper) that updates reports from an internal data source. The original developer did a good job, with one exception: he did not clean up old records in a particular table when new data arrived, so over time the table grew to 500M records (each bi-weekly update adds a few million rows).
We decided to clean up the table (MySQL InnoDB) and add some logic to the update script to prevent this from happening again.
First step: I ran a simple DELETE that is taking hours. I kicked it off this afternoon and planned to let it run all night. My SQL is essentially:
DELETE FROM the_enormous_table WHERE batch_id != most_recent_batch_id;
About 45 minutes ago the script had deleted about 80M records, but when I checked again a few minutes ago, the row count seemed to have reset to where I started, yet the script continues to delete.
I am not a DBA, but I have done things like this before, just never with such a massive dataset. I have either never had to deal with the nuances of InnoDB locking and such, or I have forgotten them over the years.
What is the issue here? Do I need to break this down into smaller delete sets?
If that's the case, I may just recreate the table with the latest data set and then get rid of the old table.
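Something along these lines is what I have in mind (the _new/_old names are just illustrative, and this assumes no new rows land while the copy runs):

CREATE TABLE the_enormous_table_new LIKE the_enormous_table;

-- Copy only the rows worth keeping; most_recent_batch_id stands in for the actual value.
INSERT INTO the_enormous_table_new
SELECT * FROM the_enormous_table
WHERE batch_id = most_recent_batch_id;

-- Atomically swap the tables, then discard the old data.
RENAME TABLE the_enormous_table TO the_enormous_table_old,
             the_enormous_table_new TO the_enormous_table;
DROP TABLE the_enormous_table_old;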
Upvotes: 1
Views: 104
Reputation: 142453
For big DELETEs, there are better ways:

- Deleting the entire table -- TRUNCATE.
- Replacing the entire table -- build a new table, then use RENAME.
- Deleting a big part of a table -- walk through the table using the PRIMARY KEY to look at chunks of 1000 rows at a time (sketched below).
- Regular deletion of "old" rows -- PARTITION BY RANGE (also sketched below).
Details: http://mysql.rjweb.org/doc.php/deletebig
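A rough sketch of the chunked walk, in the spirit of that page (it assumes an AUTO_INCREMENT PRIMARY KEY named id, which is hypothetical here):

-- Run this repeatedly from a script until it affects 0 rows; each execution
-- removes one small chunk and releases its locks and undo log at commit.
DELETE FROM the_enormous_table
WHERE batch_id != most_recent_batch_id
ORDER BY id          -- walk the PRIMARY KEY
LIMIT 1000;

And a sketch of PARTITION BY RANGE for the ongoing purge (the column and boundary values are illustrative, and note the partition key must be part of every unique key on the table):

ALTER TABLE the_enormous_table
    PARTITION BY RANGE (batch_id) (
        PARTITION p100 VALUES LESS THAN (101),
        PARTITION p101 VALUES LESS THAN (102),
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );

-- Dropping a whole partition is nearly instant, unlike a big DELETE.
ALTER TABLE the_enormous_table DROP PARTITION p100;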
(Karwin's Comment explained the title problem -- row counts bouncing around for InnoDB.)
Upvotes: 1
Reputation: 74690
TRUNCATE the table at the start of every run; it's a faster way to jettison large amounts of data. You can issue a TRUNCATE right now, or drop the table and recreate it if you like, to solve your immediate problem of needing to remove 500M records.
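For example (bear in mind TRUNCATE discards every row, the current batch included, so it only fits if the next run repopulates the table):

TRUNCATE TABLE the_enormous_table;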
Upvotes: 1