Reputation: 15
Each week a cronjob fires that deletes about 500,000 records from a MySQL table that uses the InnoDB engine. The table holds web log data for weekly processing; afterwards, data that is no longer needed is deleted. There are 3 indexes on this table (no foreign keys), including one on time, which is a Unix timestamp.
DELETE FROM global_viewlog WHERE time<1354391592
The problem is that when this query ran last week it took over 2 hours, and during that time my entire server was lagging. iotop revealed that MySQL was writing to the hard disks quite heavily, and the server load climbed dramatically to unhealthy levels.
The table gets written to on every click. Is there an elephant in the room that I'm not seeing? I.e., something blindingly obvious that would optimize this weekly query?
Thanks
Upvotes: 0
Views: 2644
Reputation: 108641
You're using InnoDB, so there's an implicit transaction. You're deleting a whole mess of records, so the transaction is big. @Erik Ekman is right: if you can get partitioning to work, it's a good way to handle this.
But here's another good way. Try doing your delete in smaller batches, like so.
DELETE
FROM global_viewlog
WHERE time<1354391592
LIMIT 1000
Keep issuing this statement until nothing is left to delete. Most client libraries (JDBC, etc.) return the number of records affected, or you can just issue the statement a thousand times from a script if you want to do this the cheap way. (You can fiddle with the number of records in the LIMIT statement; a sketch of the loop follows below.) It may or may not take more elapsed time than your single delete statement, but it won't tie up your server as much.
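If you want to keep the loop inside MySQL itself, here is a minimal sketch as a stored procedure; the procedure name purge_viewlog and the batch size of 1000 are my own illustrations, not part of the original answer:
DELIMITER //
CREATE PROCEDURE purge_viewlog(IN cutoff INT)
BEGIN
  REPEAT
    -- Delete one small batch; ROW_COUNT() reports the rows the last DELETE touched
    DELETE FROM global_viewlog WHERE time < cutoff LIMIT 1000;
  UNTIL ROW_COUNT() = 0 END REPEAT;
END//
DELIMITER ;
CALL purge_viewlog(1354391592);
With autocommit on, each batch commits on its own, so the undo log stays small and other sessions get a chance to run between batches.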
Try changing this table's storage engine to MyISAM and using this:
DELETE LOW_PRIORITY QUICK
FROM global_viewlog
WHERE time<1354391592
This will keep your cleanup operation from interfering with production.
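For reference, the engine switch itself is a single statement; bear in mind it rebuilds the table and locks it for the duration, so run it in a quiet window:
ALTER TABLE global_viewlog ENGINE=MyISAM;
Note that LOW_PRIORITY and QUICK only have an effect on table-locking engines like MyISAM, which is why the engine change has to come first.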
Change your cronjob's frequency from once a week to once a day. This will reduce the size of your delete batches so you don't have such a whopping mess to clean up.
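As a sketch, the nightly job could look like this in crontab; the 03:00 schedule, the database name logdb, and the purge_viewlog procedure from the sketch above are all assumptions:
# Hypothetical crontab entry: every night at 03:00, purge rows older than 7 days
0 3 * * * mysql -e "CALL purge_viewlog(UNIX_TIMESTAMP(NOW() - INTERVAL 7 DAY))" logdb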
Upvotes: 2
Reputation: 2066
If you regularly delete old data based on time, partitioning the table on that same column will make it much faster. It separates different time periods into different areas on disk, so a delete becomes dropping an entire partition instead of examining individual rows.
http://dev.mysql.com/doc/refman/5.5/en/partitioning.html
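A minimal sketch of what that could look like, assuming the table can be range-partitioned on time (MySQL requires the partitioning column to be part of every unique key, so the primary key would have to include time); the partition names and boundary timestamps are illustrative:
ALTER TABLE global_viewlog
PARTITION BY RANGE (time) (
    PARTITION p_w1 VALUES LESS THAN (1354320000),
    PARTITION p_w2 VALUES LESS THAN (1354924800),
    PARTITION p_max VALUES LESS THAN MAXVALUE
);
-- Dropping a week of old data is then a near-instant metadata operation
ALTER TABLE global_viewlog DROP PARTITION p_w1;
Bear in mind the initial ALTER rebuilds the whole table, so it is itself an expensive one-time operation, and you would need to add new partitions as weeks roll forward (for example by splitting p_max with REORGANIZE PARTITION).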
Upvotes: 5