rahularyansharma
rahularyansharma

Reputation: 10755

MYSQL delete from large table increase the delete time for each transaction

I am trying to delete data from table which contains almost 6,000,000,000 records , with where clause.

here is the stored procedure I am using and running from command prompt MySQL in windows.

DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `clean_table`( )
BEGIN   
    REPEAT
        DO SLEEP(1);
        DELETE FROM tablename
        WHERE exportInfoId<=8479  limit 100000;   
        SELECT ROW_COUNT();
    UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;

Its deleting the data but it continue incasing the time for each delete transaction. Why its keep increasing , even when with delete the earlier transaction reduce the data in table ? Is there any way to make it same time for each transaction ? I am already using sleep as suggested in some other answers .

enter image description here

Upvotes: 1

Views: 1204

Answers (2)

CodeTheorist
CodeTheorist

Reputation: 1672

You need to add an index to the column with which you are using to find the record(s) to be deleted.

With an index, MySQL knows exactly where the records are to be found so it can go straight to the record(s).

Without an index, then the table must be searched row by row.

The difference is, without an index the deletes are performed in the order of the primary key however, with an index the records will be searched in the order of that particular column.

To add an index, do what @o-jones points out below in the comments, which is:

// Normal Index
ALTER TABLE ADD INDEX exportInfoId (exportInfoId);

// Reverse Index
ALTER TABLE ADD INDEX exportInfoId (exportInfoId DESC);

Adding an index to the column is the correct answer here however, there may be other answers that work for you, depending on your use-case.

Upvotes: 2

Rick James
Rick James

Reputation: 142258

Sure, adding an index would help a lot. ("DESC" is optional.)

Sure it will take time to add an index to 6B rows. Newer versions of MySQL can do it "instantly", but that just means it is happening in the background and is not available for a long time.

The reason for the slowdown is that the DELETEs are skipping over more and more rows that are not to be deleted. The INDEX would avoid that. (As Akina described.)

A note about DELETE (and UPDATE): It is slow because it keeps the deleted rows in case of a ROLLBACK (which would happen after a power failure).

There are other techniques. See http://mysql.rjweb.org/doc.php/deletebig . The main one that comes to mind for your task is to walk through the table 1K rows at a time based on the PRIMARY KEY. (Note that this avoids repeatedly skipping over rejected rows.)

If "most" of the table is to be deleted, copying the table over is the best way. (See that link.) Note also that the table will not shrink after DELETEs; the freed up space will be reused, to some extent. (As Bill Commented on.)

If you are deleting "old" data such that PARTITIONing would work, then DROP PARTITION is virtually instantaneous on all versions (since 5.1?) of MySQL. (That is also mentioned in my link.) (Partitioning is usually used for "time-series"; does exportInfoId work like a "time"?)

To Alex, I say "there may be other answers".

As for 100K vs 5K rows; I like 1K because above that, you get into "diminishing returns". And at 100K you might hit various buffer size settings that actually cause 100K to run slower.

Upvotes: 1

Related Questions