Reputation: 1477
I would like to know if there is any option to improve and optimize my query. I am importing schedule data for our staff and need to delete the old data, when it is existing for this agent and day. The reason is, that agents may don't exist for everyday (because they left the company) and it could be, that we upload a more updated report as we did before (recent schedule changes).
That's why I currently have this query:
DELETE FROM `agents` WHERE
(`id` = 1 AND `date` => '01.01.2015 00:00:00' AND `date` <= '01.01.2015 23:59:59') OR
(`id` = 2 AND `date` => '01.01.2015 00:00:00' AND `date` <= '01.01.2015 23:59:59') OR [...]
This combination is for each agent of the report and each day in the report. I uploaded one which created 5780 day/agent combinations. This query took on my (currently) small table about 5 minutes to be executed.
I am wondering if anyone has an idea how I could improve this thing.
Upvotes: 0
Views: 141
Reputation: 24970
If you had a table like this:
create table t123
( id int not null,
date datetime not null,
myThing varchar(10) not null
);
And you later added an index like this:
ALTER TABLE t123 ADD INDEX (id,date); -- add an index after the fact
Then a delete from like yours on table t123
would perform as fast as I could imagine. However it would need to maintain the index along the way, and it is baggage to consider.
All index changes need to be carefully weighed. The benefit of quicker access, at the expense of slowing down inserts/updates/deletes.
Manual pages of Fast Engine Creation and Alter Table
Upvotes: 1
Reputation: 1271151
What you want to do is going to be rather difficult. As written, it probably requires a full table scan.
One method would be to add an index on agents(id, date)
and to do the deletes separately:
DELETE FROM `agents`
WHERE (`id` = 1 AND date >= '2015-01-01' AND `date` < '2015-01-02');
DELETE FROM `agents`
WHERE (`id` = 2 AND date >= '2015-01-01' AND `date` < '2015-01-02')
Assuming the dates are all the same, you can write the where
clause as:
DELETE FROM `agents`
WHERE `id` IN (1, 2, 3, . . . ) AND
`date` >= '2015-01-01' AND `date` < '2015-01-02';
Depending on the distribution of the data (number of dates per id in the range) weight, either the above index or one on agents(date, id)
would be best.
Upvotes: 2