usumoio

Reputation: 3568

MySQL - Optimize Orphan Record Grooming

Here is my problem: I have written a stored procedure for the following task. The events table may contain events for venues that no longer exist. Not all events are tied to a venue; those that are have an integer in their venue id field, and the rest have NULL (or possibly zero, which is accounted for). Venues are periodically deleted from our system, and when that happens it is not possible to delete all of the events associated with that venue at the same time. Instead, a task runs later and deletes every event whose venue id no longer references an existing record in the venues table. I have written a stored procedure for this and it seems to work.

This is the stored procedure:

DROP PROCEDURE IF EXISTS delete_synced_events_orphans;
DELIMITER $$
CREATE PROCEDURE delete_synced_events_orphans()
BEGIN 

    DECLARE event_count int(11) DEFAULT 0;
    DECLARE active_event_id int(11) DEFAULT 0;
    DECLARE active_venue_id int(11) DEFAULT 0;
    DECLARE event_to_delete_id int(11) DEFAULT NULL;

    CREATE TEMPORARY TABLE IF NOT EXISTS possible_events_to_delete (
        event_id int(11) NOT NULL,
        venue_id_temp int(11) NOT NULL
    ) engine = memory;

    # create an "array" which is a table that holds the events that might need deleting
    INSERT INTO possible_events_to_delete (event_id, venue_id_temp) SELECT `events`.`id`, `events`.`venue_id` FROM `events` WHERE `events`.`venue_id` IS NOT NULL AND `events`.`venue_id` <> 0;
    SELECT COUNT(*) INTO `event_count` FROM `possible_events_to_delete` WHERE 1;

    detector_loop: WHILE `event_count` > 0 DO
        SELECT event_id INTO active_event_id FROM possible_events_to_delete WHERE 1 LIMIT 1;
        SELECT venue_id_temp INTO active_venue_id FROM possible_events_to_delete WHERE 1 LIMIT 1;

        # this figures out if there are events that need to be deleted
        SELECT `events`.`id` INTO event_to_delete_id FROM `events`, `venues` WHERE `events`.`venue_id` <> `venues`.`id` AND `events`.`id` = active_event_id AND `events`.`venue_id` = active_venue_id;

        #if no record meets that query, the active event is safe to delete
        IF (event_to_delete_id <> 0 AND event_to_delete_id IS NOT NULL) THEN
            DELETE FROM `events` WHERE `events`.`id` = event_to_delete_id;
            #INSERT INTO test_table (event_id_test, venue_id_temp_test) SELECT `events`.`id`, `events`.`venue_id` FROM `events` WHERE `events`.`id` =  event_to_delete_id;
        END IF;

        DELETE FROM possible_events_to_delete WHERE `event_id` = active_event_id AND `venue_id_temp` = active_venue_id;
        SET `event_count` = `event_count` - 1;

    END WHILE;

END $$
DELIMITER ;

Here is the table structure for the two tables in question:

CREATE TABLE IF NOT EXISTS events (
    id int(11) NOT NULL,
    event_time timestamp NOT NULL,
    venue_id int(11) DEFAULT NULL
);

CREATE TABLE IF NOT EXISTS venues (
    id int(11) NOT NULL
);

The stored procedure works as written, but I want to know how it could be made to run better. It seems like it's doing a lot of extra processing to achieve its goal. Are there better ways to query the data, or other commands and keywords I don't know about that would let me complete this task with fewer lines and less computation? I am still learning how to use stored procedures, so I am using them to complete tasks as pragmatically as possible. I want to understand how this specific query could make better use of the full range of MySQL's features. Thank you, folks.

Upvotes: 0

Views: 1266

Answers (1)

Olexa

Reputation: 587

Everything is much simpler:

DROP PROCEDURE IF EXISTS delete_synced_events_orphans;
DELIMITER $$
CREATE PROCEDURE delete_synced_events_orphans()
BEGIN 

  DELETE
    FROM `events`
    WHERE `venue_id` IS NOT NULL AND `venue_id` <> 0
      AND `venue_id` NOT IN (SELECT `id` FROM `venues`)
  ;

END $$
DELIMITER ;

That's it. :)
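Before running the delete for the first time, it may be worth previewing which rows it would remove. A minimal sketch, assuming the same `events`.`venue_id` and `venues`.`id` columns that the procedure in the question uses; it reuses the WHERE clause of the DELETE above inside a SELECT:

```sql
-- Preview only: lists the events the DELETE above would remove.
-- Assumes events.venue_id and venues.id, as referenced in the question's procedure.
SELECT e.`id`, e.`venue_id`
  FROM `events` AS e
  WHERE e.`venue_id` IS NOT NULL AND e.`venue_id` <> 0
    AND e.`venue_id` NOT IN (SELECT `id` FROM `venues`)
;
```

If the list looks right, the DELETE with the same conditions should affect exactly those rows.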

You are thinking imperatively, trying to tell MySQL how to complete your task. But SQL is a declarative language: you describe what should be done, and the server works out how to do it.
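The same declarative idea can also be written as an anti-join. This is only a sketch of an alternative, not something taken from the question; it again assumes the `events`.`venue_id` and `venues`.`id` columns. One difference worth knowing: `NOT IN` matches nothing if the subquery ever returns a NULL, while the `LEFT JOIN ... IS NULL` form is not affected by that (it should not matter if `venues`.`id` is a primary key).

```sql
-- Alternative sketch: delete events whose venue_id has no matching venue.
-- Uses MySQL's multi-table DELETE syntax with a LEFT JOIN anti-join.
DELETE e
  FROM `events` AS e
  LEFT JOIN `venues` AS v ON v.`id` = e.`venue_id`
  WHERE e.`venue_id` IS NOT NULL
    AND e.`venue_id` <> 0
    AND v.`id` IS NULL   -- no venue row matched, so the event is an orphan
;
```

Either form benefits from an index on `events`.`venue_id`, assuming one does not already exist.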

Upvotes: 2
