Michal M


Calling stored procedure sequentially from .sql file

I'm stuck here.

I've got a procedure that I want to run X times in a row (X is a couple of thousand).
Based on the input data, the procedure does this:
1. Looks for an actions.id; if not found, LEAVEs.
2. Looks for a users.id; if not found, creates one and uses LAST_INSERT_ID().
3-5. Looks for a summaries.id (3 types: total, daily and monthly); if not found, creates one and uses its ID.
6. Once all required ids are collected, INSERTs a new row into users_actions and updates the summary rows in a transaction, so if any statement fails it does a ROLLBACK and no harm is done.
7. Depending on the outcome, SELECTs a message.

CREATE PROCEDURE NEW_ACTION(
  IN a_date TIMESTAMP,
  IN u_name VARCHAR(255),
  IN a_name VARCHAR(255),
  IN a_chars INT,
  IN url VARCHAR(255),
  IN ip VARCHAR(15))

  lbl_proc: BEGIN
    DECLARE a_id, u_id, us_id, usd_id, usm_id, a_day, a_month, error INT;
    DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' SET error = 1;

    SET error = 0;
    SET a_day = DATE_FORMAT(SUBSTRING(a_date ,1,10), '%Y%m%d');
    SET a_month = SUBSTRING(a_day, 1, 6);

    /* 1. RETRIEVING actions.id */
    SET a_id = (SELECT `id` FROM `actions` WHERE `name` = a_name);
    IF a_id IS NULL THEN
      SELECT 'error';
      LEAVE lbl_proc;
    END IF;

    /* 2. RETRIEVING users.id */
    SET u_id = (SELECT `id` FROM `users` WHERE `name` = u_name);
    IF u_id IS NULL THEN
      INSERT INTO `users` (name) VALUES (u_name);
      SET u_id = (SELECT LAST_INSERT_ID());
    END IF;

    /* 3. RETRIEVING users_summaries.id */
    SET us_id = (SELECT `id` FROM `users_summaries` WHERE `user_id` = u_id AND `action_id` = a_id);
    IF us_id IS NULL THEN
      INSERT INTO `users_summaries` (user_id, action_id) VALUES (u_id, a_id);
      SET us_id = (SELECT LAST_INSERT_ID());
    END IF;

    /* 4. RETRIEVING users_summaries_days.id */
    SET usd_id = (SELECT `id` FROM `users_summaries_days` WHERE `day` = a_day AND `user_id` = u_id AND `action_id` = a_id);
    IF usd_id IS NULL THEN
      INSERT INTO `users_summaries_days` (day, user_id, action_id) VALUES (a_day, u_id, a_id);
      SET usd_id = (SELECT LAST_INSERT_ID());
    END IF;

    /* 5. RETRIEVING users_summaries_months.id */
    SET usm_id = (SELECT `id` FROM `users_summaries_months` WHERE `month` = a_month AND `user_id` = u_id AND `action_id` = a_id);
    IF usm_id IS NULL THEN
      INSERT INTO `users_summaries_months` (month, user_id, action_id) VALUES (a_month, u_id, a_id);
      SET usm_id = (SELECT LAST_INSERT_ID());
    END IF;

    /* 6. SAVING action AND UPDATING summaries */
    SET autocommit = 0;
    START TRANSACTION;
      INSERT INTO `users_actions` (`date`, `user_id`, `action_id`, `chars`, `url`, `ip`) VALUES (a_date, u_id, a_id, a_chars, url, ip);
      UPDATE `users_summaries` SET qty = qty + 1, chars = chars + a_chars WHERE id = us_id;
      UPDATE `users_summaries_days` SET qty = qty + 1, chars = chars + a_chars WHERE id = usd_id;
      UPDATE `users_summaries_months` SET qty = qty + 1, chars = chars + a_chars WHERE id = usm_id;

      IF error = 1 THEN
        SELECT 'error';
        ROLLBACK;
        LEAVE lbl_proc;
      ELSE
        SELECT 'success';
        COMMIT;
      END IF;
  END;

Now, I've got raw data that I want to feed into this procedure. There are currently about 3000 rows.

I tried all the solutions I knew:

A. # mysql -uuser -ppass DB < calls.sql - using PHP, I generated a list of calls like this:

CALL NEW_ACTION('2010-11-01 13:23:00', 'username1', 'actionname1', '100', 'http://example.com/', '0.0.0.0');  
CALL NEW_ACTION('2010-11-01 13:23:00', 'username2', 'actionname1', '100', 'http://example.com/', '0.0.0.0');  
CALL NEW_ACTION('2010-11-01 13:23:00', 'username1', 'actionname2', '100', 'http://example.com/', '0.0.0.0');  
...

This always fails (I tried a few times) at row 452, where it found two summary IDs (step 3).
I thought this could be because earlier (rows 375-376) there are calls for the same user and the same action.
It's as if MySQL didn't update the tables in time, so the summary row created by the CALL on line 375 isn't yet visible when line 376 gets executed, and another summary row gets created.
I thought I'd try delaying the calls...

B. Using MySQL's SLEEP(duration).
This didn't change anything. Execution stops at the very same CALL again.

I'm out of ideas now.
Suggestions and help hugely appreciated.

NOTE: action names and user names repeat.

PS. Bear in mind this is one of my first procedures ever written.
PS2. Running MySQL 5.1.52-community-log 64-bit (Windows 7U), PHP 5.3.2 and Apache 2.2.17.


EDIT

I've moved the PHP-related part of this question into a separate question.


EDIT2

OK, I've deleted the first 200 calls from the .sql file. For some reason it got past the line that was previously stopping execution. Now it stopped at row 1618.
This would mean that at some point a newly INSERTed summary row is not visible for a moment, so when one of the following iterations wants to SELECT it, it's not yet accessible. Is that a MySQL bug?


EDIT3

Now there's another interesting thing I noticed. I investigated where two users_summaries rows get created. This happens (not always, but whenever it does, this is the case) when there are two CALLs referring to the same user and action in close proximity. They could be next to each other or separated by 1 or 2 different calls.

If I move one of them (within the .sql file) some 50-100 rows lower (executed earlier), then it's fine. I even managed to make the .sql file work as a whole. But this still doesn't really solve the problem. With 3000 rows it's not that bad, but if I had 100000, I'd be lost. I can't rely on manual tweaks to the .sql file.
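
For anyone wanting to check the same thing, a query along these lines should list the duplicated pairs after a failed run. It is only a sketch: the table and column names come from the procedure above, while the aliases (copies, first_id, last_id) are made up for illustration.

```sql
-- List every (user_id, action_id) pair that has more than one summary row.
-- Aliases are illustrative; table and column names are taken from the procedure.
SELECT `user_id`,
       `action_id`,
       COUNT(*)  AS copies,
       MIN(`id`) AS first_id,
       MAX(`id`) AS last_id
FROM `users_summaries`
GROUP BY `user_id`, `action_id`
HAVING COUNT(*) > 1;
```

Any row returned here would make the scalar subquery in step 3 return more than one row for that user and action.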


Answers (1)

Michal M


This isn't really a solution, but a workaround.

Just to clarify, the summary tables had an id column as PRIMARY KEY with the AUTO_INCREMENT option, and indexes on both the user_id and action_id columns.

My investigation showed that although my procedure was looking for an existing entry using WHERE user_id = u_id AND action_id = a_id, in certain situations it didn't find it, which caused a new row to be inserted with the same user_id and action_id values - something I did not want.

Debugging the procedure showed that the summary row I was looking for, although not accessible with the WHERE user_id = u_id AND action_id = a_id condition, was properly returned when selecting it by its id (the PRIMARY KEY).
With this finding I decided to change the format of the id column from UNSIGNED INT with AUTO_INCREMENT to a CHAR(32) consisting of:

<user_id>|<action_id>

This meant that I knew exactly what the id of the row I wanted was, even before it existed. This really solved the problem. It also enabled me to use the INSERT ... ON DUPLICATE KEY UPDATE ... construct.
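
As a side note, ON DUPLICATE KEY UPDATE is triggered by any UNIQUE index, not only the PRIMARY KEY, so a similar upsert might also be possible while keeping the AUTO_INCREMENT id. The following is a rough sketch rather than what I actually ran: the index name uq_user_action and the literal values are hypothetical, and it assumes any existing duplicate (user_id, action_id) rows have been merged first, otherwise the ALTER TABLE will fail.

```sql
-- Hypothetical index name; assumes no duplicate (user_id, action_id) pairs remain.
ALTER TABLE `users_summaries`
  ADD UNIQUE KEY `uq_user_action` (`user_id`, `action_id`);

-- Example values only: creates the summary row on first use, increments it afterwards.
INSERT INTO `users_summaries` (`user_id`, `action_id`, `qty`, `chars`)
  VALUES (1, 2, 1, 100)
  ON DUPLICATE KEY UPDATE qty = qty + 1, chars = chars + VALUES(chars);
```

With the unique key in place the database itself rejects a second row for the same pair, so the lookup-then-insert steps would no longer be needed for the summary tables.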

Below is my updated procedure:

CREATE PROCEDURE `NEW_ACTION`(
  IN a_date TIMESTAMP,
  IN u_name VARCHAR(255),
  IN a_name VARCHAR(255),
  IN a_chars INT,
  IN url VARCHAR(255),
  IN ip VARCHAR(15))
  SQL SECURITY INVOKER

lbl_proc: BEGIN
    DECLARE a_id, u_id, a_day, a_month, error INT;
    DECLARE us_id, usd_id, usm_id CHAR(48);
    DECLARE sep CHAR(1);
    DECLARE CONTINUE HANDLER FOR SQLSTATE '23000' SET error = 1;

    SET sep = '|';
    SET error = 0;
    SET a_day = DATE_FORMAT(SUBSTRING(a_date ,1,10), '%Y%m%d');
    SET a_month = SUBSTRING(a_day, 1, 6);

    /* RETRIEVING game_actions.id */
    SET a_id = (SELECT `id` FROM `game_actions` WHERE `name` = a_name);
    IF a_id IS NULL THEN
      SELECT 'error';
      LEAVE lbl_proc;
    END IF;

    /* RETRIEVING game_users.id */
    SET u_id = (SELECT `id` FROM `game_users` WHERE `name` = u_name);
    IF u_id IS NULL THEN
      INSERT INTO `game_users` (name) VALUES (u_name);
      SET u_id = LAST_INSERT_ID();
    END IF;

    /* SETTING summaries ids */
    SET us_id = CONCAT(u_id, sep, a_id);
    SET usd_id = CONCAT(a_day, sep, u_id, sep, a_id);
    SET usm_id = CONCAT(a_month, sep, u_id, sep, a_id);

    /* SAVING action AND UPDATING summaries */
    SET autocommit = 0;
    START TRANSACTION;
      INSERT INTO `game_users_actions` (`date`, `user_id`, `action_id`, `chars`, `url`, `ip`)
        VALUES (a_date, u_id, a_id, a_chars, url, ip);
      INSERT INTO `game_users_summaries` (`id`, `user_id`, `action_id`, `qty`, `chars`)
        VALUES (us_id, u_id, a_id, 1, a_chars)
        ON DUPLICATE KEY UPDATE qty = qty + 1, chars = chars + a_chars;
      INSERT INTO `game_users_summaries_days` (`id`, `day`, `user_id`, `action_id`, `qty`, `chars`)
        VALUES (usd_id, a_day, u_id, a_id, 1, a_chars)
        ON DUPLICATE KEY UPDATE qty = qty + 1, chars = chars + a_chars;
      INSERT INTO `game_users_summaries_months` (`id`, `month`, `user_id`, `action_id`, `qty`, `chars`)
        VALUES (usm_id, a_month, u_id, a_id, 1, a_chars)
        ON DUPLICATE KEY UPDATE qty = qty + 1, chars = chars + a_chars;   

      IF error = 1 THEN
        SELECT 'error';
        ROLLBACK;
        LEAVE lbl_proc;
      ELSE
        SELECT 'success';
        COMMIT;
      END IF;
  END

Anyway, I still think there's some kind of bug in MySQL, but I consider the problem solved.

