CodeShaman
CodeShaman

Reputation: 2189

MySQL stored procedure behaving unpredictably

Writing a stored procedure to parse names in format 'last first mid' and 'last first mid suffix' into a table.


CREATE PROCEDURE parse_full_name(full_name TEXT)
  BEGIN
    SET @last_name = SUBSTRING_INDEX(full_name, ' ', 1);
    SET @middle_name = SUBSTRING_INDEX(full_name, ' ', -1);
    SET @first_name = SUBSTR(full_name, LOCATE(' ', full_name), LENGTH(full_name) - LENGTH(@middle_name) - LENGTH(@last_name));

    SET @query = CONCAT('SELECT ''', @last_name,''' as last_name, ''', @first_name,''' as first_name, ''', @middle_name,''' as middle_name;');
    PREPARE stmt FROM @query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END;

CALL parse_full_name('last first middle');

>>>  last_name | first_name | middle_name
>>> -----------|------------|-------------
>>>  last      | first      | middle

CALL parse_full_name('last first middle suffix');

>>>  last_name | first_name   | middle_name
>>> -----------|--------------|-------------
>>>  last      | first middle | suffix

Does exactly what's expected. Now let's handle the suffix case:

CREATE PROCEDURE parse_full_name(full_name TEXT)
  BEGIN # assume full_name is 'last first middle' until proven not
    SET @last_name = SUBSTRING_INDEX(full_name, ' ', 1);
    SET @middle_name = SUBSTRING_INDEX(full_name, ' ', -1);
    SET @first_name = SUBSTR(full_name, LOCATE(' ', full_name), LENGTH(full_name) - LENGTH(@middle_name) - LENGTH(@last_name));

    # check if full_name is actually 'last first mid suffix'
    # if so then first_name will contain 'first middle'
    IF LOCATE(' ', @first_name) THEN # this if statement always evaluates to true
      SET @last_name = CONCAT(@last_name, ' ', @middle_name);
      SET @middle_name = SUBSTRING_INDEX(@first_name, ' ', -1); # mysql seems to skip this line of code
      SET @first_name = SUBSTRING_INDEX(@first_name, ' ', 1);   # this line too
    END IF;

    SET @query = CONCAT('SELECT ''', @last_name,''' as last_name, ''', @first_name,''' as first_name, ''', @middle_name,''' as middle_name;');
    PREPARE stmt FROM @query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END;

Results:

CALL parse_full_name('last first middle jr');

Expecting:

>>>  last_name | first_name | middle_name
>>> -----------|------------|-------------
>>>  last jr   | first      | middle

But get:

>>>  last_name | first_name | middle_name
>>> -----------|------------|-------------
>>>  last jr   |            |

and even better:

CALL parse_full_name('last first middle');

>>>  last_name   | first_name | middle_name
>>> -------------|------------|-------------
>>>  last middle |            |

So, where's the problem? I'm not a MySQL pro, but everything looks logically and intuitively correct.

Upvotes: 0

Views: 38

Answers (2)

CodeShaman
CodeShaman

Reputation: 2189

Rookie mistake. Just had to trim the strings.

SET @len_first_name = LENGTH(full_name) - LENGTH(@middle_name) - LENGTH(@last_name));
SET @first_name = SUBSTR(full_name, LOCATE(' ', full_name), @len_first_name);

@len_first_name grabs the space between first and middle, so @first_name ends up returning ' first middle', which would explain why LOCATE(' ', @first_name) was always returning true.


SET @last_name = TRIM(SUBSTRING_INDEX(full_name, ' ', 1));
SET @middle_name = TRIM(SUBSTRING_INDEX(full_name, ' ', -1));
SET @len_first_name = LENGTH(full_name) - LENGTH(@middle_name) - LENGTH(@last_name);
SET @first_name = TRIM(SUBSTR(full_name, LOCATE(' ', full_name), @len_first_name));

and it's working.

Upvotes: 1

Rick James
Rick James

Reputation: 142296

The task is unsolvable.

  • "Mary Jane" is a common first_name -- not first and middle.
  • A lot of Brits have a 2-word last names.
  • I have seen "Dear Mr. Iii" and "Dear Mr. Smith Iii". Technically "III" should be all caps, and preceded by a comma.
  • Will you mess up the case? (McDonald -> Mcdonald?) or apostrophe (O'Henry).
  • Some people have more than one middle_name.

You are setting yourself up to irritate people with names that don't follow the simplistic parsing rules.

Why do you need to parse the name? Can't you simply store and regurgitate the full_name?

Upvotes: 0

Related Questions