Reputation: 4854
I was wondering if anyone could help me with parsing a full name field. I would like to separate it into lastname, firstname, middle initial, suffix.
Here are some inputs for name followed by how I would like for them to be parsed.
Parsed Stuff Begins Here-------------------------------------
name | lastname | firstname | middle initial | suffix |
----------------------------------------------------------------------------------------
PUBLIC, JOHN | PUBLIC | JOHN | NULL | NULL
PUBLIC, CHUN CH KIM | PUBLIC | CHUN CH KIM | NULL | NULL
PUBLIC, MARY L | PUBLIC | MARY | L | NULL
PUBLIC, FRED J JR | PUBLIC | FRED | J | JR
PUBLIC, SUE ELLEN J SR | PUBLIC | SUE ELLEN | J | SR
I have a list of all the suffix values that one is able to enter, i.e.
JR, SR, I,II,III,IV,V,VI
I've gotten to a point where I split up the lastname and the rest of the name, but I can't quite figure out how to do the rest. I'm using oracle 10g.
This is not a homework question. It's an actual problem I'm working on at work.
Here's what I currently have:
select id,
name,
substr(name,1, instr(name,',')-1) as lname,
substr(name,(instr(name,',')+1),length(name)) as rest_of_the_name
from my_table
where status='A';
Upvotes: 0
Views: 5621
Reputation: 421
This post contained a pretty complete SQL-only solution, written in SQLServer syntax. I've converted it to Oracle syntax. It does a pretty good job, though it doesn't handle suffixes (Jr., III, etc). You should read the other caveats in the post:
SQL: parse the first, middle and last name from a fullname field
SELECT first_name.original_input_data,
first_name.title,
first_name.first_name,
CASE
WHEN 0 = INSTR(first_name.rest_of_name, ' ') THEN
NULL --no more spaces? assume rest is the last name
ELSE
SUBSTR(first_name.rest_of_name, 1, INSTR(first_name.rest_of_name, ' ') - 1)
END AS middle_name,
SUBSTR(first_name.rest_of_name,
1 + INSTR(first_name.rest_of_name, ' '),
LENGTH(first_name.rest_of_name)) AS last_name
FROM (SELECT title.title,
CASE
WHEN 0 = INSTR(title.rest_of_name, ' ') THEN
title.rest_of_name --No space? return the whole thing
ELSE
SUBSTR(title.rest_of_name, 1, INSTR(title.rest_of_name, ' ') - 1)
END AS first_name,
CASE
WHEN 0 = INSTR(title.rest_of_name, ' ') THEN
NULL --no spaces @ all? then 1st name is all we have
ELSE
SUBSTR(title.rest_of_name, INSTR(title.rest_of_name, ' ') + 1, LENGTH(title.rest_of_name))
END AS rest_of_name,
title.original_input_data
FROM (SELECT
--if the first three characters are in this list,
--then pull it as a "title". otherwise return NULL for title.
CASE
WHEN SUBSTR(test_data.full_name, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN
LTRIM(RTRIM(SUBSTR(test_data.full_name, 1, 3)))
ELSE
NULL
END AS title
--if you change the list, don't forget to change it here, too.
--so much for the DRY prinicple...
,
CASE
WHEN SUBSTR(test_data.full_name, 1, 3) IN ('MR ', 'MS ', 'DR ', 'MRS') THEN
LTRIM(RTRIM(SUBSTR(test_data.full_name, 4, LENGTH(test_data.full_name))))
ELSE
LTRIM(RTRIM(test_data.full_name))
END AS rest_of_name,
test_data.original_input_data
FROM (SELECT
--trim leading & trailing spaces before trying to process
--disallow extra spaces *within* the name
REPLACE(REPLACE(LTRIM(RTRIM(full_name)), ' ', ' '), ' ', ' ') AS full_name,
full_name AS original_input_data
FROM ( --if you use this, then replace the following
--block with your actual table
SELECT 'george w bush jr.' AS full_name
FROM dual
UNION
SELECT 'SUSAN B ANTHONY' AS full_name
FROM dual
UNION
SELECT 'alexander hamilton' AS full_name
FROM dual
UNION
SELECT 'osama bin laden jr' AS full_name
FROM dual
UNION
SELECT 'Martin J. VAN BUREN SENIOR III' AS full_name
FROM dual
UNION
SELECT 'Tommy' AS full_name
FROM dual
UNION
SELECT 'Billy' AS full_name
FROM dual
UNION
SELECT NULL AS full_name
FROM dual
UNION
SELECT ' ' AS full_name
FROM dual
UNION
SELECT ' JOHN JACOB SMITH' AS full_name
FROM dual
UNION
SELECT ' DR SANJAY GUPTA' AS full_name
FROM dual
UNION
SELECT 'DR JOHN S HOPKINS' AS full_name
FROM dual
UNION
SELECT ' MRS SUSAN ADAMS' AS full_name
FROM dual
UNION
SELECT ' MS AUGUSTA ADA SMITH-KING ' AS full_name
FROM dual) raw_data) test_data) title) first_name
Upvotes: 0
Reputation: 1
Select SUBSTR(name, INSTR(name, ' ')+1) AS lastname, SUBSTR(name, 1, INSTR(name, ' ')-1) AS firstname from whatever;
Upvotes: -2
Reputation: 1768
Here's an unsophisticated answer, based on the retrieving the first and last name, retrieving the MI the same way as the first name, removing the MI from 'rest_of_the_name' as the last name.
SELECT
substr('John Q. Public',1, instr('John Q. Public',' ')-1) as FirstName,
substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')) as rest_of_the_name,
substr(substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')),1, instr(substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')),' ')-1) as MI,
replace(substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')), substr(substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')),1, instr(substr('John Q. Public',(instr('John Q. Public',' ')+1),length('John Q. Public')),' ')-1)) as LastName
FROM DUAL;
Upvotes: 0
Reputation: 60292
You've partially solved it already - you can use your query as a subquery and break the problem down bit by bit, e.g.:
select id, name, lname,
case
when substr(x, -2, 1) = ' '
then substr(x, length(x) - 2)
else x
end as first_name, -- e.g. "SUE ELLEN"
case
when substr(x, -2, 1) = ' '
then substr(x, -1)
else null
end as middle_initial, -- e.g. "J"
suffix -- e.g. "SR"
from (
select id, name, lname, suffix,
case when suffix is not null then
substr(rest_of_the_name, 1, length(rest_of_the_name)-length(suffix)-1)
else rest_of_the_name end
as x -- e.g. "SUE ELLEN J"
from (
select id, name, lname, rest_of_the_name,
case
when substr(rest_of_the_name,-2)
in (' I',' V')
then substr(rest_of_the_name,-1)
when substr(rest_of_the_name,-3)
in (' JR',' SR',' II',' IV',' VI')
then substr(rest_of_the_name,-2)
when substr(rest_of_the_name,-4)
in (' III')
then substr(rest_of_the_name,-3)
else null
end as suffix -- e.g. "SR"
from (
select id,
name, --e.g. "PUBLIC, SUE ELLEN J SR"
trim(substr(name,1, instr(name,',')-1)) as lname, -- e.g. "PUBLIC"
trim(substr(name,(instr(name,',')+1),length(name)))
as rest_of_the_name -- e.g. "SUE ELLEN J SR"
from my_table
where status='A'
)));
Upvotes: 1
Reputation: 8459
This is a problem for which there will always be data that breaks it.
What if there are 2 initials? What if the initials are first, as J Edgar Hoover?
You mention values "that one is able to enter." Can you change the way the values are entered to capture them already separated?
Upvotes: 2