Reputation: 65
I'm required to split names to insert into the database. Some sample strings are given below.
Covey, Stephen J, Mr
Clinton, Hilary B,
Obama, Barack, Mr
I need to split the above string like follows.
First_Name Middle_Initial Last_NAME Title
Stephen J Covey Mr
Hilary B Clinton
Barack Obama Mr
I can achieve the above using java code. Is is possible to split those strings directly in a sql select.?
Please note that, Few names might not have title or middle initial. I need the output as given above.
Upvotes: 0
Views: 179
Reputation: 167867
Oracle 11g R2 Schema Setup:
CREATE TABLE Names ( Name ) AS
SELECT 'Covey, Stephen J, Mr' FROM DUAL
UNION ALL SELECT 'Clinton, Hilary B,' FROM DUAL
UNION ALL SELECT 'Obama, Barack, Mr' FROM DUAL
Query 1:
SELECT REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 1 ) AS Last_Name,
REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 2 ) AS First_Name,
REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 4 ) AS Middle_Initial,
REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 5 ) AS Title
FROM Names
| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL | TITLE |
|-----------|------------|----------------|--------|
| Covey | Stephen | J | Mr |
| Clinton | Hilary | B | (null) |
| Obama | Barack | (null) | Mr |
Query 2:
SELECT REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\1' ) AS Last_Name,
REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\2' ) AS First_Name,
REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\4' ) AS Middle_Initial,
REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\5' ) AS Title
FROM Names
| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL | TITLE |
|-----------|------------|----------------|--------|
| Covey | Stephen | J | Mr |
| Clinton | Hilary | B | (null) |
| Obama | Barack | (null) | Mr |
Query 3:
WITH Split_Names AS (
SELECT REGEXP_SUBSTR( Name, '^[^,]+' ) AS Last_Name,
REGEXP_REPLACE( Name, '^.*?,\s*|\s*,.*?$' ) AS Given_Names,
REGEXP_SUBSTR( Name, '[^\s,]+$' ) AS Title
FROM Names
)
SELECT Last_Name,
REGEXP_REPLACE( Given_Names, '\s+\w$' ) AS First_Name,
TRIM( REGEXP_SUBSTR( Given_Names, '\s+\w$' ) ) AS Middle_Initial,
Title
FROM Split_Names
| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL | TITLE |
|-----------|------------|----------------|--------|
| Covey | Stephen | J | Mr |
| Clinton | Hilary | B | (null) |
| Obama | Barack | (null) | Mr |
Upvotes: 2
Reputation: 10360
This is Alexander's answer modified with an improved regex that handles NULL list elements. Oh and instead of repeating that regex, make it reusable by creating a function as described here: REGEX to select nth value from a list, allowing for nulls, then call that. That way the code is encapsulated and reusable by all with only one place to change code if you have to:
SQL> with tbl(input_row) as (
select 'Covey, Stephen J, Mr' from dual
union
select 'Clinton,,Ms' from dual
union
select 'Obama, Barack, Mr' from dual
)
SELECT TRIM( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 1, NULL, 1)) AS Last_NAME,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 2, NULL, 1), '[^ ]+', 1, 1)) AS First_Name,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 2, NULL, 1), '[^ ]+', 1, 2)) AS Middle_Initial,
TRIM( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 3, NULL, 1)) AS Title
FROM tbl;
LAST_NAME FIRST_NAME MIDDLE_INITIAL TITLE
-------------------- -------------------- -------------------- --------------------
Clinton Ms
Covey Stephen J Mr
Obama Barack Mr
SQL>
Upvotes: 1
Reputation: 5399
Use regexp_substr (DB >=10g):
SELECT TRIM( REGEXP_SUBSTR(input_row, '[^,]+', 1, 1)) AS Last_NAME,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '[^,]+', 1, 2), '[^ ]+', 1, 1)) AS First_Name,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '[^,]+', 1, 2), '[^ ]+', 1, 2)) AS Middle_Initial,
TRIM( REGEXP_SUBSTR(input_row, '[^,]+', 1, 3)) AS Title
FROM source_table;
Upvotes: 1