SqlNerd
SqlNerd

Reputation: 65

How to split the given string as per requirement using oracle

I'm required to split names to insert into the database. Some sample strings are given below.

Covey, Stephen J, Mr
Clinton, Hilary B,
Obama, Barack, Mr

I need to split the above string like follows.

First_Name       Middle_Initial      Last_NAME     Title
 Stephen               J               Covey         Mr
 Hilary                B               Clinton
 Barack                                Obama         Mr

I can achieve the above using java code. Is is possible to split those strings directly in a sql select.?

Please note that, Few names might not have title or middle initial. I need the output as given above.

Upvotes: 0

Views: 179

Answers (3)

MT0
MT0

Reputation: 167867

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE Names ( Name ) AS
          SELECT 'Covey, Stephen J, Mr' FROM DUAL
UNION ALL SELECT 'Clinton, Hilary B,' FROM DUAL
UNION ALL SELECT 'Obama, Barack, Mr' FROM DUAL

Query 1:

SELECT REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 1 ) AS Last_Name,
       REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 2 ) AS First_Name,
       REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 4 ) AS Middle_Initial,
       REGEXP_SUBSTR( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', 1, 1, NULL, 5 ) AS Title
FROM   Names

Results:

| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL |  TITLE |
|-----------|------------|----------------|--------|
|     Covey |    Stephen |              J |     Mr |
|   Clinton |     Hilary |              B | (null) |
|     Obama |     Barack |         (null) |     Mr |

Query 2:

SELECT REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\1' ) AS Last_Name,
       REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\2' ) AS First_Name,
       REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\4' ) AS Middle_Initial,
       REGEXP_REPLACE( Name, '^(.*?),\s*(.*?)(\s+(\w))?,\s*(.*)$', '\5' ) AS Title
FROM   Names

Results:

| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL |  TITLE |
|-----------|------------|----------------|--------|
|     Covey |    Stephen |              J |     Mr |
|   Clinton |     Hilary |              B | (null) |
|     Obama |     Barack |         (null) |     Mr |

Query 3:

WITH Split_Names AS (
  SELECT REGEXP_SUBSTR( Name, '^[^,]+' ) AS Last_Name,
         REGEXP_REPLACE( Name, '^.*?,\s*|\s*,.*?$' ) AS Given_Names,
         REGEXP_SUBSTR( Name, '[^\s,]+$' ) AS Title
  FROM   Names
)
SELECT Last_Name,
       REGEXP_REPLACE( Given_Names, '\s+\w$' ) AS First_Name,
       TRIM( REGEXP_SUBSTR( Given_Names, '\s+\w$' ) ) AS Middle_Initial,
       Title
FROM   Split_Names

Results:

| LAST_NAME | FIRST_NAME | MIDDLE_INITIAL |  TITLE |
|-----------|------------|----------------|--------|
|     Covey |    Stephen |              J |     Mr |
|   Clinton |     Hilary |              B | (null) |
|     Obama |     Barack |         (null) |     Mr |

Upvotes: 2

Gary_W
Gary_W

Reputation: 10360

This is Alexander's answer modified with an improved regex that handles NULL list elements. Oh and instead of repeating that regex, make it reusable by creating a function as described here: REGEX to select nth value from a list, allowing for nulls, then call that. That way the code is encapsulated and reusable by all with only one place to change code if you have to:

SQL> with tbl(input_row) as (
   select 'Covey, Stephen J, Mr' from dual
   union
   select 'Clinton,,Ms' from dual
   union
   select 'Obama, Barack, Mr' from dual
   )
   SELECT TRIM( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 1, NULL, 1)) AS Last_NAME,
   TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 2, NULL, 1), '[^ ]+', 1, 1)) AS First_Name,
   TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 2, NULL, 1), '[^ ]+', 1, 2)) AS Middle_Initial,
   TRIM( REGEXP_SUBSTR(input_row, '([^,]*)(,|$)', 1, 3, NULL, 1)) AS Title
   FROM tbl;

LAST_NAME            FIRST_NAME           MIDDLE_INITIAL       TITLE
-------------------- -------------------- -------------------- --------------------
Clinton                                                        Ms
Covey                Stephen              J                    Mr
Obama                Barack                                    Mr

SQL>

Upvotes: 1

Alexander Ushakov
Alexander Ushakov

Reputation: 5399

Use regexp_substr (DB >=10g):

SELECT TRIM( REGEXP_SUBSTR(input_row, '[^,]+', 1, 1)) AS Last_NAME,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '[^,]+', 1, 2), '[^ ]+', 1, 1)) AS First_Name,
TRIM( REGEXP_SUBSTR( REGEXP_SUBSTR(input_row, '[^,]+', 1, 2), '[^ ]+', 1, 2)) AS Middle_Initial,
TRIM( REGEXP_SUBSTR(input_row, '[^,]+', 1, 3)) AS Title
FROM source_table;

Upvotes: 1

Related Questions