stunnie
stunnie

Reputation: 133

How to remove characters from a column not like specific value

I'm working with a mssql environment, and i've uploaded a CSV file with first/lastnames/initials all in one column. The data is pretty messed up, format-wise, but it's all I have to work with. I need to extract the initials into a seperate column, and the format I need to extract is (example);

Simply said, the initial (B.) can be pretty much in any place. For some other names it's multiple letters seperated by dots (john b.l. smith, etc). What's the best way to take these values (just initials) into a seperate column? Regards

Upvotes: 1

Views: 75

Answers (2)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521093

Here is a solution which works for initials of any length, and in any position. The trick is to operate on the string in reverse, and find the position of the dot, and also the position of the first space which occurs after that dot. To handle the edge case of an initial beginning the entire name, I add artificial spaces on the front and end of the name strings.

WITH yourTable AS (
    SELECT 'JOHN B. SMITH' AS name UNION ALL
    SELECT 'SMITH JOHN B.' UNION ALL
    SELECT 'B. JOHN SMITH' UNION ALL
    SELECT 'HARRY BP. POTTER' UNION ALL
    SELECT 'AB. CHARLES ROGERS' UNION ALL
    SELECT 'JOHN DENNIS HRM.'
),
newTable AS (
    SELECT
        name AS orig_name,                     -- original name
        REVERSE(' ' + name + ' ') AS rev_name  -- reverse name, WS padded at start/end
    FROM yourTable
)

SELECT
    orig_name AS name,
    CASE WHEN orig_name LIKE '%.%'
         THEN
             REVERSE(SUBSTRING(rev_name,
                               CHARINDEX('.', rev_name) + 1,
                               CHARINDEX(' ', rev_name, CHARINDEX('.', rev_name)) -
                                   CHARINDEX('.', rev_name) - 1))
         ELSE 'NA' END AS initial
FROM newTable;

enter image description here

Demo

Upvotes: 1

Serkan Arslan
Serkan Arslan

Reputation: 13393

you can try this.

DECLARE @T TABLE (Name VARCHAR(50))
INSERT INTO @T VALUES
  ('JOHN B. SMITH'),
  ('SMITH JOHN B.'),
  ('B. JOHN SMITH')

  SELECT *, SUBSTRING(Name, CHARINDEX('.',Name)-1,1) INI 
  FROM @T

Result

Name                 INI
-------------------- ----
JOHN B. SMITH        B
SMITH JOHN B.        B
B. JOHN SMITH        B

Upvotes: 1

Related Questions