Fuuhhss
Fuuhhss

Reputation: 33

Retrieve initials from a SQL Server Table

I've been working on treating a sql table, and splitting the data. I've come to splitting some initials from the last name. The only problem is, the initials are spaced out. For example (data from my table)

So basically, it's up to four initials, that can be either at the begin, middle, or end of the string. I'm at a loss as to how I should import these. into a seperate column where the result will be:

COL A | COL B
J S P | Jansen
B D V | Gerson
J D   | Timberland

Can anyone please point me in the right direction? I'm using SQL Server.

Upvotes: 3

Views: 1424

Answers (5)

Salman Arshad
Salman Arshad

Reputation: 272066

This one uses CHARINDEX and recursive CTE to extract space delimited substrings from name:

  • Find the substring before the first space
  • Feed the remaining substring to the same CTE

Once you have the substrings, it is only a matter of gluing them back:

WITH yourdata(FullName) AS (
    SELECT 'Hanse J S P' UNION
    SELECT 'Gerson B D V' UNION
    SELECT 'J D Timberland' UNION
    SELECT 'TEST 1 TEST 2 TEST 3'
), cte AS (
    SELECT
        FullName,
        CASE WHEN Pos1 = 0 THEN FullName ELSE SUBSTRING(FullName, 1, Pos1 - 1) END AS LeftPart,
        CASE WHEN Pos1 = 0 THEN Null     ELSE SUBSTRING(FullName, Pos1 + 1, Pos2 - Pos1) END AS NextPart,
        1 AS PartSort
    FROM yourdata
    CROSS APPLY (SELECT CHARINDEX(' ', FullName) AS Pos1, LEN(FullName) AS Pos2) AS CA
    UNION ALL
    SELECT
        FullName,
        CASE WHEN Pos1 = 0 THEN NextPart ELSE SUBSTRING(NextPart, 1, Pos1 - 1) END,
        CASE WHEN Pos1 = 0 THEN Null     ELSE SUBSTRING(NextPart, Pos1 + 1, Pos2 - Pos1) END,
        PartSort + 1
    FROM cte
    CROSS APPLY (SELECT CHARINDEX(' ', NextPart) AS Pos1, LEN(NextPart) AS Pos2) AS CA
    WHERE NextPart IS NOT NULL
)
SELECT yourdata.FullName, STUFF(CA1.XMLStr, 1, 1, '') AS Initials, STUFF(CA2.XMLStr, 1, 1, '') AS Names
FROM yourdata
CROSS APPLY (
    SELECT CONCAT(' ', LeftPart)
    FROM cte
    WHERE FullName = yourdata.FullName AND LEN(LeftPart) = 1
    ORDER BY PartSort
    FOR XML PATH('')
) AS CA1(XMLStr)
CROSS APPLY (
    SELECT CONCAT(' ', LeftPart)
    FROM cte
    WHERE FullName = yourdata.FullName AND LEN(LeftPart) > 1
    ORDER BY PartSort
    FOR XML PATH('')
) AS CA2(XMLStr)

Result:

| FullName             | Initials | Names          |
|----------------------|----------|----------------|
| Gerson@B@D@V         | B D V    | Gerson         |
| Hanse@J@S@P          | J S P    | Hanse          |
| J@D@Timberland       | J D      | Timberland     |
| TEST@1@TEST@2@TEST@3 | 1 2 3    | TEST TEST TEST |

Upvotes: 1

Zorkolot
Zorkolot

Reputation: 2017

I used some built-in functions for this. The general idea is to use string_split to split the string into rows, use ROW_NUMBER to save the order according to length and the char(s) position in the string, then use FOR XML PATH() to concatenate from rows to a single column.

--Assume your data structure
DECLARE @temp TABLE (thestring varchar(1000))
INSERT INTO @temp VALUES
 ('Hanse J S P'), ('Gerson B D V'), ('J D Timberland')

;WITH CTE AS
(
    SELECT *
        ,ROW_NUMBER() OVER (PARTITION BY thestring ORDER BY thestring, LEN(value) ASC, pos ASC) [order]
        FROM (
                SELECT *      
                    , value AS [theval]
                    , CHARINDEX(CASE WHEN len(value) = 1 THEN ' ' + value ELSE value END, thestring) AS [pos]
                FROM @temp CROSS APPLY string_split(thestring, ' ')
            )  AS dT
)
SELECT ( SELECT value + ' ' AS [text()]
                 FROM cte 
                WHERE cte.thestring = T.thestring
                  AND LEN(theval) = 1
                FOR XML PATH('')
       ) AS [COL A]
      ,( SELECT value + ' ' AS [text()]
                 FROM cte 
                WHERE cte.thestring = T.thestring
                  AND LEN(theval) > 1
                FOR XML PATH('')
       ) AS [COL B]
  FROM @temp T 
GROUP BY thestring

Produces output:

COL A   COL B
-----   -----
B D V   Gerson 
J S P   Hanse 
J D     Timberland 

Upvotes: 1

John Cappelletti
John Cappelletti

Reputation: 81930

Similar to JNevil's answer (+1), but not limited to 4 tokens.

Example

Declare @YourTable table (SomeCol varchar(50))
Insert Into @YourTable values
 ('Hanse J S P')
,('Gerson B D V')
,('J D Timberland')
,('J D Timberland / J R R Tolkien')


Select A.SomeCol
      ,ColA = ltrim(
              concat(IIF(len(Pos1)=1,' '+Pos1,null)
                    ,IIF(len(Pos2)=1,' '+Pos2,null)
                    ,IIF(len(Pos3)=1,' '+Pos3,null)
                    ,IIF(len(Pos4)=1,' '+Pos4,null)
                    ,IIF(len(Pos5)=1,' '+Pos5,null)
                    ,IIF(len(Pos6)=1,' '+Pos6,null)
                    ,IIF(len(Pos7)=1,' '+Pos7,null)
                    ,IIF(len(Pos8)=1,' '+Pos8,null)
                    ,IIF(len(Pos9)=1,' '+Pos9,null)
                    )
              )
      ,ColB = ltrim(
              concat(IIF(Pos1 not Like '[a-z]',' '+Pos1,null)
                    ,IIF(Pos2 not Like '[a-z]',' '+Pos2,null)
                    ,IIF(Pos3 not Like '[a-z]',' '+Pos3,null)
                    ,IIF(Pos4 not Like '[a-z]',' '+Pos4,null)
                    ,IIF(Pos5 not Like '[a-z]',' '+Pos5,null)
                    ,IIF(Pos6 not Like '[a-z]',' '+Pos6,null)
                    ,IIF(Pos7 not Like '[a-z]',' '+Pos7,null)
                    ,IIF(Pos8 not Like '[a-z]',' '+Pos8,null)
                    ,IIF(Pos9 not Like '[a-z]',' '+Pos9,null)
                    )
              )
 From  @YourTable A
 Cross Apply (
                Select Pos1 = xDim.value('/x[1]','varchar(max)')
                      ,Pos2 = xDim.value('/x[2]','varchar(max)')
                      ,Pos3 = xDim.value('/x[3]','varchar(max)')
                      ,Pos4 = xDim.value('/x[4]','varchar(max)')
                      ,Pos5 = xDim.value('/x[5]','varchar(max)')
                      ,Pos6 = xDim.value('/x[6]','varchar(max)')
                      ,Pos7 = xDim.value('/x[7]','varchar(max)')
                      ,Pos8 = xDim.value('/x[8]','varchar(max)')
                      ,Pos9 = xDim.value('/x[9]','varchar(max)')
                From  (Select Cast('<x>' + replace(SomeCol,' ','</x><x>')+'</x>' as xml) as xDim) as A 
             ) B

Returns

SomeCol                           ColA            ColB
Hanse J S P                       J S P           Hanse
Gerson B D V                      B D V           Gerson
J D Timberland                    J D             Timberland
J D Timberland / J R R Tolkien    J D / J R R     Timberland / Tolkien

Upvotes: 1

JNevill
JNevill

Reputation: 50034

Here's a rather hamfisted way of doing it by abusing the Parsename function. The big caveat here is that Parsename is limited to 4 tokens so J S P Jansen will work but J S P C Jansen or John J S P Jansen will not.

With parsedname AS
(
  SELECT
      PARSENAME(replace(name, ' ', '.'), 1) name1,
      PARSENAME(replace(name, ' ', '.'), 2) name2,
      PARSENAME(replace(name, ' ', '.'), 3) name3,
      PARSENAME(replace(name, ' ', '.'), 4) name4
   FROM yourtable
)
SELECT 
   CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
   CASE WHEN LEN(name1) > 1 THEN name1 
      WHEN LEN(name2) > 1 THEN name2
      WHEN LEN(name3) > 1 THEN name3
      WHEN LEN(name4) > 1 THEN name4
      END as surname
FROM parsedname

Here is a sqlfiddle of this in action

CREATE TABLE NAMES (name varchar(50));
INSERT INTO NAMES VALUES ('J S P Jansen');
INSERT INTO NAMES VALUES ('B D V Gerson');
INSERT INTO NAMES VALUES ('J D Timberland');

With parsedname AS
(
  SELECT
      PARSENAME(replace(name, ' ', '.'), 1) name1,
      PARSENAME(replace(name, ' ', '.'), 2) name2,
      PARSENAME(replace(name, ' ', '.'), 3) name3,
      PARSENAME(replace(name, ' ', '.'), 4) name4
   FROM names
)
SELECT 
   CASE WHEN LEN(name4) = 1 THEN name4 ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 ELSE '' END as initials,
   CASE WHEN LEN(name1) > 1 THEN name1 
      WHEN LEN(name2) > 1 THEN name2
      WHEN LEN(name3) > 1 THEN name3
      WHEN LEN(name4) > 1 THEN name4
      END as surname
FROM parsedname

+----------+------------+
| initials |  surname   |
+----------+------------+
| JSP      | Jansen     |
| BDV      | Gerson     |
| JD       | Timberland |
+----------+------------+

If a space is needed in between those letters you can just flip around that CASE statement to something like:

TRIM(CASE WHEN LEN(name4) = 1 THEN name4 + ' ' ELSE '' END +
      CASE WHEN LEN(name3) = 1 THEN name3 + ' ' ELSE '' END +
      CASE WHEN LEN(name2) = 1 THEN name2 + ' ' ELSE '' END +
      CASE WHEN LEN(name1) = 1 THEN name1 + ' ' ELSE '' END) as initials

SQLFiddle with the spaces

+----------+------------+
| initials |  surname   |
+----------+------------+
| J S P    | Jansen     |
| B D V    | Gerson     |
| J D      | Timberland |
+----------+------------+

Upvotes: 2

NicVerAZ
NicVerAZ

Reputation: 407

Which version of SQL Server do you have? Is STRING_SPLIT() available?

If yes, split using the space as a delimiter, iterate through the resulting strings, evaluate their length and concatenate a result string with the string when said string is one character in length and is a letter.

Add a space before unless the result string is so far empty.

If STRING_SPLIT() is not available... Well... Here are a few solutions:

T-SQL split string based on delimiter

-- Addendum

To your second part of the question (which did not originally exist when I originally posted my reply) where you would like to isolate the non-initials part into a second column, I would basically separate two blocks of logic with two result strings based on the length of each element.

Note: this is not going to be very elegant in pre-2016 SQL Server and may even require a CURSOR (sigh)

I know I am going to be downvoted for mentioning a cursor.

Upvotes: 0

Related Questions