Greg
Greg

Reputation: 87

Extract parts of string separated by delimiter

This is an add on to my original question: Variable length substring between two characters

The data often looks like this, all in one column:

Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;

In the above example:

Upvotes: 1

Views: 2376

Answers (4)

vhoang
vhoang

Reputation: 1479

SQL Server 2016+

Concept using STRING_SPLIT(), PARSENAME(), PIVOT

-- Mimic Table named z_tbl_tmp
DECLARE @z_tbl_tmp TABLE (id INT, OPTIONAL_FIELD_1 NVARCHAR(max));
INSERT INTO @z_tbl_tmp VALUES (1, N'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;');
INSERT INTO @z_tbl_tmp VALUES (2, N'Growth: Run; Priority: Critical - Turns Contractual');
-- 

-- Pivot Parsed Data
WITH tbl_parsed AS (
    -- Parse Data into Key Value Pairs
    SELECT id, 
        TRIM(PARSENAME(REPLACE(value,': ','.'), 2)) AS K, 
        TRIM(PARSENAME(REPLACE(value,': ','.'), 1)) AS V 
    FROM @z_tbl_tmp
        CROSS APPLY STRING_SPLIT(OPTIONAL_FIELD_1,';')
)
SELECT id, [Growth] AS GROWTH_TXT, [Priority] AS PRIORITY_TXT
FROM tbl_parsed
    PIVOT (MAX(V) FOR [K] IN ([Growth], [Priority])) AS pvt
+----+------------+-------------------------------+
| id | GROWTH_TXT | PRIORITY_TXT                  |
+----+------------+-------------------------------+
|  1 | Compliance | Contractual                   |
+----+------------+-------------------------------+
|  2 | Run        | Critical - Turns Contractual  |
+----+------------+-------------------------------+

Upvotes: 2

Zhorov
Zhorov

Reputation: 29943

From SQL Server 2016, a combination of STRING_SPLIT(), PATINDEX() and conditional aggregation is an option:

DECLARE @text varchar(1000) = 'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;'

SELECT 
   MAX(CASE WHEN PATINDEX('Growth:%', [value]) = 1 THEN STUFF([value], 1, LEN('Growth:'), '') END) AS GROWTH_TXT,
   MAX(CASE WHEN PATINDEX('Priority:%', [value]) = 1 THEN STUFF([value], 1, LEN('Priority:'), '') END) AS PRIORITY_TXT
FROM STRING_SPLIT(@text, ';')

Result:

GROWTH_TXT  PRIORITY_TXT
 Compliance  Contractual

If the data is stored in a table, you need an additional APPLY operator:

DECLARE @text varchar(1000) = 'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;'
SELECT @text AS OPTIONAL_FIELD_1
INTO z_tbl_temp

SELECT a.*
FROM z_tbl_temp z
OUTER APPLY ( 
   SELECT 
      MAX(CASE WHEN PATINDEX('Growth:%', [value]) = 1 THEN STUFF([value], 1, LEN('Growth:'), '') END) AS GROWTH_TXT,
      MAX(CASE WHEN PATINDEX('Priority:%', [value]) = 1 THEN STUFF([value], 1, LEN('Priority:'), '') END) AS PRIORITY_TXT
   FROM STRING_SPLIT(z.OPTIONAL_FIELD_1, ';')
) a

Upvotes: 1

Yitzhak Khabinsky
Yitzhak Khabinsky

Reputation: 22177

Another approach based on JSON.

SQL Server 2016/2017 onwards.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, val NVARCHAR(255));
INSERT INTO @tbl VALUES
(N'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;')
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = ';'
    , @separatorJson CHAR(3) = '","'
    , @colon CHAR(2) = ': '
    , @colonJson CHAR(3) = '":"';

;WITH rs AS
(
    SELECT id
          , N'{"' + 
             REPLACE(REPLACE(TRIM(';' FROM val),@colon, @colonJson), @separator, @separatorJson) + 
                N'"}' AS DataJson
    FROM @tbl
)
SELECT id
    , JSON_VALUE(DataJson, N'$.Growth') AS [Growth]
    , JSON_VALUE(DataJson, N'$.Priority') AS [Priority]
    , JSON_VALUE(DataJson, N'$."Original Vendor"') AS [Original Vendor]
FROM rs;

Output

+----+------------+-------------+-----------------+
| id |   Growth   |  Priority   | Original Vendor |
+----+------------+-------------+-----------------+
|  1 | Compliance | Contractual | ABC SERVICES    |
+----+------------+-------------+-----------------+

For SQL Server 2016 because it doesn't support updated TRIM() function:

REPLACE(REPLACE(LEFT(val,LEN(val)-1),@colon, @colonJson), @separator, @separatorJson) + 

Upvotes: 0

Rajat
Rajat

Reputation: 5803

If the string column follows a consistent pattern, something along these lines should do it.

with cte(str,i1,i2,i3) as

(select str, 
        charindex('Growth: ',str), 
        charindex('Priority: ',str), 
        charindex('Original Vendor: ',str)
 from your_table) 

 select substring(str,i1+8,i2-i1-9), substring(str,i2+9,i3-i2-10) 
 from cte; 

The arithmetic with 8, 9 and 10 in the substring is to get rid of unwanted characters that I assumed to be fixed in size.


Another method uses string_split with a cross apply

select str, 
       replace(min(value),'Growth: ','') growth_txt,
       replace(max(value),'Priority: ','') as priority_txt
from your_table
cross apply string_split(str,';')
where value like 'Growth: %' or value like 'Priority: %'
group by str;

Since letter G comes before letter P, the use of min and max ensures the replace function is being used on the right string. The way @Zhorov's answer handles that is probably more reliable in case you decide to parse a different set of elements and you don't want to think too much about the alphabetical order.

Upvotes: 0

Related Questions