Reputation: 87
This is an add on to my original question: Variable length substring between two characters
The data often looks like this, all in one column:
Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;
In the above example:
GROWTH_TXT
columnPRIORITY_TXT
columnUpvotes: 1
Views: 2376
Reputation: 1479
SQL Server 2016+
Concept using STRING_SPLIT(), PARSENAME(), PIVOT
-- Mimic Table named z_tbl_tmp
DECLARE @z_tbl_tmp TABLE (id INT, OPTIONAL_FIELD_1 NVARCHAR(max));
INSERT INTO @z_tbl_tmp VALUES (1, N'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;');
INSERT INTO @z_tbl_tmp VALUES (2, N'Growth: Run; Priority: Critical - Turns Contractual');
--
-- Pivot Parsed Data
WITH tbl_parsed AS (
-- Parse Data into Key Value Pairs
SELECT id,
TRIM(PARSENAME(REPLACE(value,': ','.'), 2)) AS K,
TRIM(PARSENAME(REPLACE(value,': ','.'), 1)) AS V
FROM @z_tbl_tmp
CROSS APPLY STRING_SPLIT(OPTIONAL_FIELD_1,';')
)
SELECT id, [Growth] AS GROWTH_TXT, [Priority] AS PRIORITY_TXT
FROM tbl_parsed
PIVOT (MAX(V) FOR [K] IN ([Growth], [Priority])) AS pvt
+----+------------+-------------------------------+
| id | GROWTH_TXT | PRIORITY_TXT |
+----+------------+-------------------------------+
| 1 | Compliance | Contractual |
+----+------------+-------------------------------+
| 2 | Run | Critical - Turns Contractual |
+----+------------+-------------------------------+
Upvotes: 2
Reputation: 29943
From SQL Server 2016, a combination of STRING_SPLIT()
, PATINDEX()
and conditional aggregation is an option:
DECLARE @text varchar(1000) = 'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;'
SELECT
MAX(CASE WHEN PATINDEX('Growth:%', [value]) = 1 THEN STUFF([value], 1, LEN('Growth:'), '') END) AS GROWTH_TXT,
MAX(CASE WHEN PATINDEX('Priority:%', [value]) = 1 THEN STUFF([value], 1, LEN('Priority:'), '') END) AS PRIORITY_TXT
FROM STRING_SPLIT(@text, ';')
Result:
GROWTH_TXT PRIORITY_TXT
Compliance Contractual
If the data is stored in a table, you need an additional APPLY
operator:
DECLARE @text varchar(1000) = 'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;'
SELECT @text AS OPTIONAL_FIELD_1
INTO z_tbl_temp
SELECT a.*
FROM z_tbl_temp z
OUTER APPLY (
SELECT
MAX(CASE WHEN PATINDEX('Growth:%', [value]) = 1 THEN STUFF([value], 1, LEN('Growth:'), '') END) AS GROWTH_TXT,
MAX(CASE WHEN PATINDEX('Priority:%', [value]) = 1 THEN STUFF([value], 1, LEN('Priority:'), '') END) AS PRIORITY_TXT
FROM STRING_SPLIT(z.OPTIONAL_FIELD_1, ';')
) a
Upvotes: 1
Reputation: 22177
Another approach based on JSON.
SQL Server 2016/2017 onwards.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, val NVARCHAR(255));
INSERT INTO @tbl VALUES
(N'Growth: Compliance;Priority: Contractual;Original Vendor: ABC SERVICES;')
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = ';'
, @separatorJson CHAR(3) = '","'
, @colon CHAR(2) = ': '
, @colonJson CHAR(3) = '":"';
;WITH rs AS
(
SELECT id
, N'{"' +
REPLACE(REPLACE(TRIM(';' FROM val),@colon, @colonJson), @separator, @separatorJson) +
N'"}' AS DataJson
FROM @tbl
)
SELECT id
, JSON_VALUE(DataJson, N'$.Growth') AS [Growth]
, JSON_VALUE(DataJson, N'$.Priority') AS [Priority]
, JSON_VALUE(DataJson, N'$."Original Vendor"') AS [Original Vendor]
FROM rs;
Output
+----+------------+-------------+-----------------+
| id | Growth | Priority | Original Vendor |
+----+------------+-------------+-----------------+
| 1 | Compliance | Contractual | ABC SERVICES |
+----+------------+-------------+-----------------+
For SQL Server 2016 because it doesn't support updated TRIM()
function:
REPLACE(REPLACE(LEFT(val,LEN(val)-1),@colon, @colonJson), @separator, @separatorJson) +
Upvotes: 0
Reputation: 5803
If the string column follows a consistent pattern, something along these lines should do it.
with cte(str,i1,i2,i3) as
(select str,
charindex('Growth: ',str),
charindex('Priority: ',str),
charindex('Original Vendor: ',str)
from your_table)
select substring(str,i1+8,i2-i1-9), substring(str,i2+9,i3-i2-10)
from cte;
The arithmetic with 8, 9 and 10 in the substring
is to get rid of unwanted characters that I assumed to be fixed in size.
Another method uses string_split
with a cross apply
select str,
replace(min(value),'Growth: ','') growth_txt,
replace(max(value),'Priority: ','') as priority_txt
from your_table
cross apply string_split(str,';')
where value like 'Growth: %' or value like 'Priority: %'
group by str;
Since letter G comes before letter P, the use of min
and max
ensures the replace
function is being used on the right string. The way @Zhorov's answer handles that is probably more reliable in case you decide to parse a different set of elements and you don't want to think too much about the alphabetical order.
Upvotes: 0