Reputation: 3484
I have a table in the following format where COL1 contains a unique identifier and COL2 contains a collection of phone numbers followed by a tag (<abc>
or <def>
) and delimited by pipe (|
). Number of phone records in each row is unknown - it may contain just one phone number followed by tag or upto 10.
Table
----------
COL1 : COL2
----------
ID1 : 1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>
I need to copy this data into a new table with the result in following format i.e. remove all portion of the string with the tag <def>
.
Table
----------
COL1 : COL2
----------
ID1 : 1234567890<abc>,4312314124<abc>,4131234131<abc>
What are the best ways to do this to get optimum performance? I need the program to transform data in a table that contains about a million records.
Upvotes: 0
Views: 48
Reputation: 2316
I didn't understand your question at first.but for answer you can you following code if you sql server is 2016 or upper.I think it has a good performance
Insert into table2 (ID1)
SELECT
STUFF((SELECT [value] +N',' AS 'data()' FROM STRING_SPLIT(ID1,'|') WHERE [value] LIKE'%<abc>' FOR XML PATH(''),TYPE)
.value('text()[1]','nvarchar(max)'),1,2,N'') AS ID1
FROM
table1
Upvotes: -1
Reputation: 7918
If performance is important then I would suggest delimitedSplit8k_Lead. You can just use the pipe as a delimiter to split the string then exclude items (tokens) that don't end with .
DECLARE @table TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>'),
('ID2','2662314129<abc>|7868845133<abc>|6831234131<abc>|41234139999<xxx>|1234567999<abc>')
SELECT t.COL1, ds.item
FROM @table t
CROSS APPLY dbo.DelimitedSplit8K_LEAD(t.COL2,'|') ds
WHERE ds.Item LIKE '%<abc>';
Returns
COL1 item
---------- -----------------
ID1 1234567890<abc>
ID1 4312314124<abc>
ID1 4131234131<abc>
ID2 2662314129<abc>
ID2 7868845133<abc>
ID2 6831234131<abc>
ID2 1234567999<abc>
Then you use XML PATH for concatenation like this:
DECLARE @table TABLE (COL1 VARCHAR(10), COL2 VARCHAR(1000));
INSERT @table
VALUES
('ID1','1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>'),
('ID2','2662314129<abc>|7868845133<abc>|6831234131<abc>|41234139999<xxx>|1234567999<abc>')
SELECT t.COL1, stripBadNumbers.newString
FROM @table t
CROSS APPLY
(VALUES((
SELECT ds.item
FROM dbo.DelimitedSplit8K_LEAD(t.COL2,'|') ds
WHERE ds.Item LIKE '%<abc>'
FOR XML PATH(''), TYPE
).value('.', 'varchar(1000)'))) stripBadNumbers(newString);
Returns:
COL1 newString
---------- -------------------------------------------------------------------
ID1 1234567890<abc>4312314124<abc>4131234131<abc>
ID2 2662314129<abc>7868845133<abc>6831234131<abc>1234567999<abc>
Upvotes: 1
Reputation: 37472
That string of yours can easily be transformed into some XML basically using replace()
. The phone numbers with the right tag can then be selected using XQuery. As a bonus this might work with an arbitrary number of phone numbers.
(I don't get your schema, so I use my own. Translate it into yours yourself.)
CREATE TABLE elbat
(nmuloc nvarchar(MAX));
INSERT INTO elbat
(nmuloc)
VALUES ('1234567890<abc>|4312314124<abc>|1232345133<def>|4131234131<abc>|41234134132<def>');
WITH
cte AS
(
SELECT convert(xml,
concat('<phonenumbers><phonenumber number="',
replace(replace(substring(nmuloc,
1,
len(nmuloc) - 1),
'<',
'" tag="'),
'>|',
'"/><phonenumber number="'),
'"/></phonenumbers>')) phonenumbers
FROM elbat
)
SELECT stuff((SELECT ',' + nodes.node.value('concat(./@number, "<", ./@tag, ">")',
'nvarchar(max)')
FROM cte
CROSS APPLY phonenumbers.nodes('/phonenumbers/phonenumber[@tag="abc"]') nodes(node)
FOR XML PATH(''),
TYPE).value('(.)[1]',
'nvarchar(max)'),
1,
1,
'');
But while you're at it you should really consider to normalize your schema and don't use delimiter separated lists and the also non atomic number and tag combination in a string anymore!
Upvotes: 1