ZajcTeGleda
ZajcTeGleda

Reputation: 89

T-SQL removing multiple characters from string after delimiters

String that passes to stored procedure as variable looks like this:

'10273955=1|10142823=5|10664263=10|10134335=3|10046639=3|10334724=25|10334725=100'

I'm looking for a fast way to parse it to:

'10273955|10142823|10664263|10134335|10046639|10334724|10334725'

SQL server is 2016

Upvotes: 0

Views: 197

Answers (3)

Alan Burstein
Alan Burstein

Reputation: 7918

Using Ngrams8k you could do this:

DECLARE @string VARCHAR(1000) = 
  '10273955=1|10142823=5|10664263=10|10134335=3|10046639=3|10334724=25|10334725=100';

SELECT NewString = 
(
  SELECT CASE SIGN(s.position) WHEN 1 THEN '' ELSE ng.token END
  FROM   dbo.NGrams8k(@string,1) AS ng
  LEFT JOIN 
  (
    SELECT ng.position, nxt.Pos, ln=nxt.Pos-ng.position
    FROM   dbo.ngrams8k(@string,1) AS ng
    CROSS APPLY (VALUES(
      ISNULL(NULLIF(CHARINDEX('|',@string, ng.position+1),0),LEN(@string)+1))) AS nxt(Pos)
    WHERE  ng.token = '='
  ) AS s ON ng.position BETWEEN s.position AND s.pos-1
  ORDER BY ng.position ASC
  FOR XML PATH('')
);

Returns:

NewString
---------------------------------------------------------------------------------
10273955|10142823|10664263|10134335|10046639|10334724|10334725

Upvotes: 0

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67291

My suggestion was this:

A mockup table to simulate your issue

DECLARE @tbl TABLE(ID INT IDENTITY, YourString VARCHAR(250));
INSERT INTO @tbl VALUES('10273955=1|10142823=5|10664263=10|10134335=3|10046639=3|10334724=25|10334725=100');

--The query

SELECT t.ID
      ,t.YourString
      ,A.CastedToXml
      ,REPLACE(A.CastedToXml.query('data(/x/y[1])').value('.','varchar(150)'),' ','|')
FROM @tbl t
CROSS APPLY(SELECT CAST('<x><y>' + REPLACE(REPLACE(t.YourString,'|','</y></x><x><y>'),'=','</y><y>') + '</y></x>' AS XML)) A(CastedToXml);

The result

10273955|10142823|10664263|10134335|10046639|10334724|10334725

The idea in short:

The APPLY will use some replacements to transform a double-separated XML in one go. It will look like this:

<x>
  <y>10273955</y>
  <y>1</y>
</x>
<x>
  <y>10142823</y>
  <y>5</y>
</x>
<x>
  <y>10664263</y>
  <y>10</y>
</x>
<x>
  <y>10134335</y>
  <y>3</y>
</x>
<x>
  <y>10046639</y>
  <y>3</y>
</x>
<x>
  <y>10334724</y>
  <y>25</y>
</x>
<x>
  <y>10334725</y>
  <y>100</y>
</x>

The trick is the usage of XQuery's data(), which returns all values in the XPath as blank separated fragments. Using a XPath of /x/y[1] tells the engine: Pick each <x> and the first <y> you find in there!.

Btw: In XML the sort order is fix. So the returned string will not change this order.

Upvotes: 3

Thom A
Thom A

Reputation: 95564

Using DelimitedSplit8K_LEAD and FOR XML PATH (as 2017 introduces STRING_AGG) you could do this and retain the ordering, but the real solution is to stop storing delimited delimited data (yes, I did mean to say delimited twice, as it's | and = delimited) in your tables...

SELECT YT.YourColumn,
       STUFF((SELECT '|' + LEFT(DS.Item,CHARINDEX('=',DS.item)-1)
              FROM dbo.DelimitedSplit8K_LEAD(YT.YourColumn,'|') DS
              ORDER BY DS.ItemNumber
              FOR XML PATH(''),TYPE).value('.','varchar(8000)'),1,1,'') AS NewColumn        
FROM (VALUES('10273955=1|10142823=5|10664263=10|10134335=3|10046639=3|10334724=25|10334725=100'))YT(YourColumn)

But, like i said, fix your design. you could do that with DelimitedSplit8K_LEAD again:

SELECT DS1.ItemNumber AS ID,
       CONVERT(int,MAX(CASE DS2.ItemNumber WHEN 1 THEN DS2.Item END)) AS LongNumber,
       CONVERT(int,MAX(CASE DS2.ItemNumber WHEN 2 THEN DS2.Item END)) AS ShortNumber
FROM (VALUES('10273955=1|10142823=5|10664263=10|10134335=3|10046639=3|10334724=25|10334725=100'))YT(YourColumn)
     CROSS APPLY dbo.DelimitedSplit8K_LEAD(YT.YourColumn,'|') DS1
     CROSS APPLY dbo.DelimitedSplit8K_LEAD(DS1.Item,'=') DS2
GROUP BY DS1.ItemNumber;

Upvotes: 0

Related Questions