pastelrain
pastelrain

Reputation: 15

RegEx in SQL Server - Hyphen Delimited String

I'm needing to identify when a field has an "a" or "b" in the 2nd segment of a string delimited by hyphens.

For example:

ReferenceIDField:

CLC-04a-CCC-ILS

CLC-04b-CCC-ILS

RX-101a-ICP

RX-101b-ICP

I need to flag each value with the "a" or "b" in the second segment separately as

ReferenceIDFlag:

A group

B group

A group

B group

THEN I also need to create a new column for these values renaming them respectively as below.

NewReferenceIDField:

CLC-04

CLC-04

RX-101

RX-101

I have researched the regex patterns, but have had no luck finding examples that match my scenario. Any help is appreciated.

Upvotes: 0

Views: 188

Answers (2)

Radu Gheorghiu
Radu Gheorghiu

Reputation: 20509

You can use the below query to get the two column values:

create table #temp (ReferenceIDField varchar(30));
insert into #temp values ('CLC-04a-CCC-ILS')
                       , ('CLC-04b-CCC-ILS')
                       , ('RX-101a-ICP')
                       , ('RX-101b-ICP');

select ReferenceIDField
    , UPPER(right(left(secondpart, charindex('-', secondpart) - 1), 1)) + ' group' as ReferenceIDFlag
    , left(ReferenceIDField, CHARINDEX('-', ReferenceIDField, 0) + charindex('-', secondpart) - 2) as NewReferenceIDField
from (
select ReferenceIDField
    , RIGHT(ReferenceIDField, len(ReferenceIDField) - CHARINDEX('-', ReferenceIDField, 0)) secondpart
from #temp
) x

Output:

enter image description here

I'm sure you can find a solution by using PATINDEX, but I prefer to use LEFT and RIGHT functions, until performance is a problem.

Upvotes: 1

Doug Coats
Doug Coats

Reputation: 7117

I actually had fun with this one

What the code does:

  1. Use CTE1 to convert column value to xml nodes based on the delimiter of '-'
  2. Use CTE2 to parse out the nodes to individual rows
  3. Put row values USING ROW_NUMBER() so I can select rowid=2 and charindex>0 (looking for a/b)

enter image description here

Code:

    create table #temp (uniqID int , refidfield varchar(30));
    insert into #temp 
    values 
        (1, 'CLC-04a-CCC-ILS')
        , (2, 'CLC-04b-CCC-ILS')
        , (3, 'RX-101a-ICP')
        , (4, 'RX-101b-ICP');

    ;with CTE1 as 
        (
            select 
                uniqID
                , refidfield
                ,  cast('<A>'+ replace(refidfield,'-','</A><A>')+ '</A>' as xml) as Data
            from #temp as t
        )
    , CTE2 as 
        (
            select
                 uniqID,
                 refidfield,
                 F.x.value('.', 'nvarchar(max)') as Data,
                 CASE 
                    WHEN CHARindex('a', F.x.value('.', 'nvarchar(max)'),1)>0 then 'A GROUP'
                    WHEN CHARindex('b', F.x.value('.', 'nvarchar(max)'),1)>0 then 'B GROUP'
                    END as GRP
                ,  Row_NUmber() OVER (Partition BY uniqID ORDER BY (SELECT NULL)) AS ID
            from 
                CTE1 as c
                outer apply c.Data.nodes('/A') as F(x)
        )
    SELECT
        refidfield
         , GRP
         , LEFT(NewDisplay,LEN(NewDisplay)-1) AS NewReferenceIDField
    FROM
        (
            SELECT 
                *
                , (SELECT Data FROM cte2 y WHERE ID=1 AND y.refidfield=x.refidfield)+'-'
                    +(SELECT Data FROM cte2 y WHERE ID=2 AND y.refidfield=x.refidfield) NewDisplay
            FROM cte2 x
        ) z
    WHERE ID=2 and (CHARINDEX('b', Data,1)>0 OR CHARINDEX('a', Data,1)>0)

Upvotes: 0

Related Questions