vikjam
vikjam

Reputation: 550

Translating regex into LIKE statements for SQL Server

I have a regular expression that I have been using in Postgres that I want to translate to SQL Server, but I believe SQL Server lacks the full capabilities of using regular expressions. However, I think my regular expression should be simple enough to translate into LIKE statements.

This is my regular expression: ^[123ABC]*([456]). From my understanding of regular expressions, the two features that I'm using here are the ability to match zero-or-more characters at the beginning of the string and capture groups.

Here's an example using Postgres:

CREATE TABLE inventory (id INT, pcode VARCHAR(10));
INSERT INTO inventory VALUES (1, 'AABC547');
INSERT INTO inventory VALUES (2, '656ABC49');
INSERT INTO inventory VALUES (3, '11AB727');
INSERT INTO inventory VALUES (4, '7712346');

SELECT
    id,
    pcode,
    COALESCE(
        SUBSTRING(pcode from '^[123ABC]*([456])'):: int, 0
    ) AS desired_val
FROM inventory;

-- id   pcode    desired_val
-- 1    AABC547  5
-- 2    656ABC49 6
-- 3    11AB727  0
-- 4    7712346  0

Here's my attempt to try to rewrite this for use with SQL Server which gets the first row wrong because I am not getting the first value that matches my desired pattern [456], but the last one.

SELECT
    id,
    pcode,
    CASE
        WHEN pcode LIKE '[456]%' THEN CAST(SUBSTRING(pcode, 1, 1) AS INT)
        WHEN pcode LIKE '[123ABC]%4%' THEN 4
        WHEN pcode LIKE '[123ABC]%5%' THEN 5
        WHEN pcode LIKE '[123ABC]%6%' THEN 6
        ELSE 0
    END AS desired_val
FROM inventory;

-- id   pcode    desired_val
-- 1    AABC547  4
-- 2    656ABC49 6
-- 3    11AB727  0
-- 4    7712346  0

Upvotes: 1

Views: 659

Answers (2)

Gert-Jan
Gert-Jan

Reputation: 347

alternative solution:

create table #inventory(id int identity,pcode varchar(10))
insert into #inventory values ('AABC547')
insert into #inventory values ('656ABC49')
insert into #inventory values ('11AB727')
insert into #inventory values ('7712346')

SELECT
  id,
  pcode,
  CASE
    WHEN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(pcode,'1',''),'2',''),'3',''),'A',''),'B',''),'C','') LIKE '[456]%'
    THEN CAST(SUBSTRING(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(pcode,'1',''),'2',''),'3',''),'A',''),'B',''),'C',''), 1, 1) AS INT)
    ELSE 0
  END AS desired_val
FROM #inventory;

drop table #inventory

This solution removes the allowed leading characters from the pcode. These REPLACEs will remove all occurrences. So this only works if there is no overlap between these leading characters ("123ABC") and the characters you are searching ("456").

Upvotes: 0

41686d6564
41686d6564

Reputation: 19641

If you really have to do this with pure T-SQL, you may write something like this:

SELECT
  id,
  pcode,
  CASE
    WHEN pcode LIKE '[456]%' THEN CAST(SUBSTRING(pcode, 1, 1) AS INT)
    WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('4', pcode) - 1) + '4%' THEN 4
    WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('5', pcode) - 1) + '5%' THEN 5
    WHEN pcode LIKE REPLICATE('[123ABC]', CHARINDEX('6', pcode) - 1) + '6%' THEN 6
    ELSE 0
  END AS desired_val
FROM inventory;

The trick is to use CHARINDEX() to get the position of the first occurrence of a [456] char, and then use REPLICATE() to repeat the [123ABC] pattern a number of times equal to the number of characters that precede the desired character. So, in the first example, where the '5' char is found at the fifth position, the constructed pattern will be:

[123ABC][123ABC][123ABC][123ABC]5%

Upvotes: 2

Related Questions