Joe
Joe

Reputation: 55

Returning a specific combination of alpha numeric characters

I have a table that contains 1000s of rows with alpha numeric data such as this:

select id from status_table

id
--
N #AN54055Z / MATTHEW
345 IN  EH86963D  JP Verified o
C N # CV08705Z  PP CONF / A
UI 45 Ty
1548 47 CN #DW40205J  VERIFIED AF
N #-CT77535S
78 78Thi OK
584 C yu N #AV10045A/DONNA MO
AH28395U  VERIF BK MAH
yu YUyu789 56

There is a unique combination of alpha numeric characters I want to return in my results:

for example: alpha alpha numeric numeric numeric numeric numeric alpha

id
--
AN54055Z
EH86963D
CV08705Z
DW40205J
CT77535S
AV10045A
AH28395U

Upvotes: 0

Views: 710

Answers (3)

Alan Burstein
Alan Burstein

Reputation: 7918

It's worth noting that, if this is data that you need to retrieve frequently you can create a computed column one your table and even index it if you want. Note my code and comments.

IF OBJECT_ID('tempdb..#t') is not null drop table #t;

CREATE TABLE #t
(
  -- the original column
  x varchar(256), 
  -- A computed column to extract that data ahead of time
  xx as        
  case 
    when PATINDEX('%[a-zA-Z][a-zA-Z][0-9][0-9][0-9][0-9][0-9][a-zA-Z]%',x) > 0
    then substring(x,PATINDEX('%[a-zA-Z][a-zA-Z][0-9][0-9][0-9][0-9][0-9][a-zA-Z]%',x),8)
  end persisted
);

-- optional index if you search on that column frequently
-- NOTE That you indexes can speed up SELECT statements but will impact inserts/updates/deletes
CREATE NONCLUSTERED INDEX nc_t_xx ON #t(xx);

--declare @t table (x varchar(256))

insert into #t (x) values
('N #AN54055Z / MATTHEW'),
('345 IN  EH86963D  JP Verified o'),
('C N # CV08705Z  PP CONF / A'),
('UI 45 Ty'),
('1548 47 CN #DW40205J  VERIFIED AF'),
('N #-CT77535S'),
('78 78Thi OK'),
('584 C yu N #AV10045A/DONNA MO'),
('AH28395U  VERIF BK MAH'),
('yu YUyu789 56')

SELECT x, xx 
FROM #t;

Upvotes: 0

BdR
BdR

Reputation: 3048

What you're looking for is called regular expressions, also know as regex, which is a sort of macro language used to search for patterns in texts.

The following is a regular expression that matches 2 alpha characters uppercase followed by 5 numerics and then one alpha character.

[A-Z]{2}[0-9]{5}[A-Z]

You can use LIKE in a WHERE statement to match patterns, for example
WHERE MyField LIKE '%[A-Z]{2}%'. On SQL Server 2008 and up you can use PATINDEX to find the index of the first match of a regular expression.

I can't run a test SQL right now, but I think something like this might work:

SELECT
    PATINDEX ( '%([A-Z]{2}[0-9]{5}[A-Z])%', Id ) as TestPatIdx,
    CASE 
        WHEN PATINDEX ( '%([A-Z]{2}[0-9]{5}[A-Z])%', Id ) > 0 THEN
            SUBSTRING( MyField, PATINDEX ( '%([A-Z]{2}[0-9]{5}[A-Z])%' , Id ), 8 )
    END as TestResult,
    Id
FROM
    status_table

Upvotes: 0

S3S
S3S

Reputation: 25112

You can check the pattern with PATINDEX and then use this return value (the location where the pattern begins) as the start location in SUBSTRING. Since the pattern is static, so is the length for SUBSTRING

declare @t table (x varchar(256))

insert into @t values
('N #AN54055Z / MATTHEW'),
('345 IN  EH86963D  JP Verified o'),
('C N # CV08705Z  PP CONF / A'),
('UI 45 Ty'),
('1548 47 CN #DW40205J  VERIFIED AF'),
('N #-CT77535S'),
('78 78Thi OK'),
('584 C yu N #AV10045A/DONNA MO'),
('AH28395U  VERIF BK MAH'),
('yu YUyu789 56')

select d.* from(
    select 
        case 
            when PATINDEX('%[a-zA-Z][a-zA-Z][0-9][0-9][0-9][0-9][0-9][a-zA-Z]%',x) > 0
            then substring(x,PATINDEX('%[a-zA-Z][a-zA-Z][0-9][0-9][0-9][0-9][0-9][a-zA-Z]%',x),8)
            else null
        end as id
    from @t) d
where d.id is not null

Upvotes: 1

Related Questions