Reputation: 383
It's been a while...I'd like to get your advice on the most efficient way to join on only the number part of a field that may be prefixed and/or suffixed with up to 2 letters. Here's a simplified snippet of what I'm trying to do:
SELECT a, b, c
FROM table 1 t1
LEFT JOIN table 2 t2 ON t1.PolicyCode = t2.sPolicyID,
Where t2.sPolicyID
could begin and/or end with up to 2 letters. Some examples: TG73100
, S7286674
, 2344506R
, etc. We only want to join to just its numeric part in between the letters, i.e. 73100
, 7286674
or 2344506
from the examples.
Could someone please advise on a simple way of doing this?
Upvotes: 1
Views: 1144
Reputation: 10277
Here is one way:
LEFT JOIN table 2 t2 ON t1.PolicyCode =
LEFT(SUBSTRING(t2.sPolicyID, PATINDEX('%[0-9]%', t2.sPolicyID), 50),
PATINDEX('%[^0-9]%',
SUBSTRING(t2.sPolicyID, PATINDEX('%[0-9]%', t2.sPolicyID), 50)
+ 'a') -1)
To break this down, there are 4 main parts.
1: Find the position of the first number with PATINDEX
:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[0-9]%', @spolicyID)
--Returns 3
2: Use SUBSTRING()
to cut off everything before the first letter:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50)
--Returns 123123xx
If we hardcoded the 3
that we know is returned from the first part, it would look like this:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT SUBSTRING(@spolicyID, 3), 50)
--50 is the number of characters to extract, set to something
--higher than the max string length to be safe
Of course, we don't want to hardcode it since it can change, but that makes seeing the different functions a bit easier.
3: Find the position of the next letter using PATINDEX
again:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[^0-9]%', SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50) + 'a')
--Returns 7 since it is looking at 123123xx
--The first x is in the 7th position
Note that we added an a
onto the string. This is because if we had a string with no letters at the end, it would throw an error as the length 0 would be returned to SUBSTRING
. You could add any letter or letters to the end and it would work, we are just making sure there is at least one. Try removing the + 'a'
and using a string like xx123123
to see the error.
If we hardcoded the 123123xx
from step 2 it would look like this (again just for easy example):
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
SELECT PATINDEX('%[^0-9]%', '123123xx' + 'a')
4: Use LEFT()
to return everything before the trailing letters, leaving us with only the numbers in between:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
LEFT(SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50),PATINDEX('%[^0-9]%', SUBSTRING(@spolicyID, PATINDEX('%[0-9]%', @spolicyID), 50) + 'a') -1)
--Need to add `-1` because step 3 PATINDEX returns 7
--as the position of first trailing letter, and
--we want the 6 characters before that
And again hardcoded from step 2 and 3 for easy viewing:
DECLARE @spolicyID VARCHAR(20) = 'xx123123xx'
LEFT('123123xx', 7-1)
Upvotes: 3