Reputation: 2337
I need a function to return the longest sequence of digits in a string, for example:
P0123/99282 returns 99282,
P9-123BB-12339 returns 12339,
12345/54321 returns 12345 (should return first instance when length is the same).
I have developed this but this is very slow, I wonder if there is something faster than this:
DECLARE @str NVARCHAR(40) = N'P0120993/123-AB1239'
DECLARE @x XML
;WITH e1(n) AS(SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN (SELECT 1 as t UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1) AS b),
n(Number) AS(SELECT n = ROW_NUMBER() OVER (ORDER BY n) FROM e2)
SELECT @x = CAST(N'<A>'+ REPLACE((SELECT CAST(CAST((
SELECT
CASE WHEN SUBSTRING(@str, Number, 1) like N'[^0-9]' THEN N' ' ELSE SUBSTRING(@str, Number, 1) end
FROM n
WHERE Number <= LEN(@str) FOR XML Path(''))
AS xml) AS nvarchar(max))),N' ',N'</A><A>')+ N'</A>' AS XML)
SELECT TOP 1
case when t.value('.', 'nvarchar(max)') = N'' then null else t.value('.', 'nvarchar(max)') end AS inVal
FROM
@x.nodes('/A') AS x(t)
ORDER BY
LEN(t.value('.', 'nvarchar(max)')) DESC;
EXPLANATION:
The max length of the string I will pass is 40, and what I do is to generate a sequence of numbers from one to forty, extract the Nth character from the string where N is the sequence value but if the character is not a digit then I replace with a white space, then I return the XML as string enlcosing with <A>XXX</A>
to then convert to xml and then query that and return the first item order by it's length desc.
thanks,
Upvotes: 1
Views: 68
Reputation: 62831
While I'm not 100% sure how much better this would be with performance, here is an approach that breaks down the strings into any potential numeric combination and returns the first with the longest length:
DECLARE @foo TABLE(ID varchar(40));
INSERT @foo VALUES('P0123/99282'),('P9-123BB-12339'),('12345/54321');
;WITH NumbersTable AS
(
SELECT TOP (40) n = ROW_NUMBER() OVER (ORDER BY Number)
FROM master.dbo.spt_values
ORDER BY Number
), Results AS
(
SELECT f.Id, SUBSTRING(f.ID, t1.n, t2.n) numericvalues,
row_number() over (partition by f.Id
order by LEN(SUBSTRING(f.ID, t1.n, t2.n)) desc) rn
FROM NumbersTable t1
INNER JOIN @foo AS f
ON t1.n <= LEN(f.ID)
INNER JOIN NumbersTable t2
ON t2.n <= LEN(f.ID)
WHERE SUBSTRING(f.ID, t1.n, t2.n) NOT LIKE '%[^0-9]%'
)
SELECT *
FROM Results
WHERE rn = 1
This creates a numbers table from 1 to 40 (since that was your max length), and then using joins
creates every substring
variation of the data that have a numeric value using NOT LIKE '%[^0-9]%'
, and then establishes a row_number
based on the len
of that substring
.
Upvotes: 1