Reputation: 33
I am importing working with data imported from excel files. There is a column with a string that can contain multiple numbers. I am trying to extract the largest number in the string or a 0 if there is no string. The strings are in formats similar to: "100% post-consumer recycled paper, 50% post-consumer recycled cover, 90% post-consumer recycled wire." "Paper contains 30% post-consumer content." or sometimes a empty string or null.
Given the irregular formatting of the string I am having trouble and any help would be appreciated.
Upvotes: 2
Views: 1159
Reputation: 33
I ended up creating a function that handled it. Here is the code:
CREATE FUNCTION [dbo].[cal_GetMaxPercentFromString]
RETURNS float
AS
BEGIN
declare @Numbers Table(number float)
insert into @Numbers
Select 0
declare @temp as varchar(2000) = @string
declare @position int, @length int, @offset int
WHILE CHARINDEX('%', @temp) > 0
BEGIN
set @position = CHARINDEX('%', @temp)
set @offset = 1
set @length = -1
WHILE @position - @offset > 0 and @length < 0
BEGIN
if SUBSTRING(@temp, @position - @offset, 1) not LIKE '[0-9]'
set @length = @offset - 1
set @offset = @offset + 1
END
if @length > 0
BEGIN
insert into @Numbers
select CAST(SUBSTRING(@temp, @position - @length, @length) as float)
END
set @temp = SUBSTRING(@temp, 1, @position - 1) + SUBSTRING(@temp, @position + 1, LEN(@temp) - @position)
END
declare @return as float
select @return = MAX(number) from @Numbers
return @return
END
Upvotes: 0
Reputation: 10908
Generate the LEN(txt)
possible RIGHT()
fragments of txt
. Trim each fragment at the first non-digit character. Test if the remainder is an int
. Return the MAX()
.
SELECT
txt
,MAX(TRY_CONVERT(int,LEFT(RIGHT(txt,i),PATINDEX('%[^0-9]%',RIGHT(txt,i)+' ')-1)))
FROM MyTable
CROSS APPLY (
SELECT TOP(LEN(txt)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) i FROM master.dbo.spt_values a, master.dbo.spt_values b
) x
GROUP BY txt
Upvotes: 0
Reputation: 121
Here's a scalar function that will take a string as an input and return the largest whole number it finds (up to a maximum of 3 digits, but from your question I've assumed you're dealing with percentages. If you need more digits, repeat the IF statements ad infinitum).
Paste this into SSMS and run it to create the function. To call it, do something like:
SELECT dbo.GetLargestNumberFromString(MyStringField) as [Largest Number in String]
FROM MyMessedUpData
Function:
CREATE FUNCTION GetLargestNumberFromString
(
@s varchar(max)
)
RETURNS int
AS
BEGIN
DECLARE @LargestNumber int, @i int
SET @i = 1
SET @LargestNumber = 0
WHILE @i <= LEN(@s)
BEGIN
IF SUBSTRING(@s, @i, 3) like '[0-9][0-9][0-9]'
BEGIN
IF CAST(SUBSTRING(@s, @i,3) as int) > @LargestNumber OR @LargestNumber IS NULL
SET @LargestNumber = CAST(SUBSTRING(@s, @i,3) as int);
END
IF SUBSTRING(@s, @i, 2) like '[0-9][0-9]'
BEGIN
IF CAST(SUBSTRING(@s, @i,2) as int) > @LargestNumber OR @LargestNumber IS NULL
SET @LargestNumber = CAST(SUBSTRING(@s, @i,2) as int);
END
IF SUBSTRING(@s, @i, 1) like '[0-9]' OR @LargestNumber IS NULL
BEGIN
IF CAST(SUBSTRING(@s, @i,1) as int) > @LargestNumber
SET @LargestNumber = CAST(SUBSTRING(@s, @i,1) as int);
END
SET @i = @i + 1
CONTINUE
END
RETURN @LargestNumber
END
Upvotes: 1
Reputation: 12804
As far as determining the largest size, I think you need to look at your data set first, but the update could be as simple as:
DECLARE @COUNTER INT=1000
While EXISTS (SELECT * FROM <Table> WHERE NewColumn is NULL) AND @COUNTER>=0
BEGIN
UPDATE <Table> SET NewColumn=@COUNTER WHERE <SearchColumn> LIKE '%' + CONVERT(VARCHAR,@COUNTER) + '%' AND NewColumn is NULL
SET @COUNTER=@COUNTER-1
END
Upvotes: 1