Travis Cox
Travis Cox

Reputation: 33

Extract largest number from a string in T-SQL

I am importing working with data imported from excel files. There is a column with a string that can contain multiple numbers. I am trying to extract the largest number in the string or a 0 if there is no string. The strings are in formats similar to: "100% post-consumer recycled paper, 50% post-consumer recycled cover, 90% post-consumer recycled wire." "Paper contains 30% post-consumer content." or sometimes a empty string or null.

Given the irregular formatting of the string I am having trouble and any help would be appreciated.

Upvotes: 2

Views: 1159

Answers (4)

Travis Cox
Travis Cox

Reputation: 33

I ended up creating a function that handled it. Here is the code:

CREATE FUNCTION [dbo].[cal_GetMaxPercentFromString] 

RETURNS float
AS
BEGIN
    declare @Numbers Table(number float)
    insert into @Numbers
    Select 0

    declare @temp as varchar(2000) = @string
    declare @position int, @length int, @offset int

    WHILE CHARINDEX('%', @temp) > 0
    BEGIN
        set @position = CHARINDEX('%', @temp)
        set @offset = 1
        set @length = -1
        WHILE @position - @offset > 0 and @length < 0
        BEGIN
            if SUBSTRING(@temp, @position - @offset, 1) not LIKE '[0-9]'
                set @length = @offset - 1    
            set @offset = @offset + 1
        END

        if @length > 0
        BEGIN
            insert into @Numbers
            select CAST(SUBSTRING(@temp, @position - @length, @length) as float)            
        END
        set @temp = SUBSTRING(@temp, 1, @position - 1) + SUBSTRING(@temp, @position + 1, LEN(@temp) - @position)
    END

    declare @return as float
    select @return = MAX(number) from @Numbers
    return @return
END

Upvotes: 0

Anon
Anon

Reputation: 10908

SQL Fiddle Demo

Generate the LEN(txt) possible RIGHT() fragments of txt. Trim each fragment at the first non-digit character. Test if the remainder is an int. Return the MAX().

SELECT
  txt
 ,MAX(TRY_CONVERT(int,LEFT(RIGHT(txt,i),PATINDEX('%[^0-9]%',RIGHT(txt,i)+' ')-1)))
FROM MyTable
CROSS APPLY (
  SELECT TOP(LEN(txt)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) i FROM master.dbo.spt_values a, master.dbo.spt_values b
) x
GROUP BY txt

Upvotes: 0

TehJake
TehJake

Reputation: 121

Here's a scalar function that will take a string as an input and return the largest whole number it finds (up to a maximum of 3 digits, but from your question I've assumed you're dealing with percentages. If you need more digits, repeat the IF statements ad infinitum).

Paste this into SSMS and run it to create the function. To call it, do something like:

SELECT dbo.GetLargestNumberFromString(MyStringField) as [Largest Number in String]
FROM MyMessedUpData

Function:

CREATE FUNCTION GetLargestNumberFromString
(
@s varchar(max)
) 
RETURNS int
AS
BEGIN

    DECLARE @LargestNumber int, @i int
    SET @i = 1
    SET @LargestNumber = 0

    WHILE @i <= LEN(@s)
    BEGIN

    IF SUBSTRING(@s, @i, 3) like '[0-9][0-9][0-9]'
        BEGIN
            IF  CAST(SUBSTRING(@s, @i,3) as int) > @LargestNumber OR @LargestNumber IS NULL
            SET @LargestNumber = CAST(SUBSTRING(@s, @i,3) as int);
        END

    IF SUBSTRING(@s, @i, 2) like '[0-9][0-9]'
        BEGIN
            IF  CAST(SUBSTRING(@s, @i,2) as int) > @LargestNumber OR @LargestNumber IS NULL
            SET @LargestNumber = CAST(SUBSTRING(@s, @i,2) as int);
        END

    IF SUBSTRING(@s, @i, 1) like '[0-9]' OR @LargestNumber IS NULL
        BEGIN
            IF  CAST(SUBSTRING(@s, @i,1) as int) > @LargestNumber
            SET @LargestNumber = CAST(SUBSTRING(@s, @i,1) as int);
        END

    SET @i = @i + 1


    CONTINUE
    END
    RETURN @LargestNumber
END

Upvotes: 1

UnhandledExcepSean
UnhandledExcepSean

Reputation: 12804

  1. Pull the data into SQL as-is
  2. Write a query to get a distinct list of options in that column
  3. Add a new column to store the desired value
  4. Write an update statement to populate the new column

As far as determining the largest size, I think you need to look at your data set first, but the update could be as simple as:

DECLARE @COUNTER INT=1000
While EXISTS (SELECT * FROM <Table> WHERE NewColumn is NULL) AND @COUNTER>=0
BEGIN
   UPDATE <Table> SET NewColumn=@COUNTER WHERE <SearchColumn> LIKE '%' + CONVERT(VARCHAR,@COUNTER) + '%' AND NewColumn is NULL

   SET @COUNTER=@COUNTER-1
END

Upvotes: 1

Related Questions