NealWalters
NealWalters

Reputation: 18227

T-SQL strip all non-alpha and non-numeric characters

Is there a smarter way to remove all special characters rather than having a series of about 15 nested replace statements?

The following works, but only handles three characters (ampersand, blank and period).

select CustomerID, CustomerName, 
   Replace(Replace(Replace(CustomerName,'&',''),' ',''),'.','') as CustomerNameStripped
from Customer 

Upvotes: 10

Views: 44362

Answers (6)

Markusbond
Markusbond

Reputation: 1

SQL Server 2017 and higher offers the translate function. It requires three arguments TRANSLATE( inputString, characters, translations )

There is a one-to-one relationship between the character list and the translation list. There are 30 possible characters to replace, so I replicated 30 spaces (instead of typing them). But since I don't want my return variable to contain spaces I wrapped the whole translate function in a replace function to remove them.

I should also point out that I declared @var to be varchar(8000) which is the largest possible value before moving to a varchar(max), which you should avoid unless you really need it. I would recommend you consider carefully the maximum size of the data you would be passing into this function and size the @var appropriately. I believe you could also use the nvarchar datatype.

Once you create the function, the way you would call it might look like the following two statements.

select dbo.AlphaNumericOnly('*&!Blah') as [no symbols] The result would be just "Blah".

select col1, dbo.AlphaNumericOnly(col1) as [col1 no symbols] from someTable as t The result would by a two column result set. The first column would be col1 unaltered, and the second column would be col1 without the special characters.

I hope this helps.

create function AlphaNumericOnly
(
    @var varchar(8000)
)
returns varchar(8000)
begin

    return replace(translate(@var, '`~!@#$%^&*()_+-={}|[]\:;<,>.?/', replicate(' ', 30)), ' ', '')
end

Upvotes: 0

Alex K.
Alex K.

Reputation: 175976

One flexible-ish way;

CREATE FUNCTION [dbo].[fnRemovePatternFromString](@BUFFER VARCHAR(MAX), @PATTERN VARCHAR(128)) RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @POS INT = PATINDEX(@PATTERN, @BUFFER)
    WHILE @POS > 0 BEGIN
        SET @BUFFER = STUFF(@BUFFER, @POS, 1, '')
        SET @POS = PATINDEX(@PATTERN, @BUFFER)
    END
    RETURN @BUFFER
END

select dbo.fnRemovePatternFromString('cake & beer $3.99!?c', '%[$&.!?]%')

(No column name)
cake  beer 399c

Upvotes: 19

Dylan Westphal
Dylan Westphal

Reputation: 31

I know this is an old thread, but still, might be handy for others. Here's a quick and dirty (Which I've done inversely - stripping out non-numerics) - using a recursive CTE. What makes this one nice for me is that it's an inline function - so gets around the nasty RBAR effect of the usual scalar and table-valued functions. Adjust your filter as needs be to include or exclude whatever char types.

        Create Function fncV1_iStripAlphasFromData (
            @iString Varchar(max)
        )
        Returns 
        Table With Schemabinding
        As

            Return(

                with RawData as
                (
                    Select @iString as iString
                )
                ,
                Anchor as
                (

                    Select Case(IsNumeric (substring(iString, 1, 1))) when 1 then substring(iString, 1, 1) else '' End as oString, 2 as CharPos from RawData
                    UNION ALL
                    Select a.oString + Case(IsNumeric (substring(@iString, a.CharPos, 1))) when 1 then substring(@iString, a.CharPos, 1) else '' End, a.CharPos + 1
                    from RawData r
                    Inner Join Anchor a on a.CharPos <= len(rtrim(ltrim(@iString)))

                )

                Select top 1 oString from Anchor order by CharPos Desc

            )

Go

select * from dbo.fncV1_iStripAlphasFromData ('00000')
select * from dbo.fncV1_iStripAlphasFromData ('00A00')
select * from dbo.fncV1_iStripAlphasFromData ('12345ABC6789!&*0')

Upvotes: 2

Aaron Bertrand
Aaron Bertrand

Reputation: 280625

Create a function:

CREATE FUNCTION dbo.StripNonAlphaNumerics
(
  @s VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
  DECLARE @p INT = 1, @n VARCHAR(255) = '';
  WHILE @p <= LEN(@s)
  BEGIN
    IF SUBSTRING(@s, @p, 1) LIKE '[A-Za-z0-9]'
    BEGIN
      SET @n += SUBSTRING(@s, @p, 1);
    END 
    SET @p += 1;
  END
  RETURN(@n);
END
GO

Then:

SELECT Result = dbo.StripNonAlphaNumerics
('My Customer''s dog & #1 friend are dope, yo!');

Results:

Result
------
MyCustomersdog1friendaredopeyo

To make it more flexible, you could pass in the pattern you want to allow:

CREATE FUNCTION dbo.StripNonAlphaNumerics
(
  @s VARCHAR(255),
  @pattern VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
  DECLARE @p INT = 1, @n VARCHAR(255) = '';
  WHILE @p <= LEN(@s)
  BEGIN
    IF SUBSTRING(@s, @p, 1) LIKE @pattern
    BEGIN
      SET @n += SUBSTRING(@s, @p, 1);
    END 
    SET @p += 1;
  END
  RETURN(@n);
END
GO

Then:

SELECT r = dbo.StripNonAlphaNumerics
('Bob''s dog & #1 friend are dope, yo!', '[A-Za-z0-9]');

Results:

r
------
Bobsdog1friendaredopeyo

Upvotes: 10

datagod
datagod

Reputation: 1051

I faced this problem several years ago, so I wrote a SQL function to do the trick. Here is the original article (was used to scrape text out of HTML). I have since updated the function, as follows:

IF (object_id('dbo.fn_CleanString') IS NOT NULL)
BEGIN
  PRINT 'Dropping: dbo.fn_CleanString'
  DROP function dbo.fn_CleanString
END
GO
PRINT 'Creating: dbo.fn_CleanString'
GO
CREATE FUNCTION dbo.fn_CleanString 
(
  @string varchar(8000)
) 
returns varchar(8000)
AS
BEGIN
---------------------------------------------------------------------------------------------------
-- Title:        CleanString
-- Date Created: March 26, 2011
-- Author:       William McEvoy
--               
-- Description:  This function removes special ascii characters from a string.
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


declare @char        char(1),
        @len         int,
        @count       int,
        @newstring   varchar(8000),
        @replacement char(1)

select  @count       = 1,
        @len         = 0,
        @newstring   = '',
        @replacement = ' '



---------------------------------------------------------------------------------------------------
-- M A I N   P R O C E S S I N G
---------------------------------------------------------------------------------------------------


-- Remove Backspace characters
select @string = replace(@string,char(8),@replacement)

-- Remove Tabs
select @string = replace(@string,char(9),@replacement)

-- Remove line feed
select @string = replace(@string,char(10),@replacement)

-- Remove carriage return
select @string = replace(@string,char(13),@replacement)


-- Condense multiple spaces into a single space
-- This works by changing all double spaces to be OX where O = a space, and X = a special character
-- then all occurrences of XO are changed to O,
-- then all occurrences of X  are changed to nothing, leaving just the O which is actually a single space
select @string = replace(replace(replace(ltrim(rtrim(@string)),'  ', ' ' + char(7)),char(7)+' ',''),char(7),'')


--  Parse each character, remove non alpha-numeric

select @len = len(@string)

WHILE (@count <= @len)
BEGIN

  -- Examine the character
  select @char = substring(@string,@count,1)


  IF (@char like '[a-z]') or (@char like '[A-Z]') or (@char like '[0-9]')
    select @newstring = @newstring + @char
  ELSE
    select @newstring = @newstring + @replacement

  select @count = @count + 1

END


return @newstring
END

GO
IF (object_id('dbo.fn_CleanString') IS NOT NULL)
  PRINT 'Function created.'
ELSE
  PRINT 'Function NOT created.'
GO

Upvotes: 6

Oded
Oded

Reputation: 499382

If you can use SQL CLR you can use .NET regular expressions for this.

There is a third party (free) package that includes this and more - SQL Sharp .

Upvotes: 1

Related Questions