ZSH
ZSH

Reputation: 652

SQL - Remove select non-alphanumeric characters from a string

I'm trying to sanitize strings in my company's databases (from the early days before we had better data validation). However, the function isn't working. I figured out that it's only assigning '[' to the @Pattern string, but I'm not sure why.

Code to re-create the function (you'll need to add your DB prefix before the function name):

/* Remove select non-alphanumeric characters from a string */

CREATE FUNCTION [CleanString] (@Str VARCHAR(max))

RETURNS VARCHAR(max) AS
BEGIN

DECLARE @BadIndex SMALLINT, -- Index of bad character
        @Pattern CHAR(132)  -- Characters to look for (min length is # chars + 4 for %[]% enclosing)

SELECT @Pattern = 
'%['
    /* +CHAR(00) */ +CHAR(01) +CHAR(02) +CHAR(03) +CHAR(04) +CHAR(05) +CHAR(06) +CHAR(07) +CHAR(08) +CHAR(09) 
    +CHAR(10) +CHAR(11) +CHAR(12) +CHAR(13) +CHAR(14) +CHAR(15) +CHAR(16) +CHAR(17) +CHAR(18) +CHAR(19) 
    +CHAR(20) +CHAR(21) +CHAR(22) +CHAR(23) +CHAR(24) +CHAR(25) +CHAR(26) +CHAR(27) +CHAR(28) +CHAR(29) 
    +CHAR(30) +CHAR(31) /* +CHAR(32) */ +CHAR(33) +CHAR(34) +CHAR(35) +CHAR(36) /* +CHAR(37) */ +CHAR(38) +CHAR(39) 
    +CHAR(40) +CHAR(41) +CHAR(42) +CHAR(43) +CHAR(44) +CHAR(45) /* +CHAR(46) */ +CHAR(47) /* +CHAR(48) +CHAR(49) 
    +CHAR(50) +CHAR(51) +CHAR(52) +CHAR(53) +CHAR(54) +CHAR(55) +CHAR(56) +CHAR(57) */ +CHAR(58) +CHAR(59) 
    +CHAR(60) +CHAR(61) +CHAR(62) +CHAR(63) /* +CHAR(64) +CHAR(65) +CHAR(66) +CHAR(67) +CHAR(68) +CHAR(69) 
    +CHAR(70) +CHAR(71) +CHAR(72) +CHAR(73) +CHAR(74) +CHAR(75) +CHAR(76) +CHAR(77) +CHAR(78) +CHAR(79) 
    +CHAR(80) +CHAR(81) +CHAR(82) +CHAR(83) +CHAR(84) +CHAR(85) +CHAR(86) +CHAR(87) +CHAR(88) +CHAR(89) 
    +CHAR(90)  +CHAR(91) */ +CHAR(92) /* +CHAR(93) */ +CHAR(94) /* +CHAR(95) */ +CHAR(96) /* +CHAR(97) +CHAR(98) +CHAR(99) 
    +CHAR(100) +CHAR(101) +CHAR(102) +CHAR(103) +CHAR(104) +CHAR(105) +CHAR(106) +CHAR(107) +CHAR(108) +CHAR(109) 
    +CHAR(110) +CHAR(111) +CHAR(112) +CHAR(113) +CHAR(114) +CHAR(115) +CHAR(116) +CHAR(117) +CHAR(118) +CHAR(119) 
    +CHAR(120) +CHAR(121) +CHAR(122) */ +CHAR(123) +CHAR(124) +CHAR(125) +CHAR(126) +CHAR(127) 
+']%',

@BadIndex = PATINDEX(@Pattern, @Str) -- Find first bad character index

WHILE @BadIndex > 0
    BEGIN
        SELECT 
             @Str = STUFF(@Str, @BadIndex, 1, ' ') -- Swap bad char for a space
            ,@BadIndex = PATINDEX(@Pattern, @Str) -- Find next bad character index
    END

RETURN @Str
END

Test (you'll need to add your DB prefix before the function name):

select CleanString('woqp ienv8 90()*&@ #$! skllj_asdf') as test

Result:

'woqp ienv8 90()*&@ #$! skllj_asdf'

Thanks in advance for the help :)

Upvotes: 0

Views: 1401

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

There are two things I notice. First, @patindex should have wildcards. Second, you probably don't want to return inside the loop.

You can try logic like this:

SET @Pattern = '%' + @Pattern + '%';
SET @BadIndex = PATINDEX(@Pattern, @Str); -- Find first bad character index

WHILE @BadIndex > 0
    SELECT @Str = STUFF(@Str, @BadIndex, 1, ' '), -- Swap bad char for a space
           @BadIndex = PATINDEX(@Pattern, @Str); -- Find next bad character index
END;

RETURN @Str;

EDIT:

The problem is your declare statement. You have declared the pattern to be char() instead of varchar(). This means that the pattern is padded with spaces on the right, and the pattern never matches.

Just change this:

DECLARE @BadIndex SMALLINT, -- Index of bad character
        @Pattern CHAR(132)  -- Characters to look for (min length is # chars + 4 for %[]% enclosing)

To:

DECLARE @BadIndex SMALLINT, -- Index of bad character
        @Pattern VARCHAR(8000)  -- Characters to look for (min length is # chars + 4 for %[]% enclosing)

This time I tested it and it works.

Upvotes: 2

Related Questions