George Johnston
George Johnston

Reputation: 32258

TSQL Replace all non a-z/A-Z characters with an empty string

I would like to take a field and replace all characters that are not between a-z and A-Z with "".

Is this possible, and if so, how?

Upvotes: 8

Views: 24847

Answers (3)

Stephan
Stephan

Reputation: 6018

With the introduction of the TRANSLATE() function in SQL Server 2017, this is now easy to do without any slow looping UDF's or complex CLR's

Fast Native SQL Server 2017+ Solution

SELECT *
FROM (VALUES ('Hello th!s is a te$t')
            ,('An$#oth!:er test 1234567890')
    ) AS A(String)
/*First find all non a-z characters by removing all a-z characters*/
CROSS APPLY (SELECT nonAZChars = REPLACE(TRANSLATE(String,'abcdefghijklmnopqrstuvwxyz',REPLICATE(' ',26)),' ','')) AS B
/*Translate non a-z characters to random character "|" and then remove all "|"*/
CROSS APPLY (SELECT CleanString = REPLACE(TRANSLATE(String,nonAZChars,REPLICATE('|',LEN(nonAZChars))),'|','')) AS C

Results

String nonAZChars CleanString
Hello th!s is a te$t !$ Hello ths is a tet
An$#oth!:er test 1234567890 $#!:1234567890 Another test

Upvotes: 0

dcp
dcp

Reputation: 55459

You could create a CLR stored procedure to do the regular expression replacement. Here's an article on that topic: http://weblogs.sqlteam.com/jeffs/archive/2007/04/27/SQL-2005-Regular-Expression-Replace.aspx

Then you could do something like this:

UPDATE your_table
SET col1 = dbo.RegExReplace(col1, '[^A-Za-z]','');

EDIT: Since CLR isn't an option, check out this link, there is a dbo.RegexReplace function there which is written in t-sql, not CLR. You could use that function in the following manner:

First, you need to run this to enable Ole:

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'Ole Automation Procedures', 1;
GO
RECONFIGURE;
GO

Then create the dbo.RegexReplace function given at the link I provided.

Then you can do this:

create table your_table(col1 varchar(500))
go

insert into your_table values ('aBCCa1234!!fAkk9943');

update your_table set col1 = dbo.RegexReplace('[^A-Za-z]','',col1,1,1);

select * from your_table

Result:
aBCCafAkk

Upvotes: 8

Goyuix
Goyuix

Reputation: 24360

You could try creating a UDF (user defined function) and then use it in your queries:

Then do a query similar to:

SELECT * FROM myTable WHERE find_regular_expression(myCol, '[^a-zA-Z]')

It also appears the there may be more native support in later versions of SQL Server, certainly 2008 R2, through the mdq.RegexMatches function (Part of Master Data Services).

http://msdn.microsoft.com/en-us/library/ee633829(SQL.105).aspx

Upvotes: 0

Related Questions