Reputation: 55
I am looking for a function that selects English numbers and letters only:
Example:
TEKA תנור ביל דין in HLB-840 P-WH לבן
I want to run a function and get the following result:
TEKA HLB-840 P-WH
I'm using MS SQL Server 2012
Upvotes: 2
Views: 154
Reputation: 55
create function dbo.AlphaNumericOnly(@string varchar(max))
returns varchar(max)
begin
While PatIndex('%[^a-z0-9]%', @string) > 0
Set @string = Stuff(@string, PatIndex('%[^a-z0-9]%', @string), 1, '')
return @string
end
Upvotes: 1
Reputation: 5782
there is no build in function for such purpose, but you can create your own function, should be something like this:
--create function (split string, and concatenate required)
CREATE FUNCTION dbo.CleanStringZZZ ( @string VARCHAR(100))
RETURNS VARCHAR(100)
BEGIN
DECLARE @B VARCHAR(100) = '';
WITH t --recursive part to create sequence 1,2,3... but will better to use existing table with index
AS
(
SELECT n = 1
UNION ALL
SELECT n = n+1 --
FROM t
WHERE n <= LEN(@string)
)
SELECT @B = @B+SUBSTRING(@string, t.n, 1)
FROM t
WHERE SUBSTRING(@string, t.n, 1) != '?' --this is just an example...
--WHERE ASCII(SUBSTRING(@string, t.n, 1)) BETWEEN 32 AND 127 --you can use something like this
ORDER BY t.n;
RETURN @B;
END;
and then you can use this function in your select statement:
SELECT dbo.CleanStringZZZ('TEKA תנור ביל דין in HLB-840 P-WH לבן');
Upvotes: 1
Reputation: 67291
SQL-Server is not the right tool for this.
The following might work for you, but there is no guarantee:
declare @yourString NVARCHAR(MAX)=N'TEKA תנור ביל דין in HLB-840 P-WH לבן';
SELECT REPLACE(REPLACE(REPLACE(REPLACE(CAST(@yourString AS VARCHAR(MAX)),'?',''),' ','|~'),'~|',''),'|~',' ');
The idea in short:
A cast of NVARCHAR
to VARCHAR
will return all characters in your string, which are not known in the given collation, as question marks. The rest is replacements of question marks and multi-blanks.
If your string can include a questionmark, you can replace it first to a non-used character, which you re-replace at the end.
If you string might include either |
or ~
you should use other characters for the replacements of multi-blanks.
You can influence this approach by specifying a specific collation, if some characters pass by...
Upvotes: 1
Reputation: 520978
What you really need here is regex replacement, which SQL Server does not support. Broadly speaking, you would want to find [^A-Za-z0-9 -]+\s*
and then replace with empty string. Here is a demo showing that this works as expected:
This would output TEKA in HLB-840 P-WH
for the input you provided. You might be able to do this in SQL Server using a regex package or UDF. Or, you could do this replacement outside of SQL using any number of tools which support regex (e.g. C#).
Upvotes: 2