LCJ
LCJ

Reputation: 22661

SQL Query For Non-Alphabet name field

I have a Name column in employee table . I want to see all employee names which have a character other than alphabet. What is the best query for this?

Another Requirement: Return all names that has not even a single alphabet.

Note: Anything other English alphabet is not considered as alphabet, in this scenario. The text 'André Müller' has some non-alphabetic charaters in my scenario.

DECLARE @Employee TABLE (EmpID INT, EmpName VARCHAR(1000))
INSERT INTO @Employee (EmpID, EmpName) VALUES(1,'André Müller')
INSERT INTO @Employee (EmpID, EmpName) VALUES(2,'Lijo')
INSERT INTO @Employee (EmpID, EmpName) VALUES(3,'88')
INSERT INTO @Employee (EmpID, EmpName) VALUES(4,'--@#')
INSERT INTO @Employee (EmpID, EmpName) VALUES(5,'é ü')

Both Aproach 1 and Approach 2 is working (thanks to the people who posted the answers). Which is the better of these and why?

--Aproach 1
SELECT * FROM @Employee WHERE EmpName COLLATE Latin1_General_CI_AS LIKE '%[^a-z]%'

--Approach 2
SELECT * FROM @Employee WHERE EmpName LIKE '%[^a-zA-Z]%'

Thanks

Lijo

Upvotes: 1

Views: 6199

Answers (6)

Mich28
Mich28

Reputation: 519

!! Happy 2021:-) Be careful with all the above solutions - the test data in the example has whitespace which will make the test fail. TEST TEST TEST TEST TEST this first - it depends on your locales and defaults; for the US, this will work to find all accented chars (in any position (on the border or in middle, only accents or mix with ABC, spaces in between, etc...). Some people missing aspects of LIKE range negation. The column in question should be defined as NVARCHAR - even doing CAST on the fly won't work, so the first step is needed. We then convert back to VARCHAR with the right collation:

SELECT SERVERPROPERTY('Collation')  DefaultCol, *, cast(nA as varchar(32)) collate SQL_Latin1_General_Cp1251_CS_AS nAX
FROM (
  SELECT EmpName ,CAST(EmpName AS NVARCHAR(200)) nA FROM  Employee  --#1 step
     )X
WHERE cast(nA as varchar(32)) collate SQL_Latin1_General_Cp1251_CS_AS  <>  nA

Upvotes: 1

gbn
gbn

Reputation: 432662

You need to use LIKE but with COLLATE (for SQL Server) to ignore accents

WHERE
   EmployeeName COLLATE Latin1_General_CI_AI LIKE '%[^a-z]%

"André Müller" is a valid name but would be "false" because neither é or ü isn't in the range a-z

Upvotes: 5

Mikael Eriksson
Mikael Eriksson

Reputation: 139010

You can use patindex and include whatever characters you have in your alphabet. Here with some Swedish characters and a space.

select *
from Emp
where patindex('%[^A-ZÅÄÖ a-zåäö]%', Name) > 0

Edit:

Thanks to comment by @gbn I now know that you can use collate to simplify the expression. CI takes care of case and AI takes care of accents.

where patindex('%[^A-Z ]%', Name COLLATE Latin1_General_CI_AI) > 0

Upvotes: 0

Matt Gibson
Matt Gibson

Reputation: 38238

It depends on what you mean by "alphabet", really. In the simple case, this will list all rows in employee where the name column contains a character other than "A-Z":

SELECT 
   * 
FROM 
   employee
WHERE 
   name LIKE '%[^A-Z]%'

You can expand this as necessary, e.g. LIKE '%[^ A-Z]% will check for characters other than A-Z and a space. To cope with other alphabets, including accented characters, etc., you can either include them or look into using collation to massage things as necessary.

For more information, see the details on pattern matching in the LIKE clause help for T-SQL.

Upvotes: 0

Jandrejc
Jandrejc

Reputation: 499

Don't know if i understood your question correctly, but i think this might be a solution:

SELECT *
FROM employee
WHERE name LIKE '%[^a-zA-Z]%'

Upvotes: 2

Prashant Lakhlani
Prashant Lakhlani

Reputation: 5806

You can use Ascii and Left functions of t-sql to do that. Just find left 1 character and use ascii values to check if they are not between range of small and capital latter ascii values.

Upvotes: 0

Related Questions