Thor
Thor

Reputation: 10068

Specified "COLLATE Latin1_General_CS_AS", query result still "case insensitive"

I want to only selects rows from a table where the last name of the employees stands with a lowercase letter. However, when I run the query below

SELECT empid, lastname
 FROM HR.Employees
 WHERE lastname COLLATE Latin1_General_CS_AS LIKE '[a-z]%';

it returned all rows where the employees' last name starts with either lowercase letter or uppercase letter.

I have already specified that the query should be case sensitive with COLLATE Latin1_General_CS_AS, how come it is not filtering out last name that starts with capital letter?

Upvotes: 3

Views: 24336

Answers (3)

Dan Guzman
Dan Guzman

Reputation: 46425

According the SQL Server Books Online, the characters included in range searches depend on the sorting rules of the collation. Collation Latin1_General_CS_AS uses dictionary order so both upper and lower case characters of the specified range are included.

Specify a binary collation to get the behavior you want (the 26 ASCII characters in the 97-122 code point range):

SELECT empid, lastname
FROM HR.Employees
WHERE lastname COLLATE Latin1_General_BIN LIKE '[a-z]%';

Note that the explicit collation results in a non-sargable expression so a full table/index scan will be required even if an index on lastname exists. Although it probably won't help here since most (if not all) last names start with a letter, one can generally add a sargable expression for the case-insensitive superset to facilitate an index seek. If the index is used, rows matching the seek predicate will be additionally filtered by the case-sensitive predicate.

SELECT empid, lastname
FROM HR.Employees
WHERE lastname COLLATE Latin1_General_BIN LIKE '[a-z]%'
    AND lastname LIKE '[a-z]%';

Upvotes: 7

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67341

Another approach might be to look at the ASCII-Code of the first character:

DECLARE @tbl TABLE(Name VARCHAR(100));
INSERT INTO @tbl VALUES('Adam'),('adam')
                      ,('John'),('john')
                      ,('Jane'),('jane')
                      ,('Zaccharias'),('zaccharias');
SELECT * 
FROM @tbl
WHERE ASCII(LEFT(Name,1)) BETWEEN ASCII('a') AND ASCII('z')

Upvotes: 1

Tony Dong
Tony Dong

Reputation: 3313

[a-c] will included a, A, b, B, c, so give each letter you want in this range like bellow will works.

select empid, lastname
from HR.Employees
where LastName LIKE '[a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]%' COLLATE Latin1_General_CS_AS;

Upvotes: 3

Related Questions