Lee White
Lee White

Reputation: 3709

String comparison that ignores special unicode charaters

I have a table with city names in various languages. These names often contain characters such as é, ü etc. I am building a search function so that a user can search for a name, but I would like this to ignore all special characters, and treat them as letters from the English 26-letter alphabet instead.

For instance, a query for the name munchen should be matched with the value München. The query converts both sides of the comparison to case-insensitive 26-letter alphabet, and then performs the comparison.

Is there a straightforward way to achieve this?

Upvotes: 0

Views: 698

Answers (1)

Panagiotis Kanavos
Panagiotis Kanavos

Reputation: 131180

Those characters aren't special in any way. Matching and ordering is affected by a column's collation - this specifies the sorting order and which characters match.

A collation can be case-sensitive or not, which specifies whether Mary is considered equal to mary. It can also be accent sensitive or not, which specifies whether Munchen is equal to München or not.

To match Munchen with München, all that's needed is to use an Accent Insensitive collation for that column, eg :

create table Cities (
    id int identity primary key,
    Name nvarchar(200) COLLATE Latin1_General_CI_AI ,
    INDEX IX_Cities_NAME (Name)
)

insert into Cities (Name)
values ('London'), ('München'), ('Munchen')

select * 
from Cities 
where Name = N'mUnchen'

The result is :

Name
--------
München
Munchen

I didn't make Name a primary key because I wouldn't be able to insert both München and Munchen, I'd get a primary key violation

Upvotes: 3

Related Questions