mpen
mpen

Reputation: 283345

How to remove accents in MySQL?

I've just compiled a database of 1 million place names. I'm going to use it in an auto-complete widget to look up cities. A lot of these places have accents... I want to be able to find records when a user types the name without an accent.

In order to do this, I've got a 2nd column with an unaccented copy of the name. Many of these records are still blank, so I want to write a query to fill them in. Is this possible in straight MySQL? If so, how?

Upvotes: 31

Views: 73790

Answers (10)

Tyler
Tyler

Reputation: 510

Here is an accent removal function based on WordPress's remove_accents().

The only differences are:

  • Changing "ß" to "ss".
  • Removing the more complicated replacement logic. Only the $chars array was ported over.
DELIMITER //
CREATE FUNCTION `remove_accents`(`str` TEXT)
    RETURNS TEXT
    DETERMINISTIC
BEGIN
    SET str = REPLACE(str, 'ª', 'a');
    SET str = REPLACE(str, 'º', 'o');
    SET str = REPLACE(str, 'À', 'A');
    SET str = REPLACE(str, 'Á', 'A');
    SET str = REPLACE(str, 'Â', 'A');
    SET str = REPLACE(str, 'Ã', 'A');
    SET str = REPLACE(str, 'Ä', 'A');
    SET str = REPLACE(str, 'Å', 'A');
    SET str = REPLACE(str, 'Æ', 'AE');
    SET str = REPLACE(str, 'Ç', 'C');
    SET str = REPLACE(str, 'È', 'E');
    SET str = REPLACE(str, 'É', 'E');
    SET str = REPLACE(str, 'Ê', 'E');
    SET str = REPLACE(str, 'Ë', 'E');
    SET str = REPLACE(str, 'Ì', 'I');
    SET str = REPLACE(str, 'Í', 'I');
    SET str = REPLACE(str, 'Î', 'I');
    SET str = REPLACE(str, 'Ï', 'I');
    SET str = REPLACE(str, 'Ð', 'D');
    SET str = REPLACE(str, 'Ñ', 'N');
    SET str = REPLACE(str, 'Ò', 'O');
    SET str = REPLACE(str, 'Ó', 'O');
    SET str = REPLACE(str, 'Ô', 'O');
    SET str = REPLACE(str, 'Õ', 'O');
    SET str = REPLACE(str, 'Ö', 'O');
    SET str = REPLACE(str, 'Ù', 'U');
    SET str = REPLACE(str, 'Ú', 'U');
    SET str = REPLACE(str, 'Û', 'U');
    SET str = REPLACE(str, 'Ü', 'U');
    SET str = REPLACE(str, 'Ý', 'Y');
    SET str = REPLACE(str, 'Þ', 'TH');
    SET str = REPLACE(str, 'ß', 'ss');
    SET str = REPLACE(str, 'à', 'a');
    SET str = REPLACE(str, 'á', 'a');
    SET str = REPLACE(str, 'â', 'a');
    SET str = REPLACE(str, 'ã', 'a');
    SET str = REPLACE(str, 'ä', 'a');
    SET str = REPLACE(str, 'å', 'a');
    SET str = REPLACE(str, 'æ', 'ae');
    SET str = REPLACE(str, 'ç', 'c');
    SET str = REPLACE(str, 'è', 'e');
    SET str = REPLACE(str, 'é', 'e');
    SET str = REPLACE(str, 'ê', 'e');
    SET str = REPLACE(str, 'ë', 'e');
    SET str = REPLACE(str, 'ì', 'i');
    SET str = REPLACE(str, 'í', 'i');
    SET str = REPLACE(str, 'î', 'i');
    SET str = REPLACE(str, 'ï', 'i');
    SET str = REPLACE(str, 'ð', 'd');
    SET str = REPLACE(str, 'ñ', 'n');
    SET str = REPLACE(str, 'ò', 'o');
    SET str = REPLACE(str, 'ó', 'o');
    SET str = REPLACE(str, 'ô', 'o');
    SET str = REPLACE(str, 'õ', 'o');
    SET str = REPLACE(str, 'ö', 'o');
    SET str = REPLACE(str, 'ø', 'o');
    SET str = REPLACE(str, 'ù', 'u');
    SET str = REPLACE(str, 'ú', 'u');
    SET str = REPLACE(str, 'û', 'u');
    SET str = REPLACE(str, 'ü', 'u');
    SET str = REPLACE(str, 'ý', 'y');
    SET str = REPLACE(str, 'þ', 'th');
    SET str = REPLACE(str, 'ÿ', 'y');
    SET str = REPLACE(str, 'Ø', 'O');
    SET str = REPLACE(str, 'Ā', 'A');
    SET str = REPLACE(str, 'ā', 'a');
    SET str = REPLACE(str, 'Ă', 'A');
    SET str = REPLACE(str, 'ă', 'a');
    SET str = REPLACE(str, 'Ą', 'A');
    SET str = REPLACE(str, 'ą', 'a');
    SET str = REPLACE(str, 'Ć', 'C');
    SET str = REPLACE(str, 'ć', 'c');
    SET str = REPLACE(str, 'Ĉ', 'C');
    SET str = REPLACE(str, 'ĉ', 'c');
    SET str = REPLACE(str, 'Ċ', 'C');
    SET str = REPLACE(str, 'ċ', 'c');
    SET str = REPLACE(str, 'Č', 'C');
    SET str = REPLACE(str, 'č', 'c');
    SET str = REPLACE(str, 'Ď', 'D');
    SET str = REPLACE(str, 'ď', 'd');
    SET str = REPLACE(str, 'Đ', 'D');
    SET str = REPLACE(str, 'đ', 'd');
    SET str = REPLACE(str, 'Ē', 'E');
    SET str = REPLACE(str, 'ē', 'e');
    SET str = REPLACE(str, 'Ĕ', 'E');
    SET str = REPLACE(str, 'ĕ', 'e');
    SET str = REPLACE(str, 'Ė', 'E');
    SET str = REPLACE(str, 'ė', 'e');
    SET str = REPLACE(str, 'Ę', 'E');
    SET str = REPLACE(str, 'ę', 'e');
    SET str = REPLACE(str, 'Ě', 'E');
    SET str = REPLACE(str, 'ě', 'e');
    SET str = REPLACE(str, 'Ĝ', 'G');
    SET str = REPLACE(str, 'ĝ', 'g');
    SET str = REPLACE(str, 'Ğ', 'G');
    SET str = REPLACE(str, 'ğ', 'g');
    SET str = REPLACE(str, 'Ġ', 'G');
    SET str = REPLACE(str, 'ġ', 'g');
    SET str = REPLACE(str, 'Ģ', 'G');
    SET str = REPLACE(str, 'ģ', 'g');
    SET str = REPLACE(str, 'Ĥ', 'H');
    SET str = REPLACE(str, 'ĥ', 'h');
    SET str = REPLACE(str, 'Ħ', 'H');
    SET str = REPLACE(str, 'ħ', 'h');
    SET str = REPLACE(str, 'Ĩ', 'I');
    SET str = REPLACE(str, 'ĩ', 'i');
    SET str = REPLACE(str, 'Ī', 'I');
    SET str = REPLACE(str, 'ī', 'i');
    SET str = REPLACE(str, 'Ĭ', 'I');
    SET str = REPLACE(str, 'ĭ', 'i');
    SET str = REPLACE(str, 'Į', 'I');
    SET str = REPLACE(str, 'į', 'i');
    SET str = REPLACE(str, 'İ', 'I');
    SET str = REPLACE(str, 'ı', 'i');
    SET str = REPLACE(str, 'IJ', 'IJ');
    SET str = REPLACE(str, 'ij', 'ij');
    SET str = REPLACE(str, 'Ĵ', 'J');
    SET str = REPLACE(str, 'ĵ', 'j');
    SET str = REPLACE(str, 'Ķ', 'K');
    SET str = REPLACE(str, 'ķ', 'k');
    SET str = REPLACE(str, 'ĸ', 'k');
    SET str = REPLACE(str, 'Ĺ', 'L');
    SET str = REPLACE(str, 'ĺ', 'l');
    SET str = REPLACE(str, 'Ļ', 'L');
    SET str = REPLACE(str, 'ļ', 'l');
    SET str = REPLACE(str, 'Ľ', 'L');
    SET str = REPLACE(str, 'ľ', 'l');
    SET str = REPLACE(str, 'Ŀ', 'L');
    SET str = REPLACE(str, 'ŀ', 'l');
    SET str = REPLACE(str, 'Ł', 'L');
    SET str = REPLACE(str, 'ł', 'l');
    SET str = REPLACE(str, 'Ń', 'N');
    SET str = REPLACE(str, 'ń', 'n');
    SET str = REPLACE(str, 'Ņ', 'N');
    SET str = REPLACE(str, 'ņ', 'n');
    SET str = REPLACE(str, 'Ň', 'N');
    SET str = REPLACE(str, 'ň', 'n');
    SET str = REPLACE(str, 'ʼn', 'n');
    SET str = REPLACE(str, 'Ŋ', 'N');
    SET str = REPLACE(str, 'ŋ', 'n');
    SET str = REPLACE(str, 'Ō', 'O');
    SET str = REPLACE(str, 'ō', 'o');
    SET str = REPLACE(str, 'Ŏ', 'O');
    SET str = REPLACE(str, 'ŏ', 'o');
    SET str = REPLACE(str, 'Ő', 'O');
    SET str = REPLACE(str, 'ő', 'o');
    SET str = REPLACE(str, 'Œ', 'OE');
    SET str = REPLACE(str, 'œ', 'oe');
    SET str = REPLACE(str, 'Ŕ', 'R');
    SET str = REPLACE(str, 'ŕ', 'r');
    SET str = REPLACE(str, 'Ŗ', 'R');
    SET str = REPLACE(str, 'ŗ', 'r');
    SET str = REPLACE(str, 'Ř', 'R');
    SET str = REPLACE(str, 'ř', 'r');
    SET str = REPLACE(str, 'Ś', 'S');
    SET str = REPLACE(str, 'ś', 's');
    SET str = REPLACE(str, 'Ŝ', 'S');
    SET str = REPLACE(str, 'ŝ', 's');
    SET str = REPLACE(str, 'Ş', 'S');
    SET str = REPLACE(str, 'ş', 's');
    SET str = REPLACE(str, 'Š', 'S');
    SET str = REPLACE(str, 'š', 's');
    SET str = REPLACE(str, 'Ţ', 'T');
    SET str = REPLACE(str, 'ţ', 't');
    SET str = REPLACE(str, 'Ť', 'T');
    SET str = REPLACE(str, 'ť', 't');
    SET str = REPLACE(str, 'Ŧ', 'T');
    SET str = REPLACE(str, 'ŧ', 't');
    SET str = REPLACE(str, 'Ũ', 'U');
    SET str = REPLACE(str, 'ũ', 'u');
    SET str = REPLACE(str, 'Ū', 'U');
    SET str = REPLACE(str, 'ū', 'u');
    SET str = REPLACE(str, 'Ŭ', 'U');
    SET str = REPLACE(str, 'ŭ', 'u');
    SET str = REPLACE(str, 'Ů', 'U');
    SET str = REPLACE(str, 'ů', 'u');
    SET str = REPLACE(str, 'Ű', 'U');
    SET str = REPLACE(str, 'ű', 'u');
    SET str = REPLACE(str, 'Ų', 'U');
    SET str = REPLACE(str, 'ų', 'u');
    SET str = REPLACE(str, 'Ŵ', 'W');
    SET str = REPLACE(str, 'ŵ', 'w');
    SET str = REPLACE(str, 'Ŷ', 'Y');
    SET str = REPLACE(str, 'ŷ', 'y');
    SET str = REPLACE(str, 'Ÿ', 'Y');
    SET str = REPLACE(str, 'Ź', 'Z');
    SET str = REPLACE(str, 'ź', 'z');
    SET str = REPLACE(str, 'Ż', 'Z');
    SET str = REPLACE(str, 'ż', 'z');
    SET str = REPLACE(str, 'Ž', 'Z');
    SET str = REPLACE(str, 'ž', 'z');
    SET str = REPLACE(str, 'ſ', 's');
    SET str = REPLACE(str, 'Ə', 'E');
    SET str = REPLACE(str, 'ǝ', 'e');
    SET str = REPLACE(str, 'Ș', 'S');
    SET str = REPLACE(str, 'ș', 's');
    SET str = REPLACE(str, 'Ț', 'T');
    SET str = REPLACE(str, 'ț', 't');
    SET str = REPLACE(str, '€', 'E');
    SET str = REPLACE(str, '£', '');
    SET str = REPLACE(str, 'Ơ', 'O');
    SET str = REPLACE(str, 'ơ', 'o');
    SET str = REPLACE(str, 'Ư', 'U');
    SET str = REPLACE(str, 'ư', 'u');
    SET str = REPLACE(str, 'Ầ', 'A');
    SET str = REPLACE(str, 'ầ', 'a');
    SET str = REPLACE(str, 'Ằ', 'A');
    SET str = REPLACE(str, 'ằ', 'a');
    SET str = REPLACE(str, 'Ề', 'E');
    SET str = REPLACE(str, 'ề', 'e');
    SET str = REPLACE(str, 'Ồ', 'O');
    SET str = REPLACE(str, 'ồ', 'o');
    SET str = REPLACE(str, 'Ờ', 'O');
    SET str = REPLACE(str, 'ờ', 'o');
    SET str = REPLACE(str, 'Ừ', 'U');
    SET str = REPLACE(str, 'ừ', 'u');
    SET str = REPLACE(str, 'Ỳ', 'Y');
    SET str = REPLACE(str, 'ỳ', 'y');
    SET str = REPLACE(str, 'Ả', 'A');
    SET str = REPLACE(str, 'ả', 'a');
    SET str = REPLACE(str, 'Ẩ', 'A');
    SET str = REPLACE(str, 'ẩ', 'a');
    SET str = REPLACE(str, 'Ẳ', 'A');
    SET str = REPLACE(str, 'ẳ', 'a');
    SET str = REPLACE(str, 'Ẻ', 'E');
    SET str = REPLACE(str, 'ẻ', 'e');
    SET str = REPLACE(str, 'Ể', 'E');
    SET str = REPLACE(str, 'ể', 'e');
    SET str = REPLACE(str, 'Ỉ', 'I');
    SET str = REPLACE(str, 'ỉ', 'i');
    SET str = REPLACE(str, 'Ỏ', 'O');
    SET str = REPLACE(str, 'ỏ', 'o');
    SET str = REPLACE(str, 'Ổ', 'O');
    SET str = REPLACE(str, 'ổ', 'o');
    SET str = REPLACE(str, 'Ở', 'O');
    SET str = REPLACE(str, 'ở', 'o');
    SET str = REPLACE(str, 'Ủ', 'U');
    SET str = REPLACE(str, 'ủ', 'u');
    SET str = REPLACE(str, 'Ử', 'U');
    SET str = REPLACE(str, 'ử', 'u');
    SET str = REPLACE(str, 'Ỷ', 'Y');
    SET str = REPLACE(str, 'ỷ', 'y');
    SET str = REPLACE(str, 'Ẫ', 'A');
    SET str = REPLACE(str, 'ẫ', 'a');
    SET str = REPLACE(str, 'Ẵ', 'A');
    SET str = REPLACE(str, 'ẵ', 'a');
    SET str = REPLACE(str, 'Ẽ', 'E');
    SET str = REPLACE(str, 'ẽ', 'e');
    SET str = REPLACE(str, 'Ễ', 'E');
    SET str = REPLACE(str, 'ễ', 'e');
    SET str = REPLACE(str, 'Ỗ', 'O');
    SET str = REPLACE(str, 'ỗ', 'o');
    SET str = REPLACE(str, 'Ỡ', 'O');
    SET str = REPLACE(str, 'ỡ', 'o');
    SET str = REPLACE(str, 'Ữ', 'U');
    SET str = REPLACE(str, 'ữ', 'u');
    SET str = REPLACE(str, 'Ỹ', 'Y');
    SET str = REPLACE(str, 'ỹ', 'y');
    SET str = REPLACE(str, 'Ấ', 'A');
    SET str = REPLACE(str, 'ấ', 'a');
    SET str = REPLACE(str, 'Ắ', 'A');
    SET str = REPLACE(str, 'ắ', 'a');
    SET str = REPLACE(str, 'Ế', 'E');
    SET str = REPLACE(str, 'ế', 'e');
    SET str = REPLACE(str, 'Ố', 'O');
    SET str = REPLACE(str, 'ố', 'o');
    SET str = REPLACE(str, 'Ớ', 'O');
    SET str = REPLACE(str, 'ớ', 'o');
    SET str = REPLACE(str, 'Ứ', 'U');
    SET str = REPLACE(str, 'ứ', 'u');
    SET str = REPLACE(str, 'Ạ', 'A');
    SET str = REPLACE(str, 'ạ', 'a');
    SET str = REPLACE(str, 'Ậ', 'A');
    SET str = REPLACE(str, 'ậ', 'a');
    SET str = REPLACE(str, 'Ặ', 'A');
    SET str = REPLACE(str, 'ặ', 'a');
    SET str = REPLACE(str, 'Ẹ', 'E');
    SET str = REPLACE(str, 'ẹ', 'e');
    SET str = REPLACE(str, 'Ệ', 'E');
    SET str = REPLACE(str, 'ệ', 'e');
    SET str = REPLACE(str, 'Ị', 'I');
    SET str = REPLACE(str, 'ị', 'i');
    SET str = REPLACE(str, 'Ọ', 'O');
    SET str = REPLACE(str, 'ọ', 'o');
    SET str = REPLACE(str, 'Ộ', 'O');
    SET str = REPLACE(str, 'ộ', 'o');
    SET str = REPLACE(str, 'Ợ', 'O');
    SET str = REPLACE(str, 'ợ', 'o');
    SET str = REPLACE(str, 'Ụ', 'U');
    SET str = REPLACE(str, 'ụ', 'u');
    SET str = REPLACE(str, 'Ự', 'U');
    SET str = REPLACE(str, 'ự', 'u');
    SET str = REPLACE(str, 'Ỵ', 'Y');
    SET str = REPLACE(str, 'ỵ', 'y');
    SET str = REPLACE(str, 'ɑ', 'a');
    SET str = REPLACE(str, 'Ǖ', 'U');
    SET str = REPLACE(str, 'ǖ', 'u');
    SET str = REPLACE(str, 'Ǘ', 'U');
    SET str = REPLACE(str, 'ǘ', 'u');
    SET str = REPLACE(str, 'Ǎ', 'A');
    SET str = REPLACE(str, 'ǎ', 'a');
    SET str = REPLACE(str, 'Ǐ', 'I');
    SET str = REPLACE(str, 'ǐ', 'i');
    SET str = REPLACE(str, 'Ǒ', 'O');
    SET str = REPLACE(str, 'ǒ', 'o');
    SET str = REPLACE(str, 'Ǔ', 'U');
    SET str = REPLACE(str, 'ǔ', 'u');
    SET str = REPLACE(str, 'Ǚ', 'U');
    SET str = REPLACE(str, 'ǚ', 'u');
    SET str = REPLACE(str, 'Ǜ', 'U');
    SET str = REPLACE(str, 'ǜ', 'u');

    RETURN str;
END
//
DELIMITER ;

Upvotes: 0

Online User
Online User

Reputation: 17818

This is the fastest and most efficient way I found:

UPDATE movies SET `name` = REPLACE(`name`,'Š','S');
UPDATE movies SET `name` = REPLACE(`name`,'š','s');
UPDATE movies SET `name` = REPLACE(`name`,'Ð','Dj');
UPDATE movies SET `name` = REPLACE(`name`,'Ž','Z');
UPDATE movies SET `name` = REPLACE(`name`,'ž','z');
UPDATE movies SET `name` = REPLACE(`name`,'À','A');
UPDATE movies SET `name` = REPLACE(`name`,'Á','A');
UPDATE movies SET `name` = REPLACE(`name`,'Â','A');
UPDATE movies SET `name` = REPLACE(`name`,'Ã','A');
UPDATE movies SET `name` = REPLACE(`name`,'Ä','A');
UPDATE movies SET `name` = REPLACE(`name`,'Å','A');
UPDATE movies SET `name` = REPLACE(`name`,'Æ','A');
UPDATE movies SET `name` = REPLACE(`name`,'Ç','C');
UPDATE movies SET `name` = REPLACE(`name`,'È','E');
UPDATE movies SET `name` = REPLACE(`name`,'É','E');
UPDATE movies SET `name` = REPLACE(`name`,'Ê','E');
UPDATE movies SET `name` = REPLACE(`name`,'Ë','E');
UPDATE movies SET `name` = REPLACE(`name`,'Ì','I');
UPDATE movies SET `name` = REPLACE(`name`,'Í','I');
UPDATE movies SET `name` = REPLACE(`name`,'Î','I');
UPDATE movies SET `name` = REPLACE(`name`,'Ï','I');
UPDATE movies SET `name` = REPLACE(`name`,'Ñ','N');
UPDATE movies SET `name` = REPLACE(`name`,'Ò','O');
UPDATE movies SET `name` = REPLACE(`name`,'Ó','O');
UPDATE movies SET `name` = REPLACE(`name`,'Ô','O');
UPDATE movies SET `name` = REPLACE(`name`,'Õ','O');
UPDATE movies SET `name` = REPLACE(`name`,'Ö','O');
UPDATE movies SET `name` = REPLACE(`name`,'Ø','O');
UPDATE movies SET `name` = REPLACE(`name`,'Ù','U');
UPDATE movies SET `name` = REPLACE(`name`,'Ú','U');
UPDATE movies SET `name` = REPLACE(`name`,'Û','U');
UPDATE movies SET `name` = REPLACE(`name`,'Ü','U');
UPDATE movies SET `name` = REPLACE(`name`,'Ý','Y');
UPDATE movies SET `name` = REPLACE(`name`,'Þ','B');
UPDATE movies SET `name` = REPLACE(`name`,'ß','Ss');
UPDATE movies SET `name` = REPLACE(`name`,'à','a');
UPDATE movies SET `name` = REPLACE(`name`,'á','a');
UPDATE movies SET `name` = REPLACE(`name`,'â','a');
UPDATE movies SET `name` = REPLACE(`name`,'ã','a');
UPDATE movies SET `name` = REPLACE(`name`,'ä','a');
UPDATE movies SET `name` = REPLACE(`name`,'å','a');
UPDATE movies SET `name` = REPLACE(`name`,'æ','a');
UPDATE movies SET `name` = REPLACE(`name`,'ç','c');
UPDATE movies SET `name` = REPLACE(`name`,'è','e');
UPDATE movies SET `name` = REPLACE(`name`,'é','e');
UPDATE movies SET `name` = REPLACE(`name`,'ê','e');
UPDATE movies SET `name` = REPLACE(`name`,'ë','e');
UPDATE movies SET `name` = REPLACE(`name`,'ì','i');
UPDATE movies SET `name` = REPLACE(`name`,'í','i');
UPDATE movies SET `name` = REPLACE(`name`,'î','i');
UPDATE movies SET `name` = REPLACE(`name`,'ï','i');
UPDATE movies SET `name` = REPLACE(`name`,'ð','o');
UPDATE movies SET `name` = REPLACE(`name`,'ñ','n');
UPDATE movies SET `name` = REPLACE(`name`,'ò','o');
UPDATE movies SET `name` = REPLACE(`name`,'ó','o');
UPDATE movies SET `name` = REPLACE(`name`,'ô','o');
UPDATE movies SET `name` = REPLACE(`name`,'õ','o');
UPDATE movies SET `name` = REPLACE(`name`,'ö','o');
UPDATE movies SET `name` = REPLACE(`name`,'ø','o');
UPDATE movies SET `name` = REPLACE(`name`,'ù','u');
UPDATE movies SET `name` = REPLACE(`name`,'ú','u');
UPDATE movies SET `name` = REPLACE(`name`,'û','u');
UPDATE movies SET `name` = REPLACE(`name`,'ý','y');
UPDATE movies SET `name` = REPLACE(`name`,'ý','y');
UPDATE movies SET `name` = REPLACE(`name`,'þ','b');
UPDATE movies SET `name` = REPLACE(`name`,'ÿ','y');
UPDATE movies SET `name` = REPLACE(`name`,'ƒ','f');

Upvotes: -1

Abdel
Abdel

Reputation: 704

I share this maybe can help....:

DELIMITER //
CREATE OR REPLACE FUNCTION `remove_accents`(`str` TEXT)
    RETURNS text
    LANGUAGE SQL
    DETERMINISTIC
    NO SQL
    SQL SECURITY INVOKER
    COMMENT ''

BEGIN

    SET str = REPLACE(str,'Š','S');
    SET str = REPLACE(str,'š','s');
    SET str = REPLACE(str,'Ð','Dj');
    SET str = REPLACE(str,'Ž','Z');
    SET str = REPLACE(str,'ž','z');
    SET str = REPLACE(str,'À','A');
    SET str = REPLACE(str,'Á','A');
    SET str = REPLACE(str,'Â','A');
    SET str = REPLACE(str,'Ã','A');
    SET str = REPLACE(str,'Ä','A');
    SET str = REPLACE(str,'Å','A');
    SET str = REPLACE(str,'Æ','A');
    SET str = REPLACE(str,'Ç','C');
    SET str = REPLACE(str,'È','E');
    SET str = REPLACE(str,'É','E');
    SET str = REPLACE(str,'Ê','E');
    SET str = REPLACE(str,'Ë','E');
    SET str = REPLACE(str,'Ì','I');
    SET str = REPLACE(str,'Í','I');
    SET str = REPLACE(str,'Î','I');
    SET str = REPLACE(str,'Ï','I');
    SET str = REPLACE(str,'Ñ','N');
    SET str = REPLACE(str,'Ò','O');
    SET str = REPLACE(str,'Ó','O');
    SET str = REPLACE(str,'Ô','O');
    SET str = REPLACE(str,'Õ','O');
    SET str = REPLACE(str,'Ö','O');
    SET str = REPLACE(str,'Ø','O');
    SET str = REPLACE(str,'Ù','U');
    SET str = REPLACE(str,'Ú','U');
    SET str = REPLACE(str,'Û','U');
    SET str = REPLACE(str,'Ü','U');
    SET str = REPLACE(str,'Ý','Y');
    SET str = REPLACE(str,'Þ','B');
    SET str = REPLACE(str,'ß','Ss');
    SET str = REPLACE(str,'à','a');
    SET str = REPLACE(str,'á','a');
    SET str = REPLACE(str,'â','a');
    SET str = REPLACE(str,'ã','a');
    SET str = REPLACE(str,'ä','a');
    SET str = REPLACE(str,'å','a');
    SET str = REPLACE(str,'æ','a');
    SET str = REPLACE(str,'ç','c');
    SET str = REPLACE(str,'è','e');
    SET str = REPLACE(str,'é','e');
    SET str = REPLACE(str,'ê','e');
    SET str = REPLACE(str,'ë','e');
    SET str = REPLACE(str,'ì','i');
    SET str = REPLACE(str,'í','i');
    SET str = REPLACE(str,'î','i');
    SET str = REPLACE(str,'ï','i');
    SET str = REPLACE(str,'ð','o');
    SET str = REPLACE(str,'ñ','n');
    SET str = REPLACE(str,'ò','o');
    SET str = REPLACE(str,'ó','o');
    SET str = REPLACE(str,'ô','o');
    SET str = REPLACE(str,'õ','o');
    SET str = REPLACE(str,'ö','o');
    SET str = REPLACE(str,'ø','o');
    SET str = REPLACE(str,'ù','u');
    SET str = REPLACE(str,'ú','u');
    SET str = REPLACE(str,'û','u');
    SET str = REPLACE(str,'ý','y');
    SET str = REPLACE(str,'ý','y');
    SET str = REPLACE(str,'þ','b');
    SET str = REPLACE(str,'ÿ','y');
    SET str = REPLACE(str,'ƒ','f');


    RETURN str;
END
//
DELIMITER ;

Upvotes: 17

jfx
jfx

Reputation: 383

The solution of user3245067 is great, but it uses session variables. This is the solution with local variables and also with DETERMINISTIC keyword for mysql 5.5. Tested with german umlaute in mysql-5.5.

drop function if exists fn_remove_accents;
delimiter $$
create function fn_remove_accents( param_textvalue varchar(20000) )
returns varchar(20000) DETERMINISTIC
begin

DECLARE var_textvalue VARCHAR(20000);
DECLARE var_withaccents VARCHAR(50);
DECLARE var_withoutaccents VARCHAR(50);
DECLARE var_count INT;
DECLARE var_special INT;

set var_textvalue = param_textvalue;

-- ACCENTS
set var_withaccents = 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ';
set var_withoutaccents = 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf';
set var_count = length(var_withaccents);

while var_count > 0 do
    set var_textvalue = replace(var_textvalue, substring(var_withaccents, var_count, 1), substring(var_withoutaccents, var_count, 1));
    set var_count = var_count - 1;
end while;

-- SPECIAL CHARS
set var_special = '!var_#$%¨&*()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+*|\\''';
set var_count = length(var_special);
while var_count > 0 do
    set var_textvalue = replace(var_textvalue, substring(var_special, var_count, 1), '');
    set var_count = var_count - 1;
end while;

return var_textvalue;

end$$

delimiter ;

Upvotes: 2

igasparetto
igasparetto

Reputation: 1146

I had the same problem, so I wrote a list of querys based on a PHP script I have to remove accents and make SEO friendly URLs:

Maybe you would like to add other special characters, such as the $ or £ symbols...

UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Š','S');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'š','s');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ð','Dj');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ž','Z');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ž','z');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'À','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Á','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Â','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ã','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ä','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Å','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Æ','A');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ç','C');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'È','E');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'É','E');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ê','E');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ë','E');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ì','I');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Í','I');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Î','I');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ï','I');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ñ','N');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ò','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ó','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ô','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Õ','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ö','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ø','O');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ù','U');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ú','U');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Û','U');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ü','U');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ý','Y');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Þ','B');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ß','Ss');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'à','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'á','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'â','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ã','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ä','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'å','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'æ','a');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ç','c');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'è','e');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'é','e');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ê','e');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ë','e');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ì','i');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'í','i');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'î','i');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ï','i');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ð','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ñ','n');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ò','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ó','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ô','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'õ','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ö','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ø','o');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ù','u');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ú','u');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'û','u');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ý','y');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ý','y');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'þ','b');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ÿ','y');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ƒ','f');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'.',' ');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,' ','-');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'--','-');

UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ě','e');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ž','z');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'š','s');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'č','c');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ř','r');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ď','d');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ť','t');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ň','n');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'ů','u');

UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ě','E');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ž','Z');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Š','S');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Č','C');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ř','R');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ď','D');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ť','T');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ň','N');
UPDATE TABLE_NAME SET COLUMN = REPLACE(COLUMN,'Ů','U');

UPDATE TABLE_NAME SET COLUMN = LOWER(COLUMN);

Upvotes: 27

Erwan
Erwan

Reputation: 2632

Here is an easy solution with a single query :

UPDATE `my_table` SET alias = lower(name),
alias = replace(alias,'Š','S'),
alias = replace(alias,'š','s'),
alias = replace(alias,'Ð','Dj'),
alias = replace(alias,'Ž','Z'),
alias = replace(alias,'ž','z'),
alias = replace(alias,'À','A'),
alias = replace(alias,'Á','A'),
alias = replace(alias,'Â','A'),
alias = replace(alias,'Ã','A'),
alias = replace(alias,'Ä','A'),
alias = replace(alias,'Å','A'),
alias = replace(alias,'Æ','A'),
alias = replace(alias,'Ç','C'),
alias = replace(alias,'È','E'),
alias = replace(alias,'É','E'),
alias = replace(alias,'Ê','E'),
alias = replace(alias,'Ë','E'),
alias = replace(alias,'Ì','I'),
alias = replace(alias,'Í','I'),
alias = replace(alias,'Î','I'),
alias = replace(alias,'Ï','I'),
alias = replace(alias,'Ñ','N'),
alias = replace(alias,'Ò','O'),
alias = replace(alias,'Ó','O'),
alias = replace(alias,'Ô','O'),
alias = replace(alias,'Õ','O'),
alias = replace(alias,'Ö','O'),
alias = replace(alias,'Ø','O'),
alias = replace(alias,'Ù','U'),
alias = replace(alias,'Ú','U'),
alias = replace(alias,'Û','U'),
alias = replace(alias,'Ü','U'),
alias = replace(alias,'Ý','Y'),  
alias = replace(alias,'š','s'),
alias = replace(alias,'Ð','Dj')
alias = replace(alias,'ž','z'),
alias = replace(alias,'Þ','B'),
alias = replace(alias,'ß','Ss'),
alias = replace(alias,'à','a'),
alias = replace(alias,'á','a'),
alias = replace(alias,'â','a'),
alias = replace(alias,'ã','a'),
alias = replace(alias,'ä','a'),
alias = replace(alias,'å','a'),
alias = replace(alias,'æ','a'),
alias = replace(alias,'ç','c'),
alias = replace(alias,'è','e'),
alias = replace(alias,'é','e'),
alias = replace(alias,'ê','e'),
alias = replace(alias,'ë','e'),
alias = replace(alias,'ì','i'),
alias = replace(alias,'í','i'),
alias = replace(alias,'î','i'),
alias = replace(alias,'ï','i'),
alias = replace(alias,'ð','o'),
alias = replace(alias,'ñ','n'),
alias = replace(alias,'ò','o'),
alias = replace(alias,'ó','o'),
alias = replace(alias,'ô','o'),
alias = replace(alias,'õ','o'),
alias = replace(alias,'ö','o'),
alias = replace(alias,'ø','o'),
alias = replace(alias,'ù','u'),
alias = replace(alias,'ú','u'),
alias = replace(alias,'û','u'),
alias = replace(alias,'ý','y'),
alias = replace(alias,'ý','y'),
alias = replace(alias,'þ','b'),
alias = replace(alias,'ÿ','y'),
alias = replace(alias,'ƒ','f'),
alias = replace(alias, 'œ', 'oe'),
alias = trim(alias);

In this example :

  • 'my_table' is the name of the table,
  • 'name' is the original field
  • 'alias' is the new field

Hope it helps !

You can also check this variant if you try to generate a slug field with mysql : Easy way of generating a slug name column from the name column?

Upvotes: 5

Rick James
Rick James

Reputation: 142540

I wonder if MariaDB's REGEXP_REPLACE(col, 'e', 'e') with utf8_unicode_ci would do all the es at once.

Upvotes: 1

user3245067
user3245067

Reputation: 195

drop function if exists fn_remove_accents;
delimiter |
create function fn_remove_accents( textvalue varchar(20000) )
returns varchar(20000)
begin

set @textvalue = textvalue;

-- ACCENTS
set @withaccents = 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ';
set @withoutaccents = 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf';
set @count = length(@withaccents);

while @count > 0 do
    set @textvalue = replace(@textvalue, substring(@withaccents, @count, 1), substring(@withoutaccents, @count, 1));
    set @count = @count - 1;
end while;

-- SPECIAL CHARS
set @special = '!@#$%¨&*()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+*|\\''';
set @count = length(@special);
while @count > 0 do
    set @textvalue = replace(@textvalue, substring(@special, @count, 1), '');
    set @count = @count - 1;
end while;

return @textvalue;

end
|

Upvotes: 14

Piskvor left the building
Piskvor left the building

Reputation: 92792

Check out the MySQL manual on CONVERT() and CAST(): http://dev.mysql.com/doc/refman/5.0/en/charset-convert.html

CONVERT() provides a way to convert data between different character sets. The syntax is:

CONVERT(expr USING transcoding_name)

Upvotes: -4

Ignacio Vazquez-Abrams
Ignacio Vazquez-Abrams

Reputation: 799520

If you set an appropriate collation for the column then the value within the field will compare equal to its unaccented equivalent naturally.

mysql> SET NAMES 'utf8' COLLATE 'utf8_unicode_ci';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT 'é' = 'e';
+------------+
| 'é' = 'e' |
+------------+
|          1 |
+------------+
1 row in set (0.05 sec)

Upvotes: 31

Related Questions