Reputation: 6302
How can I detect and delete rows with Chinese characters in MySQL?
Upvotes: 7
Views: 8082
Reputation: 541
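Here is the table "Chinese_Test", which contains Chinese characters, as shown in phpMyAdmin.
Data:
Note that the collation is utf8, so let's look at how Chinese characters are encoded in UTF-8: http://www.ansell-uebersetzungen.com/gbuni.html
The first byte of each Chinese character falls between E4 and E9 (the CJK Unified Ideographs, U+4E00 to U+9FFF, encode as E4 B8 80 through E9 BF BF), so we match on the hex dump of the column. The leading (..)* consumes whole bytes, so E[4-9] is only tested at a byte boundary: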
select number
from Chinese_Test
where HEX(contents) REGEXP '^(..)*(E[4-9])';
The query returns the number of each row whose contents include a Chinese character.
Upvotes: 13
Reputation: 449783
I don't have an answer, but to provide you with a starting point: Chinese characters occupy certain blocks in the Unicode character set (UTF-8 is just one encoding of it), for example the CJK Unified Ideographs block, U+4E00 to U+9FFF.
You would have to query for rows that contain characters between the first and the last code point of that block. I can't think of a way to automate this in a query, though (i.e. to match characters inside a certain range without naming each character explicitly).
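Outside the database, the range test described here is short to write. The following is a Python sketch under stated assumptions: the function name contains_cjk_ideograph is hypothetical, and the default bounds are the CJK Unified Ideographs block (U+4E00 to U+9FFF), which does not cover every Chinese character (extension blocks are left out).

```python
def contains_cjk_ideograph(text: str, first: int = 0x4E00, last: int = 0x9FFF) -> bool:
    """Return True if any character's code point lies in [first, last].

    Hypothetical helper; the default bounds are an assumption
    (the CJK Unified Ideographs block only).
    """
    return any(first <= ord(ch) <= last for ch in text)


if __name__ == "__main__":
    for sample in ["hello", "hello 中", "한글"]:
        print(repr(sample), contains_cjk_ideograph(sample))
```

Rows could be fetched, filtered with a check like this in application code, and then deleted by primary key.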
Another untested idea that comes to mind is using iconv() to convert the string to a specifically Chinese encoding with //IGNORE, and seeing whether any data is left. If anything is left, the string may contain Chinese characters, although this would probably be thrown off by any numbers inside the string.
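A rough Python analogue of this idea, offered as a sketch and not as a tested solution: errors="ignore" stands in for iconv's //IGNORE, GB2312 is assumed as the "specifically Chinese" encoding, and the function name may_contain_chinese is made up. ASCII is stripped first so that digits and Latin letters do not count as leftover data.

```python
def may_contain_chinese(text: str) -> bool:
    """Heuristic: does anything non-ASCII survive conversion to GB2312?

    Hypothetical helper. GB2312 is an assumed target encoding.
    """
    # Drop plain ASCII first: it survives any conversion and would
    # otherwise always count as "data left".
    non_ascii = "".join(ch for ch in text if ord(ch) > 127)
    # errors="ignore" silently drops characters GB2312 cannot represent,
    # which is the role //IGNORE plays for iconv.
    converted = non_ascii.encode("gb2312", errors="ignore")
    return len(converted) > 0


if __name__ == "__main__":
    for sample in ["abc 123", "中文 123", "한글"]:
        print(repr(sample), may_contain_chinese(sample))
```

This stays a "may contain" test: GB2312 also covers Greek, Cyrillic, Japanese kana and some accented Latin letters, so those would be reported too.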
It's an interesting problem.
Upvotes: 0
Reputation: 80657
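If all the other rows contain alphanumeric values, try the following, which deletes every row that has none of the listed ASCII characters (a row mixing Chinese and ASCII text would be kept):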
DELETE FROM tableName WHERE NOT columnToCheck REGEXP '[A-Za-z0-9.,-]';
Do check the results before deletion, using the following:
SELECT * FROM tableName WHERE NOT columnToCheck REGEXP '[A-Za-z0-9.,-]';
Upvotes: 0