Reputation: 372
Hello,
I've been building a spell checker for Punjabi. Everything works fine except for the diacritics of the Punjabi language. Just as there are e and é, Punjabi has letters that differ only by a diacritic, such as ਸ and ਸ਼. The problem is that when I search the database, it treats a word with ਸ਼ and the same word with ਸ as identical. The words are stored in the database in utf-8 format. I am using the collation utf8_unicode_ci for the database and for the tables as well.
mysql_query("SET charset utf8");
$exists = mysql_query("SELECT COUNT(word) FROM unicode WHERE word = '$str'");
If the count is 0, it says the word is wrong. $str is the word being checked. When I search, it reports the word as correct with both ਸ and ਸ਼, but only the spelling with ਸ਼ is correct.
I've tried changing the collation to utf8_bin with COLLATE utf8_bin, but then it reports both words, with ਸ and with ਸ਼, as wrong. I've also tried utf8_general_ci, and changing the collation of the table and the database.
It either says both are incorrect or both are correct, but only one of them is correct.
My main problem is diacritic-sensitive search, which doesn't work with utf8_bin either.
Please help. Thanks in advance.
Upvotes: 1
Views: 452
Reputation: 27247
SELECT COUNT(word) FROM unicode WHERE BINARY word = '$str'
The BINARY keyword makes MySQL compare the strings byte by byte instead of using the column's collation, so ਸ and ਸ਼ are no longer treated as equal.
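To illustrate what a byte-level comparison sees, here is a small Python sketch (Python is used only for illustration and is not part of the original PHP code; the helper names are made up for this example). It shows that ਸ and ਸ਼ have different UTF-8 bytes. It also shows one possible, unconfirmed reason an exact match could reject both spellings: ਸ਼ can be written either as the single code point U+0A36 or as ਸ (U+0A38) followed by the nukta sign (U+0A3C), and those two forms differ byte for byte.

```python
import unicodedata

SA = "\u0A38"                   # GURMUKHI LETTER SA
SHA_SINGLE = "\u0A36"           # GURMUKHI LETTER SHA as one code point
SHA_COMBINED = "\u0A38\u0A3C"   # SA followed by GURMUKHI SIGN NUKTA


def utf8_bytes(text):
    """Return the UTF-8 bytes that a binary comparison would look at."""
    return text.encode("utf-8")


def same_bytes(a, b):
    """Byte-by-byte equality, analogous to a binary string comparison."""
    return utf8_bytes(a) == utf8_bytes(b)


def normalize(text):
    """Bring text to one Unicode normalization form (NFC chosen here)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for label, s in [("SA", SA),
                     ("SHA single", SHA_SINGLE),
                     ("SHA combined", SHA_COMBINED)]:
        print(label, utf8_bytes(s).hex())

    # The base letter and the letter with the diacritic differ in bytes.
    print(same_bytes(SA, SHA_SINGLE))
    # Two encodings of the same visible letter also differ in bytes...
    print(same_bytes(SHA_SINGLE, SHA_COMBINED))
    # ...but compare equal once both sides are normalized the same way.
    print(normalize(SHA_SINGLE) == normalize(SHA_COMBINED))
```

If the stored words and the search term turn out to use different encodings of ਸ਼, normalizing both to the same form before inserting and before querying would be one way to make the exact match reliable. Whether that is what happens in this database is an assumption that would need checking, for example by comparing HEX(word) in MySQL with the bytes of $str.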
Upvotes: 2