peiman F.

Reputation: 1658

Search for words with different writing or spelling in a database

I'm going to write a mail server script with PHP and a MySQL database. A tool for searching in emails is on my TODO list, but there is a problem.

In some languages there are words that look the same but are encoded differently.

For example, كتابي and کتابی, or کبک and كبك. Users type these words interchangeably, depending on their keyboard layout.

كتابي and كبك are typed with the Arabic layout, but کتابی and کبک are typed with the Persian layout.

I tried to replace one language's letters with the other's using the str_replace function, but this is not very useful because I don't know these kinds of words for every language in the world.

Is there any standard for these kinds of words?

Upvotes: 2

Views: 166

Answers (1)

O. Jones

Reputation: 108839

I am ignorant of Arabic and Farsi so I don't understand the difference between the end-of-word letters ي and ی. The first one, which is from your Arabic example, has a diacritical mark below it, and the second one doesn't.

It's clear, however, that these characters are Unicode-encoded. It's not the keyboard itself that you're dealing with, it's the Unicode characters the keyboard layout produces. The Arabic and Farsi layouts encode these letters as different code points.

The first one is U+064A (ARABIC LETTER YEH): http://www.fileformat.info/info/unicode/char/064a/index.htm

The second one is U+06CC (ARABIC LETTER FARSI YEH): http://www.fileformat.info/info/unicode/char/06cc/index.htm
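To see which code points a given word actually contains, something like the following sketch may help. It is written in Python only for illustration; the helper name describe is made up here, and the two sample strings assume the question's words are spelled with the code points shown in the comments.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# Assumed spellings of the question's example word in each layout.
arabic_word = "\u0643\u062A\u0627\u0628\u064A"   # typed on an Arabic layout
persian_word = "\u06A9\u062A\u0627\u0628\u06CC"  # typed on a Persian layout

for word in (arabic_word, persian_word):
    for ch, code, name in describe(word):
        print(code, name)
    print()
```

Comparing the two listings shows that only the first and last letters differ between the two spellings; the middle three code points are identical.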

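Running this query on your column (table and word stand for your own table and column names; TABLE is a reserved word in MySQL, so it is backtick-quoted here)

SELECT CONVERT(`table`.word USING cp1256)
  FROM `table`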

will put in replacement characters (?) for the Farsi letters (the letters absent from the Arabic code page cp1256), e.g. turning کتابی into ?تاب?. That may help you detect which letters you need to work with.

You are going to need to develop a transliteration scheme, however. It may be a certain amount of work.
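As a rough illustration of what such a scheme could look like, here is a minimal sketch in Python. It folds the two letter pairs from the question onto a single form, so that both spellings compare equal. The table FOLD and the function fold are hypothetical names, the table covers only these two pairs, and the choice to fold toward the Persian code points is arbitrary; a real table would have to be built up for each script you need to support.

```python
# Hypothetical folding table: map the Arabic-layout code points onto the
# Persian-layout ones. Only the two pairs from the question are covered.
FOLD = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})

def fold(text):
    """Return text with the known look-alike letters folded to one form."""
    return text.translate(FOLD)

arabic_spelling = "\u0643\u0628\u0643"   # typed on an Arabic layout
persian_spelling = "\u06A9\u0628\u06A9"  # typed on a Persian layout

# After folding, both spellings are the same string.
print(fold(arabic_spelling) == fold(persian_spelling))
```

One way to use this would be to apply the same folding both to the text when it is stored (for example in a separate search column) and to the user's search term, so the comparison always happens on folded text.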

Upvotes: 2
