Root

Reputation: 2349

MySQL Full Text Search with utf8 (Persian/Arabic)

I have a problem with full-text search that occurs only with UTF-8 Persian/Arabic text: my queries find nothing.

Below is my search code:

<?php
mysql_connect("localhost", "user", "password");
mysql_select_db("search");
mysql_query("SET NAMES 'UTF8'"); 

$q = isset($_GET['q']) ? $_GET['q'] : null; // null when no search was submitted

?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
<input type="text" name="q" value="<?php echo htmlspecialchars($q); ?>">
<input type="submit" value="Search!">
</form>
<hr>
<?php
if (isset($q)) 
{
    $safe = mysql_real_escape_string($q); // escape user input before embedding it in SQL
    $res = mysql_query("SELECT *, MATCH(name, description) AGAINST ('$safe') AS score FROM search_test WHERE MATCH (name, description) AGAINST ('$safe') ORDER BY score DESC");
    $ant = mysql_num_rows($res);
    if ($ant > 0) 
    { // query provided results – display results
        echo ("<br/><h2>Search results for \"$q\":</h2>");
        while ($result = mysql_fetch_array($res)) 
        {
            echo ("<h3>{$result['name']} ({$result['score']})</h3>{$result['description']}<br/><br/>");
        }
    }
    else 
    { // query provided 0 results – display 0 hit message
        echo ("<br/><h2>Nothing Found \"$q\" query</h2>");
    }
}
?>

Where is the problem, or how can I run a full-text search on Unicode text?

Upvotes: 4

Views: 4554

Answers (2)

Hossein

Reputation: 4539

MySQL full-text search works well for Persian. Just check the following where needed:

  1. Use CHARACTER SET = utf8 and COLLATION = utf8_persian_ci on databases, tables, and columns.
  2. Index words of 3 letters or more. This is very important for Arabic: set ft_min_word_len = 3 (see SHOW VARIABLES LIKE "ft_%";). The MyISAM default is 4, so shorter words are not indexed.
  3. Check the MySQL version (5.5 or 5.6) and the storage engine (InnoDB or MyISAM). InnoDB supports FULLTEXT indexes only from 5.6 onward.
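The checks in the list above can be run from the MySQL client roughly as follows. This is only a sketch: it assumes the table is the `search_test` table from the question, that you are allowed to alter it, and that the server has been restarted after `ft_min_word_len` was changed in the configuration file.

```sql
-- Show the current full-text settings; ft_min_word_len applies to MyISAM tables.
SHOW VARIABLES LIKE 'ft_%';

-- InnoDB full-text indexes (MySQL 5.6+) use a separate minimum-length setting.
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Check which storage engine and collation the table currently uses.
SHOW TABLE STATUS LIKE 'search_test';

-- Convert the table and its text columns to utf8 with the Persian collation.
ALTER TABLE search_test CONVERT TO CHARACTER SET utf8 COLLATE utf8_persian_ci;

-- For MyISAM: after setting ft_min_word_len = 3 under [mysqld] and restarting
-- the server, rebuild the FULLTEXT index so that shorter words get indexed.
REPAIR TABLE search_test QUICK;
```

Existing FULLTEXT indexes keep the old minimum word length until they are rebuilt, so the last step matters whenever the setting changes.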

Upvotes: 4

Mihai Iorga

Reputation: 39704

Indexed columns must be <= 1000 bytes in encoding.

You cannot do a FULLTEXT search on Persian letters, as they have a > 1000 byte encoding, as stated here.

For example, your آزمایشی has the following character code map:

Array
(
    [0] => 1570
    [1] => 1586
    [2] => 1605
    [3] => 1575
    [4] => 1740
    [5] => 1588
    [6] => 1740
)
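To see how many bytes the word actually occupies, one option is to compare its character length with its byte length on the server. This is a hedged sketch that assumes the connection character set is utf8 (as set by `SET NAMES` in the question). The values in the array above look like decimal Unicode code points (1570 is U+0622, for instance) rather than byte counts, and in UTF-8 each of these letters is stored in 2 bytes.

```sql
-- Use the same connection character set as the PHP script in the question.
SET NAMES 'utf8';

-- CHAR_LENGTH counts characters, LENGTH counts bytes.
-- With a utf8 connection this should report 7 characters and 14 bytes.
SELECT CHAR_LENGTH('آزمایشی') AS chars,
       LENGTH('آزمایشی')      AS bytes;

-- HEX shows the stored UTF-8 bytes, two bytes per letter for this word.
SELECT HEX('آزمایشی') AS utf8_bytes;
```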

Upvotes: 3
