MySQL search query ordered by match relevance

Question

I know basic MySQL querying, but I have no idea how to achieve an accurate and relevant search query.

My table look like this:

   id | kanji
   -------------
   1  | 一子
   2  | 一人子
   3  | 一私人
   4  | 一時
   5  | 一時逃れ

I already have this query:

SELECT * FROM  `definition` WHERE `kanji` LIKE '%一%'

The problem is that I want to order the results from the learnt characters, 一 being a required character for the results of this query.

Say, a user knows those characters: 人,子,時

Then, I want the results to be ordered that way:

   id | kanji
   -------------
   2  | 一人子
   1  | 一子
   4  | 一時
   3  | 一私人
   5  | 一時逃れ

The result which matches the most learnt characters should be first. If possible, I'd like to show results that contain only learnt characters first, then a mix of learnt and unknown characters.

How do I do that?

Paul Draper · Accepted Answer

Per your preference, ordering by number of unmatched characters (increasing), and then number of matched character (decreasing).

SELECT *,
    (kanji LIKE '%人%')
  + (kanji LIKE '%子%')
  + (kanji LIKE '%時%') score
FROM kanji 
ORDER BY CHAR_LENGTH(kanji) - score, score DESC

Or, the relational way to do it is to normalize. Create the table like this:

kanji_characters

kanji_id | index | character
----------------------------
      1  |     0 | 一
      1  |     1 | 子
      2  |     0 | 一
      2  |     1 | 人
      2  |     2 | 子
...

Then

SELECT kanji_id,
  COUNT(*) length,
  SUM(CASE WHEN character IN ('人','子','時') THEN 1 END) score
FROM kanji_characters
WHERE index <> 0
  AND kanji_id IN (SELECT kanji_id FROM kanji_characters WHERE index = 0 AND character = '一')
GROUP BY kanji_id
ORDER BY length - score, score DESC

Though you didn't specify what should be done in the case of duplicate characters. The two solutions above handle that differently.

MySQL search query ordered by match relevance

Answers (2)

Related Questions