Is it possible to reduce the number of executed queries in any way? The way I do it now works, but later I could end up with 30 queries, and that does not look right to me.
My script:
$string = 'new movie stars';
$words = preg_split('/(\/|\s+)/', $string);
print_r($words);
// Output: Array ( [0] => new [1] => movie [2] => stars )
$sql = "SELECT * FROM movie WHERE MATCH(name) AGAINST('+$words[0] +$words[1] +$words[2]' IN BOOLEAN MODE)";
$query_name = $this->db->query($sql);
if ($query_name->num_rows < 20) {
    $sql = "SELECT * FROM movie WHERE MATCH(name) AGAINST('+$words[0] +($words[1] $words[2])' IN BOOLEAN MODE)";
    $query_name_two = $this->db->query($sql);
}
// Rows found so far; the second query may not have run.
$found = $query_name->num_rows + (isset($query_name_two) ? $query_name_two->num_rows : 0);
if ($found < 20) {
    $sql = "SELECT * FROM movie WHERE MATCH(name) AGAINST('$words[0] $words[1] $words[2]' IN BOOLEAN MODE)";
    $query_name_three = $this->db->query($sql);
}
Upvotes: 0
Views: 578
Reputation: 28834
Your code is open to SQL injection related attacks. Even real_escape_string
cannot secure it completely. Please learn to use Prepared Statements instead.
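As a minimal sketch of what that could look like for the first query in the question, the whole boolean search expression can be passed as one bound parameter instead of being interpolated into the SQL string. This assumes a PDO connection; the helper names `buildRequiredTermsExpression` and `findMoviesWithAllWords` are hypothetical, and the `$this->db` wrapper used in the question may expose a different API.

```php
<?php
/**
 * Build a boolean-mode expression that requires every word,
 * e.g. "+new +movie +stars". Hypothetical helper; it assumes the
 * words have already been stripped of FTS operator characters.
 */
function buildRequiredTermsExpression(array $words): string
{
    $terms = [];
    foreach ($words as $word) {
        $terms[] = '+' . $word;
    }
    return implode(' ', $terms);
}

/**
 * Run the search with the expression bound as a parameter.
 * Assumes $pdo is an open PDO connection to the MySQL database.
 */
function findMoviesWithAllWords(PDO $pdo, array $words): array
{
    $sql = 'SELECT * FROM movie WHERE MATCH(name) AGAINST(? IN BOOLEAN MODE)';
    $stmt = $pdo->prepare($sql);
    $stmt->execute([buildRequiredTermsExpression($words)]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Binding keeps user input out of the SQL text, but the bound string is still interpreted by the full-text parser, so operator characters in the input should be removed before the expression is built.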
Now, besides the above suggestion, there are two further fixes possible:
Fix #1: The PHP code that you are using to tokenize the input string into words for FTS is insufficient. Some time back, I wrote a function to handle this requirement in a more robust manner. You may use the following instead:
/**
* Method to take an input string and tokenize it into an array of words for Full Text Searching (FTS).
* This method is used when an input string can be made up of multiple words (let's say, separated by space characters),
* and we need to use different Boolean operators on each of the words. The tokenizing process is similar to extraction
* of words by FTS parser in MySQL. The operators used for matching in Boolean condition are removed from the input $phrase.
* These characters as of latest version of MySQL (8+) are: +-><()~*:""&|
* We can also execute the following query to get updated list: show variables like 'ft_boolean_syntax';
* Afterwards, the modified string is split into individual words on whitespace, comma, and period (.) characters.
* Details at: https://dev.mysql.com/doc/refman/8.0/en/fulltext-natural-language.html
* @param string $phrase Input statement/phrase consisting of words
* @return array Tokenized words
* @author Madhur, 2019
*/
function tokenizeStringIntoFTSWords(string $phrase) : array {
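    // Strip the Boolean-mode operator characters, then trim surrounding whitespace.
    $phrase_mod = trim(preg_replace('/[><()~*:"&|+-]/', '', trim($phrase)));
    // -1 means "no limit"; passing null for $limit is deprecated as of PHP 8.1.
    return preg_split('/[\s,.]/', $phrase_mod, -1, PREG_SPLIT_NO_EMPTY);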
}
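To illustrate what the function does with input that contains Boolean operators, here is a small usage sketch. The function body is repeated so that the snippet runs on its own (with the limit written as -1, the default of `preg_split`), and the sample phrase is made up.

```php
<?php
function tokenizeStringIntoFTSWords(string $phrase) : array {
    // Remove FTS Boolean operator characters, then split on whitespace, comma, or period.
    $phrase_mod = trim(preg_replace('/[><()~*:"&|+-]/', '', trim($phrase)));
    return preg_split('/[\s,.]/', $phrase_mod, -1, PREG_SPLIT_NO_EMPTY);
}

// Operators, quotes, and parentheses are dropped before splitting.
$words = tokenizeStringIntoFTSWords('new +movie, "stars" (2019)');
print_r($words);
// $words is ['new', 'movie', 'stars', '2019']
```

Note that the hyphen is removed as an operator as well, so a hyphenated input word is collapsed into a single token rather than split in two.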
Fix #2: It seems that you are trying to rank the search results, giving priority in the following order:
All words in the text >
First word AND any of the remaining two words >
At least one of the three words.
However, if you read the Full Text Search documentation, you will see that you can sort by relevance using MATCH(), as it also returns the relevance score.
When MATCH() is used in a WHERE clause, the rows returned are automatically sorted with the highest relevance first (unfortunately, this works only in NATURAL mode, not BOOLEAN mode). Relevance values are nonnegative floating-point numbers. Zero relevance means no similarity. Relevance is computed based on the number of words in the row (document), the number of unique words in the row, the total number of words in the collection, and the number of rows that contain a particular word.
So, basically, rows containing all the words already have a higher relevance than rows containing at least one of the three words. If you also need to give higher priority to the first word, simply use the > operator on it. All you need is the following single query:
SELECT * FROM movie
WHERE
  MATCH(name)
  AGAINST('>:first_word :second_word :third_word ..and so on' IN BOOLEAN MODE)
ORDER BY
  MATCH(name)
  AGAINST('>:first_word :second_word :third_word ..and so on' IN BOOLEAN MODE)
  DESC
LIMIT 20
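One way to assemble that query from PHP, sketched under some assumptions: the words have already been tokenized, a PDO connection is available, and the helper names `buildBoostedExpression` and `searchMovies` are hypothetical. The search expression is bound twice through positional placeholders instead of being written into the SQL string.

```php
<?php
/**
 * Prefix the first word with ">" so that it contributes more to the
 * relevance score; the remaining words are left as optional terms.
 */
function buildBoostedExpression(array $words): string
{
    $words = array_values($words);
    if ($words === []) {
        return '';
    }
    $words[0] = '>' . $words[0];
    return implode(' ', $words);
}

/**
 * Fetch the best matches in one query, ordered by relevance.
 * Assumes $pdo is an open PDO connection to the MySQL database.
 */
function searchMovies(PDO $pdo, array $words, int $limit = 20): array
{
    $expression = buildBoostedExpression($words);
    if ($expression === '') {
        return [];
    }
    // $limit is an int, so appending it to the SQL text is safe.
    $sql = 'SELECT * FROM movie
            WHERE MATCH(name) AGAINST(? IN BOOLEAN MODE)
            ORDER BY MATCH(name) AGAINST(? IN BOOLEAN MODE) DESC
            LIMIT ' . max(1, $limit);
    $stmt = $pdo->prepare($sql);
    $stmt->execute([$expression, $expression]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

For the words from the question, `buildBoostedExpression(['new', 'movie', 'stars'])` gives `>new movie stars`, which replaces the three separate queries with a single ranked one.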
Upvotes: 0