Reputation: 4686
Hi I have one big problem with one MYSQL search. My database TABLE looks like this :
+------+----------+------+
| id | keywords | file |
+------+----------+------+
At keywords there are many keywords for each entry seperated with comas. (keyword1,keyword2...).
At PHP array there are listed some keywords (5-10). And my search must get all DB entries which got atleast 3 from those keywords. Its not required to got all of those words! But it can't work and with just one.
Can somebody help me with that query? I don't got good idea how to make it.
Upvotes: 0
Views: 259
Reputation: 880309
If you had this setup:
table files:
+------+-------+
| id | file |
+------+-------+
| 1000 | foo |
| 1001 | bar |
+------+-------+
table keywords:
+----+-------+
| id | word |
+----+-------+
| 9 | lorem |
| 10 | ipsum |
+----+-------+
table filekeywords:
+----+--------+--------+
| id | fileid | wordid |
+----+--------+--------+
| 1 | 1000 | 9 |
| 2 | 1000 | 10 |
| 3 | 1001 | 10 |
+----+--------+--------+
You could find files with keywords lorem
, ipsum
, dolor
like this:
SELECT COUNT(DISTINCT(k.word)), f.*
FROM files f
INNER JOIN filekeywords fk
ON fk.fileid = f.id
INNER JOIN keywords k
ON k.id = fk.wordid
WHERE k.word in ('lorem', 'ipsum', 'dolor')
GROUP BY f.id
HAVING COUNT(DISTINCT(k.word)) >= 3
Upvotes: 0
Reputation: 7991
That's a challenge. The brute force method would be to use a UNION in a subquery with a count.
For example,
select id, file, count(*) from
(select distinct id, file
from file_table
where FIND_IN_SET(keyword1, keywords)
UNION ALL
select distinct id, file
from file_table
where FIND_IN_SET(keyword2, keywords)
UNION ALL
select distinct id, file
from file_table
where FIND_IN_SET(keyword3, keywords)
UNION ALL
select distinct id, file
from file_table
where FIND_IN_SET(keyword4, keywords)
.... MORE UNION ALL ....) as files
group by id, file
having count(*) >= 3
More efficiently, you could have a separate table with keywords and ID, one keyword/ID combo per row. This would eliminate the wildcard search and make the query more efficient.
The next step would be to go to something like ElasticSearch and filter on the score of the result.
Upvotes: 2