Reputation: 30751
Table 1: Tracks
Table 2: Wordlist
Table 3: N:M Track has Words (trackwords)
Find all tracks which have all the words.
currently the query looks like:
SELECT DISTINCT t.id FROM track as t
Left Join trackwords as tw ON t.id=tw.trackid
Left Join wordlist as wl on wl.id=tw.wordid
WHERE
wl.trackusecount>0
group by t.id
HAVING SUM(IF(wl.word IN ('folsom','prison','blues'),1,0)) = 3;
Which according to EXPLAIN is using all indexes neccessary:
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
| 1 | SIMPLE | t | index | PRIMARY | PRIMARY | 4 | NULL | 8194507 | Using index |
| 1 | SIMPLE | tw | ref | wordid,trackid | trackid | 4 | mbdb.t.id | 3 | Using where |
| 1 | SIMPLE | wl | eq_ref | PRIMARY,trackusecount | PRIMARY | 4 | mbdb.tw.wordid | 1 | Using where |
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
But the query takes ages. Any suggestion to speedup the query?
Upvotes: 0
Views: 1609
Reputation: 2082
Your problem set is very much like that of storing tags for an item like StackOverflow or Del.icio.us does.
The article Tags: Database schemas proposes several solutions, among them @ChssPly76's idea.
Upvotes: 3
Reputation: 1300
Would probably be faster if you broke this up into two queries. First, a join of the words and trackwords to net you all the trackids you need. Then go back to the track table and do:
WHERE t.id IN(...trackids here...)
but based on the query above all you're returning is t.id which you have from tw.trackid already.
Upvotes: 0
Reputation: 100736
There's no point in left joins if you're only looking for tracks that have all the words. I'm assuming that (trackid
,wordid
) combination is unique in trackwords
.
SELECT t.id
FROM track as t, trackwords as tw, wordlist as wl
WHERE t.id=tw.trackid
AND wl.id=tw.wordid
AND wl.trackusecount>0 /* not sure what that is - you have it in your query */
AND wl.word in ('folsom','prison','blues')
GROUP by t.id
HAVING count(*) = 3
This query would benefit from indexes on wordlist(word), trackwords(trackid,wordid) and track(id).
Upvotes: 5