Rufinus
Rufinus

Reputation: 30751

MySQL - Find rows matching all rows from joined table

Table 1: Tracks

Table 2: Wordlist

Table 3: N:M Track has Words (trackwords)

Find all tracks which have all the words.

currently the query looks like:

SELECT DISTINCT t.id FROM track as t
Left Join trackwords as tw ON t.id=tw.trackid
Left Join wordlist as wl on wl.id=tw.wordid
WHERE 
wl.trackusecount>0 
group by t.id
HAVING SUM(IF(wl.word IN ('folsom','prison','blues'),1,0)) = 3;

Which according to EXPLAIN is using all indexes neccessary:

+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
| id | select_type | table | type   | possible_keys         | key     | key_len | ref            | rows    | Extra       |
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+
|  1 | SIMPLE      | t     | index  | PRIMARY               | PRIMARY | 4       | NULL           | 8194507 | Using index | 
|  1 | SIMPLE      | tw    | ref    | wordid,trackid        | trackid | 4       | mbdb.t.id      |       3 | Using where | 
|  1 | SIMPLE      | wl    | eq_ref | PRIMARY,trackusecount | PRIMARY | 4       | mbdb.tw.wordid |       1 | Using where | 
+----+-------------+-------+--------+-----------------------+---------+---------+----------------+---------+-------------+

But the query takes ages. Any suggestion to speedup the query?

Upvotes: 0

Views: 1609

Answers (3)

JamieGaines
JamieGaines

Reputation: 2082

Your problem set is very much like that of storing tags for an item like StackOverflow or Del.icio.us does.

The article Tags: Database schemas proposes several solutions, among them @ChssPly76's idea.

Upvotes: 3

Robert Swisher
Robert Swisher

Reputation: 1300

Would probably be faster if you broke this up into two queries. First, a join of the words and trackwords to net you all the trackids you need. Then go back to the track table and do:

WHERE t.id IN(...trackids here...)

but based on the query above all you're returning is t.id which you have from tw.trackid already.

Upvotes: 0

ChssPly76
ChssPly76

Reputation: 100736

There's no point in left joins if you're only looking for tracks that have all the words. I'm assuming that (trackid,wordid) combination is unique in trackwords.

SELECT t.id
  FROM track as t,  trackwords as tw, wordlist as wl
 WHERE t.id=tw.trackid
   AND wl.id=tw.wordid
   AND wl.trackusecount>0 /* not sure what that is - you have it in your query */
   AND wl.word in ('folsom','prison','blues')
 GROUP by t.id
HAVING count(*) = 3

This query would benefit from indexes on wordlist(word), trackwords(trackid,wordid) and track(id).

Upvotes: 5

Related Questions