Reputation: 598
I'm wondering if it's possible to query a table of tags to find the tags that are commonly used together.
I have a mysql database with 2 tables, text and tags. Tags are keywords extracted from the text. The table 'tags' has 3 columns id, tag, and text_id. I have put up a sample on sql fiddle
http://sqlfiddle.com/#!2/20504/1
I like to query this table to find that "joker" and "batman" are commonly used together (although not always).
Appreciate any help!
Upvotes: 0
Views: 94
Reputation: 1271151
You can count the co-occurrence of all pairs by using a self-join:
SELECT t1.tag as tag, t2.tag as tag2, count(*)
FROM tags t1 join
tags t2
on t1.text_id = t2.text_id and t1.tag < t2.tag
GROUP BY t1.tag, t2.tag
ORDER BY count(*) desc;
If you just want the pair that co-occurs the most often, then you can add limit 1
to the query.
Upvotes: 2