Pim
Pim

Reputation: 598

Finding tags commonly used together

I'm wondering if it's possible to query a table of tags to find the tags that are commonly used together.

I have a mysql database with 2 tables, text and tags. Tags are keywords extracted from the text. The table 'tags' has 3 columns id, tag, and text_id. I have put up a sample on sql fiddle

http://sqlfiddle.com/#!2/20504/1

I like to query this table to find that "joker" and "batman" are commonly used together (although not always).

Appreciate any help!

Upvotes: 0

Views: 94

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271151

You can count the co-occurrence of all pairs by using a self-join:

SELECT t1.tag as tag, t2.tag as tag2, count(*)
FROM tags t1 join
     tags t2
     on t1.text_id = t2.text_id and t1.tag < t2.tag
GROUP BY t1.tag, t2.tag
ORDER BY count(*) desc;

If you just want the pair that co-occurs the most often, then you can add limit 1 to the query.

Upvotes: 2

Related Questions