Terminal
Terminal

Reputation: 1999

MySQL query for top documents based on associated tags

There is a list of documents. Multiple users can tag document. Now for given tags (more tha, list of documents is required where more than 30% of users have chosen that tag.

mapping:
---------------------------  
user_id | document_id | tag
    1   |     34      | 26
    2   |     34      | 26
    3   |     36      | 25 
    4   |     34      | 27  

A helper table is also there with total tag_count for given document.

counters:
---------------------------  
document_id | tag_count
    34      | 12
    36      | 26

I am able to write query for single tag like

select * from mapping m
join (select document_id,count(*) as req_tag_count
from mapping
group by document_id) as s on s.document_id = m.document_id
join counters c on c.document_id = m.document_id and req_tag_count / c.tag_count > .3
where m.tag = 26

But not able to write query for multiple tags like return documents with both tags A and B fulfilling the above condition of 30%.

Upvotes: 0

Views: 104

Answers (1)

Tomaso Albinoni
Tomaso Albinoni

Reputation: 1013

Maybe this is what you need:

SELECT t.document_id
FROM (SELECT m.document_id
    FROM mapping m
    WHERE m.tag = 26         # Specify the first tag
    GROUP BY m.document_id
    HAVING COUNT(m.document_id) /
    (SELECT count(document_id)
        FROM mapping i
        WHERE i.document_id = m.document_id
        GROUP BY i.document_id) > 0.3
    UNION SELECT n.document_id
    FROM mapping n
    WHERE n.tag = 27         # Specify the second tag
    GROUP BY n.document_id
    HAVING COUNT(n.document_id) /
    (SELECT count(document_id)
        FROM mapping i
        WHERE i.document_id = n.document_id
        GROUP BY i.document_id) > 0.3)
AS t
GROUP BY t.document_id
HAVING COUNT(t.document_id) = 2    # One per tag

It worked when I tested it. You can adapt this for 3 tags as well.

Upvotes: 1

Related Questions