Soundz
Soundz

Reputation: 1300

Entry with most matching relations

I'm trying to create little "recommended" functionality based on the posts with the most matching tags.

I got a layout like this:

Posts

id 
---
1  
2  
3
4

post_tags

post_id  | tag_id 
---------+---------
1        | 1
1        | 2
2        | 2
2        | 3
2        | 4
3        | 1
3        | 2
3        | 4
4        | 5

tags

id
----
1
2
3
4
5

So if I would retrieve recommendations for the post with id 1 the list should go

3 (2/2 matches) 2 (1/2 matches) 4 (0/2 matches)

My Query so far looks like this:

 SELECT DISTINCT
    p.id,
    p.title,
    count(*) as cnt
 FROM
    posts p
 INNER JOIN posts_tags pt ON pt.post_id= p.id
 INNER JOIN tags t ON pt.tag_id = t.id
 WHERE
    t.id IN (
        SELECT
            pt.tag_id
        FROM
            posts_tags pt
        WHERE
            pt.post_id = '30213'
    )
 GROUP BY
    t. NAME
 order by count(*) desc
 LIMIT 0, 4

I know DISTINCT isn't working because of the count but I wanted to see just what he counted, so the result looks like this:

 4  Foo 4881
 4  Foo 2560
 11 Bar 2094
 12 Baz 1998

So what happened? It counted the occurences of the tag in general. So appearantly the first associated tag of "Post 1" is 4881 associated and then pulls the first entry that matches... the one with the lowest id.

I see the problem but I can't solve it.

Upvotes: 0

Views: 26

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270081

Your group by makes no sense. You want to aggregate by the post not the tag:

SELECT p.id, p.title, count(*) as cnt
FROM posts p INNER JOIN
     posts_tags pt
     ON pt.post_id = p.id
WHERE pt.tag_id IN (SELECT pt2.tag_id
                    FROM posts_tags pt2
                    WHERE pt2.post_id = 30213
                   )
GROUP BY p.id, p.title
ORDER BY count(*) desc
LIMIT 0, 4;

This will not return 0. If that is important, you need to use a LEFT JOIN instead of WHERE . . . IN . . ..

Also:

  • SELECT DISTINCT is almost never used with GROUP BY. It is hard (but not impossible) to come up with a use-case for it.
  • You don't need the tags table, so I removed it.
  • Don't use single quotes around numbers. I am guessing that post_id is really a number.
  • The fix is in the GROUP BY.

Upvotes: 1

Related Questions