Reputation: 866
I'm trying to load rows form a posts table based on whether they have multiple rows in another table. Take the below table structures:
posts
post_id post_title
-------------------
1 My Post
2 Another Post
post_tags
post_tag_id post_tag_name
--------------------------
1 My Tag
2 Another Tag
postTags
postTag_id postTag_tag_id postTag_post_id
------------------------------------------
1 1 1
2 2 1
Unsurprisingly, post and post_tags stores the posts and tags, and postTags joins which posts have which tags.
What I'd normally do to join the tables is this:
SELECT * FROM (`posts`)
JOIN `postTags` ON (`postTag_post_id` = `post_id`)
JOIN `post_tags` ON (`post_tag_id` = `postTag_tag_id`)
Then I'd have information on the tags, and can have additional stuff later in the query to search tag names for search terms etc, and then GROUP once I have posts that match the search terms.
What I'm trying to do is only select from posts where a post has both tag 1 AND tag 2, and I can't work out the SQL for it. I think it needs to be done in the actual JOIN rather than having a WHERE clause for it as when I run the join above I'd obviously get two rows back, so I can't have something like
WHERE post_tag_id = 1 AND post_tag_id = 2
as each row will only have one post_tag_id, and I can't check different values for the same column in one row.
What I've tried to do is something like this:
SELECT * FROM (`posts`)
JOIN `postTags` ON (postTag_tag_id = 1 AND postTag_tag_id = 2)
JOIN `post_tags` ON (`post_tag_id` = `postTag_tag_id`)
but this is returning 0 results when I run it; I've put conditions like this in JOINS before for similar things and I'm sure it's close but can't quite work out what to do if this doesn't work.
Am I at least on the right track? Hopefully I'm not missing something obvious.
Thanks.
Upvotes: 0
Views: 141
Reputation: 52107
Assuming you already know tag IDs (1
and 2
), you could do something like this:
SELECT post_id, post_title
FROM posts JOIN postTags ON (postTag_post_id = post_id)
WHERE postTag_tag_id IN (1, 2)
GROUP BY post_id, post_title
HAVING COUNT(DISTINCT postTag_tag_id) = 2
NOTE: DISTINCT is not necessary if there is an alternate key on postTags {postTag_tag_id, postTag_post_id}
, as it should be.
NOTE: If you don't have tag IDs (and just have tag names), you'll need another JOIN (towards the post_tags
table).
BTW, you should seriously consider ditching the surrogate PK in the junction table (postTags.postTag_id
) and just having the natural PK {postTag_tag_id, postTag_post_id}
. InnoDB tables are clustered, and secondary indexes in clustered tables are fatter and slower than in heap-based tables. Also, some queries can benefit from storing posts tagged by the same tag physically close together (or storing tags of the same post close together, if you reverse the PK).
Upvotes: 0
Reputation: 2346
You are trying to ask the postTags row to be at the same time one thing and another.
You either need to do two joins to post_tags and postTags so you get both. Or you can say that the post can have whatever tag between those two and the total amount of tags must equal two (assuming a post cannot related to the same tag more than once).
First approach:
SELECT *
FROM `posts` as p
WHERE p.`post_id` IN (SELECT pt.`postTag_post_id`
FROM `postTags` as pt
WHERE pt.`postTag_tag_id` = 1)
AND p.`post_id` IN (SELECT pt.`postTag_post_id`
FROM `postTags` as pt
WHERE pt.`postTag_tag_id` = 2);
Second approach:
SELECT *
FROM posts as p
WHERE p.post_id IN (SELECT pt.postTag_post_id
FROM (SELECT count(0) as c, pt.postTag_post_id
FROM postTags as pt
WHERE pt.postTag_tag_id IN (1, 2)
GROUP BY pt.postTag_post_id
HAVING c = 2) as pt);
I want also to add that if you use IN or EXISTS in the first approach then you won't have multiple lines for the same post row just because you have more than one tag. This way you save one DISTINCT later that would make your query slower. I've used an IN in the second approach just as a rule of thumb I use: if you don't need to show the data you don't need to do a JOIN in the FROM section.
Upvotes: 2
Reputation: 571
Without actually building a db the same as yours this is hard to verify but it should work.
Let me start by saying that this type of query is much easier and much more performant in a database that supports analytic queries (Oracle, MS SQL Server). So in MySQL you have to do it the old, crappy, aggregate way.
I also want to say that having a table that stores the names of the tags in post_tags and then the mapping of post tags to posts in postTags is confusing. If it were me, I would change the name of the mapping table to post_tags_map or post_tags_to_post_map. So you would have posts with post_id, post_tags with post_tags_id, and post_tags_map with post_tags_map_id. And those id columns would be named the same in every table. Having the same column that is named differently in other tables is also confusing.
Anyways, let's solve your problem. First you want a result set that is 1 post id per row, and only the posts that have tags 1 & 2.
select postTag_post_id, count(1) cnt from (
select postTag_post_id from postTags where postTag_tag_id in (1, 2)
) group by postTag_post_id;`
That should give you back data like this:
postTag_post_id | cnt
1 | 2
Then you can join that result set back to your posts table.
select * from posts p,
(
select postTag_post_id, count(1) cnt from (
select postTag_post_id from postTags where postTag_tag_id in (1, 2)
) group by postTag_post_id;
) t
where p.post_id = t.postTag_post_id
and t.cnt >= 2;
If you need to do another join to the post_tags table in order to get the postTag_tag_id from the post_tag_name, your inner most query would change like so:
select postTag_post_id
from postTags a,
post_tags b
where a.postTag_tag_id = b.post_tag_id
and b.post_tag_name in ('tag 1', 'tag 2');
That should do the trick.
Upvotes: 0
Reputation: 2972
SELECT p.*, t1.*, t2.* FROM posts p
INNER JOIN postTags pt1 ON pt1.postTag_post_id = p.id AND pt1.postTag_tag_id = 1
INNER JOIN postTags pt2 ON pt2.postTag_post_id = p.id AND pt2.postTag_tag_id = 2
INNER JOIN post_tags t1 ON t1.post_tag_id = pt1.postTag_tag_id
INNER JOIN post_tags t2 ON t2.post_tag_id = pt2.postTag_tag_id
Upvotes: 1