Reputation: 2249
Using Postgres I have 3 tables:
CREATE TABLE post (id SERIAL, body TEXT);
CREATE TABLE tag (id SERIAL, name TEXT);
CREATE TABLE post_tag (post_id INT, tag_id INT);
INSERT INTO post(body) values('post 1');
INSERT INTO post(body) values('post 2');
INSERT INTO tag(name) values('a');
INSERT INTO tag(name) values('b');
INSERT INTO post_tag values(1, 1);
INSERT INTO post_tag values(1, 2);
INSERT INTO post_tag values(2, 1);
Thus post 1
has tags a, b
and post 2
has a
as a tag.
The question: How to select all posts that don't have the tag b
, meaning it should select only the post 2
.
This query here is not good because it will select both posts given that post 1
has 2 tags a
& b
:
SELECT post.*
FROM post
JOIN post_tag ON post_tag.post_id = post.id
JOIN tag ON tag.id = post_tag.tag_id
WHERE tag.name != 'b';
This query below works, but is wrong because if there is a tag aaaaaaab
then it will match it too:
SELECT post.id, post.body, string_agg(tag1.name, ', ')
FROM post
JOIN post_tag ON post_tag.post_id = post.id
JOIN tag ON tag.id = post_tag.tag_id
GROUP BY post.id, post.body
HAVING string_agg(tag.name, ', ') not like '%b, %';
I'm looking for a 'correct' and efficient approach to this.
EDIT: The query should also match posts that don't have any tags at all.
Upvotes: 1
Views: 155
Reputation: 121524
You can select posts with aggregated tags using the query:
select p.id, p.body, array_agg(t.name) tags
from post p
left join post_tag pt on pt.post_id = p.id
left join tag t on pt.tag_id = t.id
group by 1, 2;
id | body | tags
----+--------+-------
1 | post 1 | {a,b}
2 | post 2 | {a}
(2 rows)
With adequate modifications you can use the query to filter your data, e.g.
select p.id, p.body
from post p
left join post_tag pt on pt.post_id = p.id
left join tag t on pt.tag_id = t.id
group by 1, 2
having 'b' <> all(array_agg(t.name));
-- or to get also posts without tags:
-- having 'b' <> all(array_agg(t.name)) or array_agg(t.name) = '{null}';
id | body
----+--------
2 | post 2
(1 row)
Upvotes: 2
Reputation: 2249
One solution, using subqueries is this:
SELECT *
FROM post
WHERE post.id IN (
SELECT post_id
FROM post_tag
WHERE post_id != ALL (
SELECT post_id
FROM post_tag
WHERE tag_id = (
SELECT id
FROM tag
WHERE name = 'b'
)
)
);
I'm not sure however how efficient this query is though, it's most definitely not the most readable.
Upvotes: 0