Alex Plugaru
Alex Plugaru

Reputation: 2249

Exclude tag from post in many to many queries

Using Postgres I have 3 tables:

CREATE TABLE post (id SERIAL, body TEXT);
CREATE TABLE tag (id SERIAL, name TEXT);
CREATE TABLE post_tag (post_id INT, tag_id INT);

INSERT INTO post(body) values('post 1');
INSERT INTO post(body) values('post 2');
INSERT INTO tag(name) values('a');
INSERT INTO tag(name) values('b');
INSERT INTO post_tag values(1, 1);
INSERT INTO post_tag values(1, 2);
INSERT INTO post_tag values(2, 1);

Thus post 1 has tags a, b and post 2 has a as a tag.

The question: How to select all posts that don't have the tag b, meaning it should select only the post 2.

This query here is not good because it will select both posts given that post 1 has 2 tags a & b:

SELECT post.*
FROM post
JOIN post_tag ON post_tag.post_id = post.id
JOIN tag ON tag.id = post_tag.tag_id
WHERE tag.name != 'b';

This query below works, but is wrong because if there is a tag aaaaaaab then it will match it too:

SELECT post.id, post.body, string_agg(tag1.name, ', ')
FROM post
JOIN post_tag ON post_tag.post_id = post.id
JOIN tag ON tag.id = post_tag.tag_id
GROUP BY post.id, post.body
HAVING string_agg(tag.name, ', ') not like '%b, %';

I'm looking for a 'correct' and efficient approach to this.

EDIT: The query should also match posts that don't have any tags at all.

Upvotes: 1

Views: 155

Answers (2)

klin
klin

Reputation: 121524

You can select posts with aggregated tags using the query:

select p.id, p.body, array_agg(t.name) tags
from post p
left join post_tag pt on pt.post_id = p.id
left join tag t on pt.tag_id = t.id
group by 1, 2;

 id |  body  | tags  
----+--------+-------
  1 | post 1 | {a,b}
  2 | post 2 | {a}
(2 rows)    

With adequate modifications you can use the query to filter your data, e.g.

select p.id, p.body
from post p
left join post_tag pt on pt.post_id = p.id
left join tag t on pt.tag_id = t.id
group by 1, 2
having 'b' <> all(array_agg(t.name));
-- or to get also posts without tags:
-- having 'b' <> all(array_agg(t.name)) or array_agg(t.name) = '{null}';

 id |  body  
----+--------
  2 | post 2
(1 row) 

Upvotes: 2

Alex Plugaru
Alex Plugaru

Reputation: 2249

One solution, using subqueries is this:

SELECT *
FROM post
WHERE post.id IN (
  SELECT post_id
  FROM post_tag
  WHERE post_id != ALL (
    SELECT post_id
    FROM post_tag
    WHERE tag_id = (
      SELECT id
      FROM tag
      WHERE name = 'b'
      )
    )
  );

I'm not sure however how efficient this query is though, it's most definitely not the most readable.

Upvotes: 0

Related Questions