Reputation: 11258
I have this query, which returns a number of rows:
--
-- documents -> document_tags -> tag_groups -> user_groups
--
SELECT documents.id, user_groups.user_id
FROM documents
JOIN document_tags ON documents.id = document_tags.document_id
JOIN tag_groups ON document_tags.tag_id = tag_groups.tag_id
JOIN user_groups ON tag_groups.group_id = user_groups.group_id
WHERE
documents.id = 314
The fact that a number of rows are returned, tells me that the document
with ID 314 is associated with one or more document_tags
, which are in turn related to one or more tag_groups
, which are in turn associated with one or more user_groups
. The query returns these associated users (in the 2nd column of the SELECT output).
Now I want to restructure the query - to show me the orphaned documents... that are NOT associated with ANY users whatsoever ... so I use NOT EXISTS
like this:
SELECT documents.id
FROM documents
JOIN document_tags ON documents.id = document_tags.document_id
JOIN tag_groups ON document_tags.tag_id = tag_groups.tag_id
WHERE
NOT EXISTS (
SELECT 1 FROM user_groups
WHERE user_groups.group_id = tag_groups.group_id
)
If you compare the two queries, you'll see that the first two joins are identical - and the third join has moved into the NOT EXISTS.
I would not expect this second query to return document 314 - because the first query returns users that ARE associated with it. And yet, this second query returns document id 314... as an orphaned document.
Why?
I am probably missing something obvious - but I can't see what. Help?
Upvotes: 2
Views: 61
Reputation: 9994
That document 314 is returned from the second query means that document 314 has some tags that are in some tag groups that don't have associated user groups.
Document 314 might still have other tags, or the tags it has might be in still other tag groups, that have associated user groups, thus why the first query also returns a non-empty result.
Upvotes: 0
Reputation: 85887
Document 314 is associated (via document_tags) with many tag_groups. Some of those tag_groups have associated user_groups (making the first query succeed), others don't have associated user_groups (making the second query succeed).
That is, your second query doesn't ask "is this document NOT associated with ANY users whatsoever?", it asks "is this document associated with at least one tag_group that doesn't have a corresponding user_group?".
Upvotes: 2