Reputation: 8700
I'm looking to write the simplest, most efficient SQL query to retrieve a single team
from a given event
. But the event
might not be directly related to the team
, it might be related at any level in the table relationship hierarchy.
Here's a simplistic representation of what my schema looks like:
A couple things to note:
teams
can have many collections
, apps
, and webhooks
.collections
can also have many webhooks
.webhooks
can belong to either a team
or collection
, but only one.events
can belong to any object, but only one.This seems like a fairly basic setup that most SaaS-type companies would have (eg. Slack or Stripe). Everything is grouped under a "team" which users can then interact with as members.
Given that setup, I'd like to create a SQL query that solves...
Find the team of an event by
id
.
I can easily write queries that find the parent team either directly, or indirectly via a specific means. For example...
Find the team of a directly related "team event".
SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}
Or...
Find the team of an indirectly related "collection event".
SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}
Webhooks get a more complex, because they can be related in two different ways...
Find the team of an indirectly related "webhook event".
SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
WHERE team_webhook_events.id = ${id}
OR collection_webhook_events.id = ${id}
I'm not sure if I should be writing it like that, or by using UNION
instead...
SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}
UNION
SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}
...unsure which is better for performance.
As you can see, there are a lot of different ways for a single event to be related to a given team, via all those paths! So when I try to write the "all encompassing" query, it ends up being super complex...
Doing it the UNION
way, I think I can just make a bunch of queries, and union them all together, although this seems like it might be wasteful?
SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}
UNION
SELECT teams.*
FROM teams
JOIN apps ON apps.team_id = teams.id
JOIN events ON events.app_id = apps.id
WHERE events.id = ${id}
UNION
SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}
UNION
SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}
UNION
SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}
Using multiple JOIN
s I think I might be able to join all of the different combinations of event owners together, with their relationship to teams
? Although this also seems like a lot of JOIN
s...
SELECT teams.*
FROM teams
JOIN apps ON apps.team_id = teams.id
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id
JOIN events AS app_events ON app_events.app_id = apps.id
JOIN events AS collection_events ON collection_events.collection_id = collections.id
JOIN events AS team_events ON team_events.team_id = teams.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id
WHERE app_events.id = ${id}
OR collection_events.id = ${id}
OR team_events.id = ${id}
OR collection_webhook_events.id = ${id}
OR team_webhook_events.id = ${id}
Edit: I don't think this question is a duplicate!
Yesterday, I asked "How to query for nested relationships in SQL?", which involves a similar data structure (because it's the same one), and was written in a similar way. But, as far as I can tell, that question is actually the inverse of this one.
In the first, I was asking how to...
Find all of the events that are related (directly or indirectly) to a given user by ID. Or put more generically, given a "parent"'s ID, find all of its possible descendants in the tree-like data model.
But in this one, I'm asking how to...
Find the team of an arbitrary event by ID. Or put more generically, given an arbitrary "child"'s ID, which can be attached to the tree-like data model at any level of the hierarchy, find its matching parent. (It's essentially the inverse of the first question.)
And as far as I can tell, they involve different solutions. The first seems to involve performing a series of queries as you traverse through the tree, collecting all of the events. The second (this one), I'm less sure about, but seems to require building up all of the joined models first, and then checking to see if which of them match the specific event.
Please help me to mark it as not a duplicate!
Double edit: Okay, no longer marked as duplicate! Thanks :D
Upvotes: 6
Views: 586
Reputation: 753
Find the team of an arbitrary event by ID:
Take left joins from events table and see which object returns a team_id use that.
select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id
from events e
left join apps a on e.app_id=a.id
left join collections c on c.id=e.collection_id
left join webhooks w on w.id=e.webhook_id
left join teams t on t.id=e.team_id
where e.id=${id};
Just for completeness if you want all data for resultant team:
select t.* from teams t
join
(select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id
from events e left join apps a on e.app_id=a.id
left join collections c on c.id=e.collection_id
left join webhooks w on w.id=e.webhook_id
left join teams t on t.id=e.team_id
where e.id=1) t1
on t.id=eteam_id;
Upvotes: 3
Reputation: 4513
Just OR
the conditions together.
Since you are only interested in teams
, you dont need to join the other tables (that could generate duplicates), only for the existence of records with certain (matching) values in the other tables.
SELECT *
FROM teams t
WHERE EXISTS (
SELECT 1
FROM events e
WHERE e.SOME_FIELD = SOME_EVENT
AND
( team_id = t.id -- direct reference
OR EXISTS ( -- reference via webhooks
SELECT 1 FROM webhooks w
WHERE w.id = e.webhook_id
AND w.team_id = t.id
)
OR EXISTS ( -- reference via apps
SELECT 1 FROM apps a
WHERE a.id = e.app_id
AND a.team_id = t.id
)
OR EXISTS ( -- reference via collections
SELECT 1 FROM collections c
WHERE c.id = e.collection_id
AND c.team_id = t.id
)
OR EXISTS ( -- reference via webhooks*collections
SELECT 1 FROM webhooks w
JOIN collections c ON w.collection_id = c.id
WHERE w.id = e.webhook_id
AND c.team_id = t.id
WHERE c.team_id = t.id
)
)
) ;
Upvotes: 2