Ian Storm Taylor
Ian Storm Taylor

Reputation: 8700

How to query for the "parent" of an arbitrary child relationship in SQL?

I'm looking to write the simplest, most efficient SQL query to retrieve a single team from a given event. But the event might not be directly related to the team, it might be related at any level in the table relationship hierarchy.


Setup

Here's a simplistic representation of what my schema looks like:

enter image description here

A couple things to note:

This seems like a fairly basic setup that most SaaS-type companies would have (eg. Slack or Stripe). Everything is grouped under a "team" which users can then interact with as members.


Problem

Given that setup, I'd like to create a SQL query that solves...

Find the team of an event by id.

I can easily write queries that find the parent team either directly, or indirectly via a specific means. For example...

Find the team of a directly related "team event".

SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}

Or...

Find the team of an indirectly related "collection event".

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}

Webhooks get a more complex, because they can be related in two different ways...

Find the team of an indirectly related "webhook event".

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
WHERE team_webhook_events.id = ${id}
OR collection_webhook_events.id = ${id}

I'm not sure if I should be writing it like that, or by using UNION instead...

SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

...unsure which is better for performance.


As you can see, there are a lot of different ways for a single event to be related to a given team, via all those paths! So when I try to write the "all encompassing" query, it ends up being super complex...

Potential Solution #1

Doing it the UNION way, I think I can just make a bunch of queries, and union them all together, although this seems like it might be wasteful?

SELECT teams.*
FROM teams
JOIN events ON events.team_id = teams.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN apps ON apps.team_id = teams.id
JOIN events ON events.app_id = apps.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN events ON events.collection_id = collections.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN webhooks ON webhooks.team_id = teams.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

UNION

SELECT teams.*
FROM teams
JOIN collections ON collections.team_id = teams.id
JOIN webhooks ON webhooks.collection_id = collections.id
JOIN events ON events.webhook_id = webhooks.id
WHERE events.id = ${id}

Potential Solution #2

Using multiple JOINs I think I might be able to join all of the different combinations of event owners together, with their relationship to teams? Although this also seems like a lot of JOINs...

SELECT teams.*
FROM teams

JOIN apps ON apps.team_id = teams.id
JOIN collections ON collections.team_id = teams.id
JOIN webhooks AS team_webhooks ON team_webhooks.team_id = teams.id
JOIN webhooks AS collection_webhooks ON collection_webhooks.collection_id = collections.id

JOIN events AS app_events ON app_events.app_id = apps.id
JOIN events AS collection_events ON collection_events.collection_id = collections.id
JOIN events AS team_events ON team_events.team_id = teams.id
JOIN events AS collection_webhook_events ON collection_webhook_events.webhook_id = collection_webhooks.id
JOIN events AS team_webhook_events ON team_webhook_events.webhook_id = team_webhooks.id

WHERE app_events.id = ${id}
OR collection_events.id = ${id}
OR team_events.id = ${id}
OR collection_webhook_events.id = ${id}
OR team_webhook_events.id = ${id}

Questions


Edit: I don't think this question is a duplicate!

Yesterday, I asked "How to query for nested relationships in SQL?", which involves a similar data structure (because it's the same one), and was written in a similar way. But, as far as I can tell, that question is actually the inverse of this one.

In the first, I was asking how to...

Find all of the events that are related (directly or indirectly) to a given user by ID. Or put more generically, given a "parent"'s ID, find all of its possible descendants in the tree-like data model.

But in this one, I'm asking how to...

Find the team of an arbitrary event by ID. Or put more generically, given an arbitrary "child"'s ID, which can be attached to the tree-like data model at any level of the hierarchy, find its matching parent. (It's essentially the inverse of the first question.)

And as far as I can tell, they involve different solutions. The first seems to involve performing a series of queries as you traverse through the tree, collecting all of the events. The second (this one), I'm less sure about, but seems to require building up all of the joined models first, and then checking to see if which of them match the specific event.

Please help me to mark it as not a duplicate!

Double edit: Okay, no longer marked as duplicate! Thanks :D

Upvotes: 6

Views: 586

Answers (2)

bazinga012
bazinga012

Reputation: 753

Find the team of an arbitrary event by ID:

Take left joins from events table and see which object returns a team_id use that.

select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id 
from events e 
   left join apps a on e.app_id=a.id 
   left join collections c on c.id=e.collection_id 
   left join webhooks w on w.id=e.webhook_id 
   left join teams t on t.id=e.team_id 
where e.id=${id};

Just for completeness if you want all data for resultant team:

select t.* from teams t 
join 
(select coalesce(e.team_id,a.team_id,c.team_id,w.team_id) eteam_id
from events e left join apps a on e.app_id=a.id 
left join collections c on c.id=e.collection_id 
left join webhooks w on w.id=e.webhook_id 
left join teams t on t.id=e.team_id 
where e.id=1) t1 
on t.id=eteam_id;

Upvotes: 3

joop
joop

Reputation: 4513

Just OR the conditions together.

Since you are only interested in teams, you dont need to join the other tables (that could generate duplicates), only for the existence of records with certain (matching) values in the other tables.


SELECT * 
FROM teams t
WHERE EXISTS (
        SELECT 1
        FROM events e
        WHERE e.SOME_FIELD = SOME_EVENT
        AND
          (     team_id = t.id  -- direct reference
        OR EXISTS (             -- reference via webhooks
                SELECT 1 FROM webhooks w
                WHERE  w.id = e.webhook_id
                AND w.team_id = t.id
                )
        OR EXISTS (             -- reference via apps
                SELECT 1 FROM apps a
                WHERE  a.id = e.app_id
                AND a.team_id = t.id
                )
        OR EXISTS (             -- reference via collections
                SELECT 1 FROM collections c
                WHERE  c.id = e.collection_id
                AND c.team_id = t.id
                )
        OR EXISTS (             -- reference via webhooks*collections
                SELECT 1 FROM webhooks w
                JOIN collections c ON w.collection_id = c.id
                WHERE  w.id = e.webhook_id
                AND c.team_id = t.id
                WHERE c.team_id = t.id
                )
          )
        ) ;

Upvotes: 2

Related Questions