Bart
Bart

Reputation: 6814

Counting Descendant Nodes [from MySQL Closure Pattern]

I am using the following data structure to represent a hierarchy of data.

User Table

+----+------+
| id | name |
+----+------+
|  1 | Bob  |
|  2 | Sam  |
|  3 | Joe  |
|  4 | Kirk |
|  5 | Greg |
+----+------+

Relationship Closure Table

+----------+------------+-------+
| ancestor | descendant | depth |
+----------+------------+-------+
|        1 |          1 |     0 |
|        1 |          2 |     1 |
|        1 |          3 |     2 |
|        1 |          4 |     2 |
|        1 |          5 |     3 |
|        2 |          2 |     0 |
|        2 |          3 |     1 |
|        2 |          4 |     1 |
|        2 |          5 |     2 |
|        3 |          3 |     0 |
|        4 |          4 |     0 |
|        4 |          5 |     1 |
|        5 |          5 |     0 |
+----------+------------+-------+

The above data represents the following (in English-ese):

  1. Bob has one son: Sam
  2. Sam has two sons: Joe and Kirk.
  3. Joe has no sons.
  4. Kirk has one son: Greg.

I am getting the sons of a given user from the following SQL:

SELECT u.*
FROM closure AS c
    INNER JOIN `user` AS u ON (u.id = c.descendant)
WHERE c.ancestor = 1 AND c.depth = 1

This works fine. But I would also like to return the number of descendants all the way down the tree. The best I have been able to come up with so far is this:

SELECT 
    u.*,
    (
        SELECT COUNT(id) FROM `user` WHERE id IN (
            SELECT descendant FROM closure 
            WHERE ancestor = c.descendant
        )
    ) AS descendant_count
FROM closure AS c
    INNER JOIN `user` AS u ON (u.id = c.descendant)
WHERE c.ancestor = 1 AND c.depth = 1

Expected output of the above query is:

+----+------+------------------+
| id | name | descendant_count |
+----+------+------------------+
|  2 | Sam  |                3 |
+----+------+------------------+

Question (finally)

Is there a better way to get the total than what I have? All those sub-selects are gross.

Update

I am realizing as I look at this that I may have simplified things too much for this example. I have two sub-selects to do the count because I actually have 3 tables: category; item; category_closure. In my example data, there would obviously be no need for the double nested sub-select. In my actual data there is. Hopefully that makes sense.

Upvotes: 0

Views: 769

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562300

You don't need subqueries. You can get the number of descendants of each child by joining to the closure table again, to find all those nodes whose ancestor is the respective child. Then use GROUP BY so you can get a count per child.

SELECT 
    u.*,
    COUNT(*) AS descendant_count
FROM closure AS c
    INNER JOIN `user` AS u ON (u.id = c.descendant)
    INNER JOIN closure AS d ON (c.descendant = d.ancestor)
WHERE c.ancestor = 1 AND c.depth = 1
GROUP BY c.descendant

Upvotes: 3

Related Questions