Buttle Butkus
Buttle Butkus

Reputation: 9456

MySQL how to find parent with exact set of children?

MySQL 5.5

parent table:
id | facts
child table:
parent_id | foreign_key | facts

Now, I want to find parents that have a certain exact set of children, no more, no less. Something like:

SELECT t1.`id` 
from `parent_table` t1 
  LEFT JOIN `child_table` t2 ON t1.id=t2.parent_id
WHERE t2.`fk` = 1 
  AND t2.`fk` = 3  
  AND t2.`fk` = 5 
  AND t2.`fk` = 7 
  AND t2.`fk` = 9

But this will also get a parent record with this set of children: 1,2,3,5,7,9. And I only want those parents that have the exact set of children: 1,3,5,7,9.

Is there a way?

EDIT: child.parent_id and child.fk are both not unique. child.fk is a foreign key linking to another table. ("many-to-many relationship") So it is quite possible for a parent to have children 1,2,3,5,7,9. My whole reason for doing this query is to try to avoid creating a new parent for 1,3,5,7,9 if such a parent already exists.

Upvotes: 4

Views: 2476

Answers (5)

OutstandingBill
OutstandingBill

Reputation: 2844

I just had to solve a more general case of this problem, but in SQL server. The principles are probably similar though.

SetX
  |-- Child1
  |-- Child2
  |-- Child4

SetY
  |-- Child1
  |-- Child3

ParentA -- has the children defined by SetX
  |-- Child1
  |-- Child2
  |-- Child4

ParentB -- has the children defined by SetY
  |-- Child1
  |-- Child3

ParentC -- does not match any of the sets
  |-- Child1
  |-- Child2
  |-- Child3
  |-- Child4

M problem was around the users of a system (the parents), which roles they were assigned to within the system (the children), and which job description would fit the user (the sets).

The way I solved it was using a bit mask. Each child is assigned a unique 2^n bitmask. Membership of the set is then simply that the user's bitmask sum equals the bitmask sum of the set.

When there are lots of children and the bitmask is in danger of overflowing, you can use bigint bitmasks or multiple bitmasks (making sure to set the lower-order bitmasks to zero).

Here's an example written in T-SQL - pretty sure it would be simple to translate into MySQL (and I'm happy if someone wants to do that in their own answer).

declare @users table (
    name varchar(10)
)

declare @skills table (
    name varchar(20)
    , id int identity (0, 1)
    , bitmask bigint
)

declare @usersWithSkills table (
    userName varchar(10)
    , skillName varchar(20)
)

declare @groups table (
    name varchar(20)
    , bitmask bigint
)

declare @skillsInGroups table (
    groupName varchar(10)
    , skillName varchar(20)
)

insert  @users (name)
values  ('Pat')
    , ('Oprah')
    , ('Millie')
    , ('Bert')

insert  @skills (name)
values  ('Latin')
    , ('Icelandic')
    , ('Physics')

insert  @groups (name)
values  ('polyglot')
    , ('modern')
    , ('omniscient')

insert  @skillsInGroups (groupName, skillName)
values  ('polyglot', 'Latin')
    , ('polyglot', 'Icelandic')
    , ('modern', 'Physics')
    , ('modern', 'Icelandic')
    , ('omniscient', 'Latin')
    , ('omniscient', 'Icelandic')
    , ('omniscient', 'Physics')

insert  @usersWithSkills (userName, skillName)
values ('Pat', 'Latin')
    , ('Pat', 'Icelandic')
    , ('Oprah', 'Latin')
    , ('Oprah', 'Icelandic')
    , ('Oprah', 'Physics')
    , ('Millie', 'Icelandic')
    , ('Millie', 'Physics')
    , ('Bert', 'Latin')

-- give each skill a bitmask value
update  @skills
set bitmask = power(2, id)

-- set the total bitmask values for each group
update  g1
set g1.bitmask = t.sum_ind
from    @groups g1
    inner join (
        select  g.name, sum_ind = sum(r.bitmask)
        from    @groups g
            inner join @skillsInGroups rg
                on rg.groupName = g.name
            inner join @skills r
                on r.name = rg.skillName
        group   by g.name
    ) t
        on t.name = g1.name

select  u1.userName, groupName = g.name
from    (
        select  userName = u.name
            , bitmask_total = sum(r.bitmask)
        from    @users u
            inner join @usersWithSkills uir
                on uir.userName = u.name
            inner join @skills r
                on r.name = uir.skillName
        group   by u.name
    ) u1
    left join @groups g
        on g.bitmask = u1.bitmask_total

The results I get from this are

userName   groupName
---------- --------------------
Bert       NULL
Millie     modern
Oprah      omniscient
Pat        polyglot

(4 rows affected)

Upvotes: 0

Joachim Isaksson
Joachim Isaksson

Reputation: 180887

Similar to eggyal's solution, but just thought I'd throw it in as an alternative since it should be more portable across RDBMS's;

SELECT c.parent_id
FROM child_table c
GROUP BY c.parent_id
HAVING SUM(CASE WHEN c.id IN (1,3,5,7,9) THEN 1 ELSE -1 END) = 5

5 being the exact count of children in the IN clause you want matched (in this case all)

This will only work with distinct children, if there are duplicates, it will break.

An SQLfiddle to test with.

Upvotes: 1

eggyal
eggyal

Reputation: 125835

SELECT   parent_id
FROM     child_table
GROUP BY parent_id
HAVING   SUM(id IN (1,3,5,7,9)) = COUNT(*)
     AND COUNT(DISTINCT id) = 5

Upvotes: 1

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115520

This problem is called (exact) relational division. There is a lot of useful code and explanation in this article: Divided We Stand: The SQL of Relational Division.

One way to solve it:

SELECT p.id AS parent_id
FROM parent AS p
WHERE EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk = 1 AND c.parent_id = p.id)
  AND EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk = 3 AND c.parent_id = p.id)
  AND EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk = 5 AND c.parent_id = p.id)
  AND EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk = 7 AND c.parent_id = p.id)
  AND EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk = 9 AND c.parent_id = p.id)
  AND NOT EXISTS
      ( SELECT * FROM child AS c
        WHERE c.fk NOT IN (1,3,5,7,9) AND c.parent_id = p.id) ;

And another link to a similar question, here at StackOverflow, where you'll find more than 10 different solutions (note: it's not for the exact division but for the division with remainder) and performance tests (for Postgres): How to filter SQL results in a has-many-through relation

Upvotes: 1

John Woo
John Woo

Reputation: 263693

Assuming that child.id is unique for every child.parent_id.

SELECT  a.id, a.facts
FROM    parent a
        INNER JOIN child b
            ON a.id = b.parent_ID
WHERE   b.id IN (1,3,5,7,9) AND        -- <<== list all ChildID here
        EXISTS                         -- <<== this part checks if the parent_ID
        (                              --           present on the EXISTS clause
            SELECT  parent_ID          --           which only filters parents
            FROM    child c            --           with 5 children
            WHERE   b.parent_ID = c.parent_ID
            GROUP   BY parent_ID
            HAVING  COUNT(*) = 5       -- <<== total number of children
        )
GROUP   BY a.id, a.facts
HAVING  COUNT(*) = 5                   -- <<== total number of children

Upvotes: 5

Related Questions