Dennis
Dennis

Reputation:

How to optimize m:n relation query on 3 tables

this is my sql problem - there are 3 tables:

Names         Lists                ListHasNames
Id Name       Id Desc              ListsId  NamesId
=--------     ------------         ----------------
1  Paul       1  Football          1        1
2  Joe        2  Basketball        1        2
3  Jenny      3  Ping Pong         2        1
4  Tina       4  Breakfast Club    2        3
              5  Midnight Club     3        2
                                   3        3
                                   4        1
                                   4        2
                                   4        3
                                   5        1
                                   5        2
                                   5        3
                                   5        4

Which means that Paul (Id=1) and Joe (Id=2) are in the Football team (Lists.Id=1), Paul and Jenny in the Basketball team, etc...

Now I need a SQL statement which returns the Lists.Id of a specific Name combination: In which lists are Paul, Joe and Jenny the only members of that list ? Answer only Lists.Id=4 (Breakfast Club) - but not 5 (Midnight Club) because Tina is in that list, too.

I've tried it with INNER JOINS and SUB QUERIES:

SELECT Q1.Lists_id FROM

(
SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Paul') and
  (T1.Id=T2.Names_ID) and
   ( (
     SELECT count(*) FROM
      listhasnames as Z1
     where (Z1.lists_id = T2.lists_Id)
    ) = 3)

) AS Q1

INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Joe') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q2

ON (Q1.Lists_id=Q2.Lists_id)



INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Jenny') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q3

ON (Q1.Lists_id=Q3.Lists_id)

Looks a little bit complicated, uh? How to optimize that? I need only that Lists.Id in which specific names are in (and only these names and nobody else). Maybe with SELECT IN?

Regards, Dennis

Upvotes: 0

Views: 1189

Answers (4)

Carl Manaster
Carl Manaster

Reputation: 40356

SELECT ListsId
FROM ListHasNames a
WHERE NamesId in (1, 2, 3)
AND NOT EXISTS
(SELECT * from ListHasNames b 
WHERE b.ListsId = a.ListsId 
AND b.NamesId not in (1, 2, 3))
GROUP BY ListsId
HAVING COUNT(*) = 3;

Edit: Corrected thanks to Chris Gow's comment; the subselect is necessary to exclude lists that have other people on them. Edit 2 Corrected the table name thanks to Dennis' comment

Upvotes: 2

Chris Gow
Chris Gow

Reputation: 7934

Using Carl Manaster's solution as a starting point I came up with:

SELECT listsid 
FROM listhasnames 
GROUP BY listsid HAVING COUNT(*) = 3
INTERSECT
SELECT x.listsid 
FROM listhasnames x, names n 
WHERE n.name IN('Paul', 'Joe', 'Jenny') 
AND n.id = x.namesid

Upvotes: 2

ahains
ahains

Reputation: 1912

I was just solving a problem recently that may work well for your case as well. It may be overkill.

I took the approach of creating a list of candidate associations that may be the correct solution, and then using a cursor or queue table to go through the likely correct solutions to do full validation.

In my case this was implemented by doing like

select
ParentId
count(*) as ChildCount
checksum_agg(checksum(child.*) as ChildAggCrc
from parent join child on parent.parentId = child.parentId

Then you can compare the count and aggregate checksum against your lookup data (i.e. your 3 names to check for). If no rows match, you are guaranteed to have no matches. If any row matches you can then go through and do a join of that specific ParentId to validate if there are any discrepancies between the row sets.

Clear as mud? :)

Upvotes: 0

D'Arcy Rittich
D'Arcy Rittich

Reputation: 171491

Updated:

select a.ListsId from
(
    --lists with three names only
    select lhn.ListsId, count(*) as count
    from ListHasNames  lhn
    inner join Names n on lhn.NamesId = n.Id 
    group by lhn.ListsId
    having count(*) = 3
) a
where a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Paul'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Joe'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Jenny'))

Upvotes: 1

Related Questions