user446923
user446923

Reputation: 553

SQL query construction: checking if query result is subset of another

Hi Guys I have a table relation which works like this (legacy)

A has many B and B has many C; A has many C as well

Now I am having trouble coming up with a SQL which will help me to get all B (Id of B to make it simple) mapped to certain A(by Id) AND any B which has a collection of C that's a subset of Cs of that A.

I have failed to come up with a decent sql specially for the second part and was wondering if I can get any tips / suggestions re how I can do that.

Thanks

EDIT:

Table A

Id     |..
------------
 1     |..




Table B

Id     |..
--------------
2      |..

Table A_B_rel 

A_id  |    B_id
-----------------
1     |     2

C is a strange table. The data of C (single column) is actually just duped in 2 rel table for A and B. so its like this

Table B_C_Table

B_Id| C_Value
-----------------
2   | 'Somevalue' 

Table A_C_Table

A_Id| C_Value
-------------
1   |    'SomeValue'

So I am looking for Bs the C_Values of which are subset of certain A_C_Values.

Upvotes: 1

Views: 3386

Answers (3)

Zane Bien
Zane Bien

Reputation: 23125

Perhaps this is what you're looking for:

SELECT B_id
FROM A_B_rel
WHERE A_id = <A ID>

UNION

SELECT a.B_Id
FROM B_C_Table a
LEFT JOIN A_C_Table b ON a.C_Value = b.C_Value AND b.A_Id = <A ID>
GROUP BY a.B_Id
HAVING COUNT(CASE WHEN b.A_Id IS NULL THEN 1 END) = 0

The first SELECT gets all B's which are mapped to a particular A (<A ID> being the input parameter for the A ID), then we tack onto that result set any additional B's whose entire set of C_Value's are within the subset of the C_Value's of the particular A (again, <A ID> being the input parameter).

Upvotes: 1

Set
Set

Reputation: 391

Sounds like you need to think in terms of double negation, i.e. there should not exist any B_C that does not have a matching A_C (and I'm guessing there should be at least one B_C).

So, try something like

select B.B_id
from Table_B B 
where exists (select 1 from B_C_Table BC
                  where BC.B_id = B.B_id)
  and not exists (select 1 from B_C_Table BC
                  where BC.B_id = B.B_id
                    and not exists(select 1 from B_C_Table AC
                                   join A_B_Rel ABR on AC.A_id = ABR.A_id
                                   where ABR.B_id = B.B_id
                                     and BC.C_Value = AC.C_Value))

Upvotes: 1

Andriy M
Andriy M

Reputation: 77677

Yes, the second part of your problem is a bit tricky. We've got B_C_Table on the one hand, and a subset of A_C_Table where A_ID is a specific ID, on the other.

Now, if we use an outer join, we'll be able to see which rows in B_C_Table have no match in A_C_Table:

SELECT *
FROM B_C_Table bc
  LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = @A_ID

Note that it is important to put the ac.A_ID = @A_ID into the ON clause rather than into WHERE, because in the latter case we would be filtering out non-matching rows of @A_ID, which is not what we want.

The next step (to achieving the final query) would be to group rows by B and count rows. Now, we will calculate both the total number of rows and the number of matching rows.

SELECT
  bc.B_ID,
  COUNT(*)       AS TotalCount,
  COUNT(ac.A_ID) AS MatchCount
FROM B_C_Table bc
  LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = @A_ID
GROUP BY bc.B_ID

As you can see, to count matches, we simply count ac.A_ID values: in case of no match the corresponding column will be NULL and thus not counted. And if indeed some rows in B_C_Table do not match any rows in the subset of A_C_Table, we will see different values of TotalCount and MatchCount.

And that logically leads us towards the final step: comparing those counts. (For, obviously, if we can obtain values, we can also compare them.) But not in the WHERE clause, of course, because aggregate functions aren't allowed in WHERE. It's the HAVING clause that is used to compare values of grouped rows, including aggregated values too. So...

SELECT
  bc.B_ID,
  COUNT(*)       AS TotalCount,
  COUNT(ac.A_ID) AS MatchCount
FROM B_C_Table bc
  LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = @A_ID
GROUP BY bc.B_ID
HAVING COUNT(*) = COUNT(ac.A_ID)

The count values aren't really needed, of course, and when you drop them you will be able to UNION the above query with the one selecting B_ID from A_B_rel:

SELECT B_ID
FROM A_B_rel
WHERE A_ID = @A_ID

UNION

SELECT bc.B_ID
FROM B_C_Table bc
  LEFT JOIN A_C_Table ac ON bc.C_Value = ac.C_Value AND ac.A_ID = @A_ID
GROUP BY bc.B_ID
HAVING COUNT(*) = COUNT(ac.A_ID)

Upvotes: 1

Related Questions