Daenerys
Daenerys

Reputation: 33

SQL Set containing other subsets question

Suppose you have a column X with values {x1, x2...xn}. Now each x contains a set of values {a1...an}.

For example: x1 links to {a1,a3}, x2 links to {a3}, x3 links to {a1, a2, a3}.

I need to create a query that will output for each Xi the number of other X's that have a set contained within the set of Xi i.e. they contain a subset of Xi's set.

Here is a table for the sample input for this example:

table for the sample input for this example.

And here is the expected result:

expected result. As you can see X1 only contains X2. X2 contains no other x's set. And X3 contains X1 and X2

I can't use any procedural extension (so no PLpgsql) or user defined function. It has to be purely a query from standard SQL.

Upvotes: 2

Views: 113

Answers (1)

George Joseph
George Joseph

Reputation: 5932

The following query can do this. In your record set i added a new row called (X4,A10). I am interested if you are able to find other solutions

create table dbo.t(a varchar(10),b varchar(10))

insert into dbo.t values('X1','A1')
insert into dbo.t values('X1','A3')
insert into dbo.t values('X2','A3')
insert into dbo.t values('X3','A1')
insert into dbo.t values('X3','A2')
insert into dbo.t values('X3','A3')
insert into dbo.t values('X4','A10')

-- I have added comments
with cross_prod /*I am building the combination of elements of a with the values of other sets..*/
  as (
  select m.a main_a,n.a other_a,n.b other_b
    from (select distinct a 
            from dbo.t
         ) m
    join dbo.t n
      on m.a<>n.a  
      )
   ,interim_data
   as (
        select  q1.main_a
                ,q1.other_a
                /*Here full_match_flag=0 if all the elements match*/
                ,count(case when q2.main_a is null then 1 end) as full_match_flag 
          from cross_prod q1
        left join (/*This portion gets me the data where the elements match*/
                   select m.a as main_a,n.a as other_a,n.b as other_b
                     from dbo.t m
                     join dbo.t n
                       on m.a<>n.a  
                     where m.b=n.b
                    )q2
            on q1.main_a=q2.main_a
           and q1.other_a=q2.other_a
           and q1.other_b=q2.other_b
        group by q1.main_a
                ,q1.other_a   
       )
select main_a
       ,count(case when full_match_flag=0 then 1 end) as cnt_matches
 from interim_data
group by main_a

+--------+-------------+
| main_a | cnt_matches |
+--------+-------------+
| X1     |           1 |
| X2     |           0 |
| X3     |           2 |
| X4     |           0 |
+--------+-------------+

Upvotes: 2

Related Questions