SQL compare multiple columns in different tables

Question

I have two tables. Table one with 3 category columns. Table 2 with 2 category columns. I tried to write a query that compare these category columns and shows me the data which is in one of the three columns of table 1, but not in one of the two columns of table 2.

e.g.

table 1

    ID      product     cat1  cat2   cat3
    10001   product a   A     B      C 
    10001   product a   D     E      NULL
    10001   product a   F     G      H      
    10002   product b   B     C      D 
    ...     ...         ...   ...    ...

table 2

    ID      product     cat1  cat2
    10001   product a   D     E
    10001   product a   D     F
    10001   product a   G     A      
    10002   product b   A     C
    ...     ...         ...   ...    ...

The output of the query should look something like that:

product_id    not_in_cat
10001         B
10001         C
10001         H
10002         B
10002         D

But I don't know how I can realize it. I tried to do this with a subquery and the "NOT IN" command. But this way I got many subqueries, for each combination of t1.cat1 - t2.cat1, t1.cat2 - t2.cat2, and so on. And this way I got only the categories which are not in the same row.

Maybe someone can help me to figure out, which way is the best to achieve the desired result. The query is performed on a MS SQL Server with a openquery to a MySQL and a oracle db server. But I think its more a logic problem.

Gordon Linoff · Accepted Answer

I think you should "normalize" the data and do the equivalent of a full outer join using union all and group by. This looks like:

select id, cat
from (select id, cat1 as cat, 1 as which from table1 union all
      select id, cat2 as cat, 1 as which from table1 union all
      select id, cat3 as cat, 1 as which from table1 union all
      select id, cat1 as cat, 2 as which from table2 union all
      select id, cat2 as cat, 2 as which from table2
     ) ic
where cat is not null
group by id, cat
having min(which) = max(which);

This finds the categories that are in only one of the tables. Note: If you know that there are no duplicates in either table, then having count(*) = 1 also works.

And, if you want to know the table where the category is present, then ask another question. That is a bit different from the question that you did ask.

SQL compare multiple columns in different tables

table 1

table 2

Answers (1)

Related Questions